Learning to Clean: A GAN Perspective


Abstract

In the big data era, the impetus to digitize the vast reservoirs of data trapped in unstructured scanned documents such as invoices, bank documents, courier receipts and contracts has gained fresh momentum. The scanning process often introduces artifacts such as salt-and-pepper or background noise, blur due to camera motion or shake, watermarks, coffee stains, wrinkles, or faded text. These artifacts pose many readability challenges to current text recognition algorithms and significantly degrade their performance. Existing learning-based denoising techniques require a dataset of noisy documents paired with cleaned versions of the same documents, on which a model can be trained to generate clean documents from noisy versions. However, in the real world such a paired dataset is often unavailable, and all we have for training our denoising model are unpaired sets of noisy and clean images. This paper explores the use of Generative Adversarial Networks (GANs) to generate denoised versions of noisy documents. In particular, where paired information is available, we formulate the problem as an image-to-image translation task, i.e., translating a document from the noisy domain (background noise, blur, fading, watermarks) to a target clean document using a GAN. In the absence of paired images for training, we employ CycleGAN, which is known to learn a mapping from the distribution of noisy images to that of denoised images using unpaired data, to achieve image-to-image translation for cleaning noisy documents. We compare the performance of CycleGAN for document cleaning using unpaired images with a Conditional GAN trained on paired data from the same dataset. Experiments were performed on a public document dataset on which different types of noise were artificially induced; the results demonstrate that CycleGAN learns a more robust mapping from the space of noisy documents to clean documents.
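
To make the unpaired setting concrete, the following is a minimal sketch of the cycle-consistency term that CycleGAN-style training relies on. It is not the authors' implementation: the names `l1_distance`, `cycle_consistency_loss`, `invert`, `blank` and `identity` are hypothetical, the "networks" are toy functions on flattened grayscale images, and the adversarial losses are omitted entirely. The noisy and clean pages passed in need not show the same document, which is the point of unpaired training.

```python
from typing import Callable, List

Image = List[float]  # flattened grayscale image, pixel values in [0, 1]
Mapping = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute pixel difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(g: Mapping, f: Mapping, noisy: Image, clean: Image) -> float:
    """L1 cycle loss for g: noisy -> clean and f: clean -> noisy.

    noisy -> g -> f should reproduce the noisy input, and
    clean -> f -> g should reproduce the clean input.
    """
    forward = l1_distance(f(g(noisy)), noisy)
    backward = l1_distance(g(f(clean)), clean)
    return forward + backward


# Toy stand-ins for trained generators (assumptions, for illustration only).
def invert(img: Image) -> Image:
    return [1.0 - p for p in img]


def blank(img: Image) -> Image:
    return [0.0 for _ in img]


def identity(img: Image) -> Image:
    return list(img)


# Two unrelated pages: no pixel-level pairing between them is assumed.
noisy_page = [0.0, 0.5, 1.0, 0.5]
clean_page = [0.0, 0.0, 1.0, 1.0]

# A pair of mappings that undo each other incurs no cycle loss.
consistent = cycle_consistency_loss(invert, invert, noisy_page, clean_page)

# A generator that erases the page cannot be undone, so the loss is large.
collapsed = cycle_consistency_loss(blank, identity, noisy_page, clean_page)
```

In this toy setup `consistent` is zero while `collapsed` is not, which illustrates why the cycle term discourages a denoiser from discarding page content even when no ground-truth clean version is available. A paired Conditional GAN could instead compare the generator output directly against the known clean target.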

Cite

Sharma, M., Verma, A., & Vig, L. (2019). Learning to Clean: A GAN Perspective. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11367 LNCS, pp. 174–185). Springer Verlag. https://doi.org/10.1007/978-3-030-21074-8_14
