Offline Hand Written Urdu Word Spotting Using Random Data Generation

Faiq Faizan Farooqui; Muhammad Hassan; Muhammad Shahzad Younis; Muhammad Kashif Siddhu

Journal Article, Open Access

IEEE Access, vol. 8, pp. 131119–131136 (2020)

DOI: 10.1109/ACCESS.2020.3010166

9 Citations

20 Readers

Abstract

Urdu word spotting is among the most challenging tasks in image processing, and word spotting in handwritten Urdu text is even more so. In handwritten Urdu documents, the same word varies significantly from one writer to another. Differences in the orientation and style of handwriting make it difficult for a word spotting system to correctly recognize instances of a keyword. In this research, we aim to overcome this hurdle. We propose a system that takes a database of handwritten Urdu text and generates random yet similar images to improve the classifier's ability to recognize variations caused by differences in handwriting. For image generation, we used geometric transformations and variants of the Generative Adversarial Network (GAN). For word spotting, Histogram of Oriented Gradients (HOG) features are extracted from ligature images and used to train a Long Short-Term Memory (LSTM) network for classification. This is the first study that focuses on improving word spotting by generating arbitrary samples using GANs and their variants. With samples generated using Cycle-GANs, the system achieved a promising recognition rate of 98.96%.
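
As a rough illustration of two stages named in the abstract, the sketch below applies a random geometric transformation to a word image and then computes a simplified HOG descriptor. It is not the authors' implementation. The function names (`random_affine`, `hog_features`), the shear and shift ranges, the 8-pixel cell size, and the 9 orientation bins are all assumptions chosen for illustration, and the descriptor omits the block normalization used in standard HOG. GAN-based generation and LSTM training are left out.

```python
import math
import random


def random_affine(img, max_shear=0.2, max_shift=2, rng=None):
    """Return a randomly sheared and shifted copy of a 2-D grayscale image.

    `img` is a list of rows of floats. Parameter ranges are illustrative
    assumptions, not values taken from the paper.
    """
    rng = rng or random.Random()
    h, w = len(img), len(img[0])
    shear = rng.uniform(-max_shear, max_shear)
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    cy = (h - 1) / 2.0
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            sy = y - dy
            sx = int(round(x - dx - shear * (sy - cy)))
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def hog_features(img, cell=8, bins=9):
    """Simplified HOG: one L2-normalized orientation histogram per cell."""
    h, w = len(img), len(img[0])
    feats = []
    for y0 in range(0, h - cell + 1, cell):
        for x0 in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(y0, y0 + cell):
                for x in range(x0, x0 + cell):
                    # Central differences, clamped at the image border.
                    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
                    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
                    mag = math.hypot(gx, gy)
                    # Unsigned orientation in [0, 180) degrees.
                    ang = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[int(ang / (180.0 / bins)) % bins] += mag
            norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
            feats.extend(v / norm for v in hist)
    return feats


# Toy 16x32 "word" image: a single horizontal stroke on a blank background.
rng = random.Random(0)
word = [[1.0 if 8 <= x < 24 and 6 <= y < 10 else 0.0 for x in range(32)]
        for y in range(16)]
augmented = [random_affine(word, rng=rng) for _ in range(5)]
vectors = [hog_features(a) for a in augmented]
```

Each augmented image yields one fixed-length feature vector (here 2 x 4 cells times 9 bins). In a pipeline like the one described, vectors of this kind, computed per ligature image, would presumably form the input to the LSTM classifier; how the paper arranges them is not stated in the abstract.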

Citation (APA)

Farooqui, F. F., Hassan, M., Younis, M. S., & Siddhu, M. K. (2020). Offline Hand Written Urdu Word Spotting Using Random Data Generation. IEEE Access, 8, 131119–131136. https://doi.org/10.1109/ACCESS.2020.3010166
