Page-to-word extraction from unconstrained handwritten document images

Pawan Kumar Singh; Sagnik Pal Chowdhury; Shubham Sinha; Sungmin Eum; Ram Sarkar

Conference Proceedings

Advances in Intelligent Systems and Computing (2017) 458 517-525

DOI: 10.1007/978-981-10-2035-3_53

Abstract

Extracting words directly from handwritten document images remains a challenging problem in the development of a complete Optical Character Recognition (OCR) system. This paper reports a robust word extraction scheme. First, key points are generated from the document images using the Harris corner point detection algorithm and are then clustered using the well-known DBSCAN technique. Finally, the boundaries of the text words in the document images are estimated from the convex hull drawn around each cluster of key points. The proposed technique is tested on 50 randomly selected images from the CMATERdb1 database, and the success rate is found to be 90.48%, which is comparable to the state of the art.
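The clustering and hull stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the key points have already been produced by a Harris corner detector and are given as (x, y) tuples, and the names `dbscan`, `convex_hull` and `word_regions`, as well as the `eps` and `min_pts` values, are hypothetical choices made for illustration only (the abstract does not state the parameters used).

```python
from math import dist  # Python 3.8+


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: return one label per point (0, 1, ... or -1 for noise)."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force neighbourhood query; includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nb)
    return labels


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # last point is the first point of the other half

    return half(pts) + half(reversed(pts))


def word_regions(key_points, eps, min_pts):
    """Cluster key points, then return one convex hull per cluster."""
    labels = dbscan(key_points, eps, min_pts)
    clusters = {}
    for p, lab in zip(key_points, labels):
        if lab != -1:  # discard noise points
            clusters.setdefault(lab, []).append(p)
    return [convex_hull(c) for _, c in sorted(clusters.items())]


if __name__ == "__main__":
    # Hypothetical key points: two compact groups ("words") and one stray point.
    key_points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1),
                  (20, 0), (24, 0), (24, 3), (20, 3), (22, 2),
                  (50, 50)]
    for hull in word_regions(key_points, eps=5, min_pts=3):
        print(hull)
```

In this sketch each hull approximates the boundary of one candidate word, and isolated key points are dropped as noise. In practice the neighbourhood radius would presumably need tuning to the stroke width and inter-word spacing of the scanned pages.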

Citation (APA)

Singh, P. K., Chowdhury, S. P., Sinha, S., Eum, S., & Sarkar, R. (2017). Page-to-word extraction from unconstrained handwritten document images. In Advances in Intelligent Systems and Computing (Vol. 458, pp. 517–525). Springer Verlag. https://doi.org/10.1007/978-981-10-2035-3_53
