A font invariant character segmentation technique for printed Bangla word images

Ram Sarkar; Samir Malakar; Nibaran Das; Subhadip Basu; Mahantapas Kundu; Mita Nasipuri

Conference Proceedings

Advances in Intelligent and Soft Computing (2012) 132 AISC 739-746

DOI: 10.1007/978-3-642-27443-5_84

2 Citations

8 Readers

Abstract

This paper reports a method for segmenting Bangla word images, printed in different fonts with varying styles and sizes, into their constituent characters. First, three horizontally non-intersecting zones of a given word are identified: the Upper, Middle and Lower Zones. Next, the black pixels that probably constitute the common Matra of the word, a prominent feature of Bangla script, are estimated. Some of the black pixels in the Matra region are then selected as potential segmentation points, at which the word is cut vertically into its constituent characters. Each segmented component is assigned to one of six component types: upper zone, middle zone, lower zone, middle and lower zone, broken character, or noise. Middle and lower zone components are then separated horizontally. The method was tested on 1600 word images in different fonts with varying styles and sizes, and achieved an average success rate of 96.85%. © 2012 Springer-Verlag GmbH Berlin Heidelberg.
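
The abstract does not spell out how the Matra pixels or the segmentation points are chosen, so the following is only a minimal sketch of a common textbook approach, not the authors' algorithm. It assumes a binarized word image stored as a NumPy array (1 = black pixel), takes the row with the most black pixels as the Matra estimate, and places a cut in each interior run of columns that are empty below the Matra. The function names, the `matra_thickness` parameter and the toy image are all hypothetical and chosen for illustration.

```python
from typing import List

import numpy as np


def find_matra_row(word: np.ndarray) -> int:
    """Estimate the Matra as the row with the most black pixels (assumption)."""
    row_counts = word.sum(axis=1)  # horizontal projection profile
    return int(np.argmax(row_counts))


def segmentation_columns(word: np.ndarray, matra_row: int,
                         matra_thickness: int = 1) -> List[int]:
    """Return one cut column per interior run of columns empty below the Matra."""
    below = word[matra_row + matra_thickness:, :]
    empty = below.sum(axis=0) == 0  # True where no ink lies under the Matra
    cuts: List[int] = []
    start = None
    for x, is_empty in enumerate(empty):
        if is_empty and start is None:
            start = x  # a run of empty columns begins
        elif not is_empty and start is not None:
            if start > 0:  # skip the left margin of the word
                cuts.append((start + x - 1) // 2)  # middle of the run
            start = None
    # A run still open at the end is the right margin and is ignored.
    return cuts


def toy_word() -> np.ndarray:
    """Hypothetical 6x9 image: a full-width Matra with two blocks hanging below."""
    word = np.zeros((6, 9), dtype=np.uint8)
    word[1, :] = 1       # Matra (headline)
    word[2:5, 1:4] = 1   # first character body
    word[2:5, 5:8] = 1   # second character body
    return word


if __name__ == "__main__":
    img = toy_word()
    row = find_matra_row(img)
    print(row, segmentation_columns(img, row))
```

On the toy image the sketch reports the Matra at row 1 and a single cut at column 4, the gap between the two blocks. Real printed words would also need the zone detection and the six-way component classification described in the abstract, which this sketch does not attempt.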

Cite

APA

Sarkar, R., Malakar, S., Das, N., Basu, S., Kundu, M., & Nasipuri, M. (2012). A font invariant character segmentation technique for printed Bangla word images. In Advances in Intelligent and Soft Computing (Vol. 132 AISC, pp. 739–746). Springer Verlag. https://doi.org/10.1007/978-3-642-27443-5_84
