A font invariant character segmentation technique for printed Bangla word images

Abstract

This paper reports a technique for segmenting printed Bangla word images, set in different fonts with varying styles and sizes, into their constituent characters. First, three horizontally non-intersecting zones of a given word are identified: the Upper, Middle and Lower Zones. Next, the black pixels that probably constitute the common Matra of the word, a prominent feature of Bangla script, are estimated. Some of the black pixels in the Matra region are selected as potential segmentation points, at which the word is cut vertically into components. Each segmented component is then assigned to one of six component types: upper zone, middle zone, lower zone, middle-and-lower zone, broken character, or noise. Middle-and-lower zone components are then separated horizontally. The method was tested on 1600 word images in different fonts with varying styles and sizes, and achieved an average success rate of 96.85%. © 2012 Springer-Verlag GmbH Berlin Heidelberg.
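The Matra-based vertical segmentation step can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a binary image stored as rows of 0/1 values (1 = black), takes the Matra to be the single row with the most black pixels, and treats a column as a potential segmentation point when it is black on the Matra and empty in every row below it. The function names and the toy image `WORD` are hypothetical.

```python
# Toy binary word image (hypothetical data): '1' is a black pixel.
WORD = [[int(c) for c in row] for row in (
    "000000000",
    "111111111",  # headline (Matra) spanning the word
    "011000110",
    "011000110",
    "000000000",
)]


def estimate_matra_row(img):
    """Assumed heuristic: the Matra is the row with the most black pixels."""
    profile = [sum(row) for row in img]  # horizontal projection profile
    return max(range(len(profile)), key=profile.__getitem__)


def candidate_cut_columns(img, matra_row):
    """Columns that are black on the Matra and empty in every row below it."""
    cuts = []
    for x in range(len(img[0])):
        on_matra = img[matra_row][x] == 1
        empty_below = all(img[y][x] == 0
                          for y in range(matra_row + 1, len(img)))
        if on_matra and empty_below:
            cuts.append(x)
    return cuts


def split_at_cuts(img, cuts):
    """Return (start, end) column spans, end exclusive, between cut columns."""
    width = len(img[0])
    cut_set = set(cuts)
    spans, start = [], None
    for x in range(width):
        if x in cut_set:
            if start is not None:  # a component ends just before this cut
                spans.append((start, x))
                start = None
        elif start is None:        # first non-cut column opens a component
            start = x
    if start is not None:
        spans.append((start, width))
    return spans


MATRA_ROW = estimate_matra_row(WORD)
CUTS = candidate_cut_columns(WORD, MATRA_ROW)
SPANS = split_at_cuts(WORD, CUTS)
```

On the toy image, `MATRA_ROW` is 1 and `SPANS` is `[(1, 3), (6, 8)]`, one column span per character-like component. A real implementation would also need the zone detection, component classification and horizontal separation steps described in the abstract, which this sketch omits.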

Cite

Sarkar, R., Malakar, S., Das, N., Basu, S., Kundu, M., & Nasipuri, M. (2012). A font invariant character segmentation technique for printed bangla word images. In Advances in Intelligent and Soft Computing (Vol. 132 AISC, pp. 739–746). Springer Verlag. https://doi.org/10.1007/978-3-642-27443-5_84
