Segmentation of Arabic text into characters for recognition

Noor Ahmed Shaikh; Zubair Ahmed Shaikh; Ghulam Ali

Conference Proceedings

Communications in Computer and Information Science (2008), Vol. 20 CCIS, pp. 11–18

DOI: 10.1007/978-3-540-89853-5_4

Abstract

One step in a character recognition system is the segmentation of words or sub-words into characters. Segmenting text written in any Arabic script is particularly difficult, and because of this many systems treat the sub-word, rather than the character, as the basic unit of recognition. We propose a method for segmenting printed Arabic words and sub-words into characters. The method first separates the primary and secondary strokes of each sub-word and then identifies segmentation points in the primary strokes. To do so, we compute the vertical projection graph for each line and process it to generate a string that indicates relative variations in pixels. This string is then scanned to produce characters from the sub-words. We use Sindhi text for segmentation because its character set is a superset of the Arabic character set. The method can be applied to any other Naskh-based Arabic script, such as Persian, Pashto and Urdu. © 2008 Springer-Verlag.
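
The projection-and-string idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the encoding or the scanning rule, so everything below is an assumption made for illustration: the function names, the U/D/S alphabet, and the cut heuristic (the midpoint of a flat, thin run of columns, controlled by the hypothetical parameters `max_ink` and `min_run`) are not taken from the paper. The input is assumed to be a binary image of the primary stroke only, with 1 for ink.

```python
from typing import List


def vertical_projection(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def variation_string(profile: List[int]) -> str:
    """Encode the change between adjacent columns.

    Assumed alphabet (not from the paper): 'U' = count rises,
    'D' = count falls, 'S' = count stays the same.
    """
    symbols = []
    for prev, curr in zip(profile, profile[1:]):
        if curr > prev:
            symbols.append("U")
        elif curr < prev:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def candidate_cuts(profile: List[int], symbols: str,
                   max_ink: int = 1, min_run: int = 2) -> List[int]:
    """Scan the string for flat, thin runs and return their midpoints.

    Hypothetical heuristic: a run of at least `min_run` 'S' symbols whose
    columns hold between 1 and `max_ink` ink pixels is treated as a
    connecting stroke between two characters.
    """
    cuts = []
    i = 0
    while i < len(symbols):
        if symbols[i] == "S" and 0 < profile[i] <= max_ink:
            j = i
            while j < len(symbols) and symbols[j] == "S":
                j += 1
            # symbols[i..j-1] are 'S', so columns i..j share one ink count
            if j - i >= min_run:
                cuts.append((i + j) // 2)
            i = j
        else:
            i += 1
    return cuts


if __name__ == "__main__":
    # Toy primary stroke: two tall parts joined by a one-pixel baseline.
    image = [
        [0, 1, 1, 0, 0, 0, 0, 1, 0],
        [0, 1, 1, 0, 0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1],
    ]
    profile = vertical_projection(image)
    symbols = variation_string(profile)
    print(profile, symbols, candidate_cuts(profile, symbols))
```

A real system would need thresholds tuned to the font and stroke width, and would reattach the secondary strokes (dots and diacritics) to the resulting characters after cutting.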

Shaikh, N. A., Shaikh, Z. A., & Ali, G. (2008). Segmentation of Arabic text into characters for recognition. In Communications in Computer and Information Science (Vol. 20 CCIS, pp. 11–18). Springer Verlag. https://doi.org/10.1007/978-3-540-89853-5_4
