Segmentation of Arabic text into characters for recognition

Abstract

One step in a character recognition system is the segmentation of words or sub-words into characters. Segmenting text written in any Arabic script is particularly difficult, and for this reason many systems treat the sub-word, rather than the character, as the basic unit of recognition. We propose a method for segmenting printed Arabic words/sub-words into characters. In the proposed method, the primary and secondary strokes of each sub-word are separated, and segmentation points are then identified in the primary strokes. To do this, we compute the vertical projection graph for each line and process it to generate a string that indicates the relative variations in pixel counts. This string is then scanned to extract characters from the sub-words. We apply the method to Sindhi text, as its character set is a superset of the Arabic one. The method can also be used for other Naskh-based Arabic scripts such as Persian, Pashto and Urdu. © 2008 Springer-Verlag.
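
The projection-and-string idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual encoding or scanning rules, so everything here is an assumption made for illustration: the symbols `U`/`D`/`S`, the function names, and the `baseline_thickness` and `min_run` parameters are hypothetical, and the rule "cut in the middle of a thin run of columns" is a common heuristic rather than the authors' procedure. The sketch also skips the separation of primary and secondary strokes and works on an already binarised image.

```python
from typing import List


def vertical_projection(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def variation_string(profile: List[int]) -> str:
    """Encode column-to-column change (assumed alphabet): U up, D down, S same."""
    symbols = []
    for prev, cur in zip(profile, profile[1:]):
        if cur > prev:
            symbols.append("U")
        elif cur < prev:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def candidate_cuts(profile: List[int],
                   baseline_thickness: int = 1,
                   min_run: int = 2) -> List[int]:
    """Return the centre column of each run of thin, non-empty columns.

    A column is "thin" when 0 < count <= baseline_thickness, which is
    an assumed stand-in for the connecting stroke between characters.
    """
    cuts = []
    start = None
    # A trailing 0 acts as a sentinel that closes a run at the right edge.
    for x, count in enumerate(profile + [0]):
        thin = 0 < count <= baseline_thickness
        if thin and start is None:
            start = x
        elif not thin and start is not None:
            if x - start >= min_run:
                cuts.append((start + x - 1) // 2)
            start = None
    return cuts


# Toy 3x10 "sub-word": two tall blobs joined by a one-pixel stroke,
# then a gap, then a separate shorter blob.
toy = [
    [1, 1, 0, 0, 0, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
]
profile = vertical_projection(toy)
encoded = variation_string(profile)
cuts = candidate_cuts(profile)
```

On the toy image the only candidate cut falls inside the one-pixel connecting stroke; the empty column further right separates sub-words rather than characters and is deliberately not reported as a cut.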

Shaikh, N. A., Shaikh, Z. A., & Ali, G. (2008). Segmentation of Arabic text into characters for recognition. In Communications in Computer and Information Science (Vol. 20 CCIS, pp. 11–18). Springer Verlag. https://doi.org/10.1007/978-3-540-89853-5_4
