Automatic line-level script identification from handwritten document images - A region-wise classification framework for Indian subcontinent

Abstract

Script identification is a well-studied problem in the automatic processing of document images. Several attempts have been made so far, but a complete solution remains far off. In this paper, we propose an automatic approach for line-level handwritten script identification (HSI) covering eight official Indic scripts: Bangla, Devanagari, Kannada, Malayalam, Oriya, Roman, Telugu, and Urdu. We use a 148-dimensional feature vector built from image component fractal dimension, structural and visual appearance, directional stroke, interpolation, and Gabor energy based texture features. For classification, we divide the whole script dataset according to different regions of India in order to study region-wise classification performance. Experiments were carried out using state-of-the-art classifiers: multilayer perceptron (MLP), support vector machine (SVM), random forest (RF), and fuzzy unordered rule induction algorithm (FURIA). Among these, MLP was the best performer, with average accuracies of 98.2%, 99.5%, 99.1%, 99.5%, 99.9%, 98%, and 98.9% for the eight-script, bi-script, eastern, north, and south Indian script groups, scripts with 'matra' vs. without 'matra', and Dravidian vs. non-Dravidian groups, respectively.
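
The grouping step described in the abstract can be pictured as relabeling each text line's script label before training a classifier. The following is a minimal sketch only, not the authors' implementation: the function names `regroup` and `subset` are hypothetical, and the group memberships in `DRAVIDIAN` and `WITH_MATRA` are assumptions made for illustration, since the abstract does not list which scripts fall into which group.

```python
# Illustrative sketch only. Group memberships are ASSUMPTIONS for this
# example; the abstract does not enumerate them.
SCRIPTS = ["Bangla", "Devanagari", "Kannada", "Malayalam",
           "Oriya", "Roman", "Telugu", "Urdu"]

# Hypothetical two-way groupings over the eight scripts.
DRAVIDIAN = {"Kannada", "Malayalam", "Telugu"}
WITH_MATRA = {"Bangla", "Devanagari"}


def regroup(labels, positive_set, positive_name, negative_name):
    """Map per-line script labels to a two-class grouping."""
    unknown = set(labels) - set(SCRIPTS)
    if unknown:
        raise ValueError(f"unknown script labels: {sorted(unknown)}")
    return [positive_name if s in positive_set else negative_name
            for s in labels]


def subset(samples, labels, keep):
    """Keep only the (sample, label) pairs whose script is in `keep`."""
    pairs = [(x, y) for x, y in zip(samples, labels) if y in keep]
    return [x for x, _ in pairs], [y for _, y in pairs]


if __name__ == "__main__":
    line_labels = ["Bangla", "Telugu", "Urdu", "Kannada"]
    print(regroup(line_labels, DRAVIDIAN, "dravidian", "non-dravidian"))
```

Under these assumptions, `regroup` would produce the targets for a two-group experiment such as Dravidian vs. non-Dravidian, while `subset` would restrict the data to one regional group before training. The relabeled or filtered lines, each paired with its feature vector, could then be passed to any of the classifiers named in the abstract.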

Cite

Obaidullah, S. M., Halder, C., Santosh, K. C., Das, N., & Roy, K. (2018). Automatic line-level script identification from handwritten document images - A region-wise classification framework for Indian subcontinent. Malaysian Journal of Computer Science, 31(1), 63–84. https://doi.org/10.22452/mjcs.vol31no1.5
