Separating indic scripts with ‘matra’—A precursor to script identification in multi-script documents

Abstract

We present a technique for separating Indic scripts according to whether they have a matra (or shirorekha), using an optimized fractal geometry analysis (FGA) as the sole feature. Separating scripts that have a matra from those that do not can serve as a precursor that eases the subsequent script identification process. We consider two matra-based scripts, Bangla and Devanagari, as positive samples, and draw the counter samples from two scripts without matra, Roman and Urdu. Altogether, we used 1204 document images: 525 matra-based (325 Bangla and 200 Devanagari) and 679 without matra (370 Roman and 309 Urdu). For experimentation, we used three classifiers, multilayer perceptron (MLP), random forest (RF), and BayesNet (BN), with the aim of selecting the best performer. Over a series of tests, the MLP classifier achieved an average accuracy of 96.44%.
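
The abstract does not say how the optimized FGA feature is computed. As a rough illustration of the general idea only, the sketch below estimates a box-counting fractal dimension for a binarized image. This is a generic textbook estimator, not the authors' method; the function names, the box sizes, and the toy images are all assumptions made for the example.

```python
import math

def box_count(image, size):
    """Count size x size boxes that contain at least one foreground pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(image[i][j]
                   for i in range(r, min(r + size, rows))
                   for j in range(c, min(c + size, cols))):
                count += 1
    return count

def box_counting_dimension(image, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s).

    Assumes the image has at least one foreground pixel;
    otherwise log(0) raises ValueError.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Toy binary images (hypothetical data, 1 = ink, 0 = background).
filled = [[1] * 16 for _ in range(16)]                             # solid block
line = [[1] * 16 if r == 0 else [0] * 16 for r in range(16)]       # one horizontal stroke

dim_filled = box_counting_dimension(filled)
dim_line = box_counting_dimension(line)
```

On these toy inputs the solid block scales like a 2-D region and the single stroke like a 1-D curve. In a pipeline of the kind the abstract describes, a value such as this would be computed per document image and passed to a classifier such as an MLP.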

Cite

Obaidullah, S. M., Goswami, C., Santosh, K. C., Halder, C., Das, N., & Roy, K. (2017). Separating indic scripts with ‘matra’—A precursor to script identification in multi-script documents. In Advances in Intelligent Systems and Computing (Vol. 459 AISC, pp. 205–214). Springer Verlag. https://doi.org/10.1007/978-981-10-2104-6_19