The Dataset for Printed Brahmi Word Recognition

Abstract

A publicly available dataset is important for character, word, or document recognition. Using a standardized dataset allows a fair and reliable comparison between the performance of the underlying recognition algorithms. Research on Brahmi word recognition has achieved encouraging results. However, there is no publicly available standardized Brahmi dataset. In this paper, the steps in producing a publicly available Brahmi dataset are presented. These steps include data collection, segmentation, storage, labeling, and statistical distribution. A total of 7,011 images of Brahmi characters were collected. The collected characters fall into three types: vowels, consonants, and compound characters. In total, there are 170 classes: 4 vowel classes, 27 consonant classes, and 139 compound-character classes. The images in these 170 classes are further divided into training and testing sets, with 6,475 images in the training set and 536 images in the testing set.
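The figures reported in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the constant and function names are hypothetical, and only the numbers themselves come from the abstract. It confirms that the per-type class counts sum to 170 and that the training and testing sets sum to 7,011, and it derives the test fraction and the mean number of images per class.

```python
# Counts as reported in the abstract; all names here are hypothetical.
CLASS_COUNTS = {"vowel": 4, "consonant": 27, "compound": 139}
SPLIT_COUNTS = {"train": 6475, "test": 536}
REPORTED_CLASSES = 170
REPORTED_IMAGES = 7011


def summarize_dataset():
    """Check the reported totals and derive simple summary statistics."""
    total_classes = sum(CLASS_COUNTS.values())
    total_images = sum(SPLIT_COUNTS.values())
    return {
        "classes": total_classes,
        "images": total_images,
        "classes_consistent": total_classes == REPORTED_CLASSES,
        "images_consistent": total_images == REPORTED_IMAGES,
        # Share of all images held out for testing.
        "test_fraction": SPLIT_COUNTS["test"] / total_images,
        # Average over all classes; the per-class distribution is not
        # given in the abstract and may be far from uniform.
        "mean_images_per_class": total_images / total_classes,
    }


if __name__ == "__main__":
    for key, value in summarize_dataset().items():
        print(f"{key}: {value}")
```

The mean is only an average: the abstract does not state how many images each class contains, so individual classes may hold far more or far fewer samples.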

Cite

Gautam, N., Chai, S. S., & Gautam, M. (2020). The Dataset for Printed Brahmi Word Recognition. In Lecture Notes in Networks and Systems (Vol. 106, pp. 125–133). Springer. https://doi.org/10.1007/978-981-15-2329-8_13
