The Dataset for Printed Brahmi Word Recognition

Neha Gautam; Soo See Chai; Megha Gautam

Book Chapter

Springer, (2020), 125-133

DOI: 10.1007/978-981-15-2329-8_13

Abstract

Publicly available datasets are important for character, word, and document recognition. A standardized dataset allows a fair and reliable comparison of the performance of the underlying recognition algorithms. Research on Brahmi word recognition has achieved encouraging results. However, there is no publicly available standardized Brahmi dataset. In this paper, the steps in producing a publicly available Brahmi dataset are presented. These steps include data collection, segmentation, storage, labeling, and statistical distribution. A total of 7,011 images of Brahmi characters were collected. The collected dataset is divided into three categories: vowels, consonants, and compound characters. In total, there are 170 classes: 4 classes of vowels, 27 classes of consonants, and 139 classes of compound characters. The images in the 170 classes are further divided into training and testing sets, with 6,475 images in the training set and 536 images in the testing set.
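
The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the chapter: the dictionary names and the `split_fractions` helper are hypothetical, and the only data used are the figures stated above.

```python
# Figures as quoted in the abstract; all names here are hypothetical.
CLASS_COUNTS = {"vowels": 4, "consonants": 27, "compound characters": 139}
SPLIT_COUNTS = {"train": 6475, "test": 536}
TOTAL_CLASSES = 170
TOTAL_IMAGES = 7011


def split_fractions(split_counts):
    """Return each split's share of the total number of images."""
    total = sum(split_counts.values())
    return {name: count / total for name, count in split_counts.items()}


# The per-category class counts should add up to the stated class total,
# and the two splits should add up to the stated image total.
assert sum(CLASS_COUNTS.values()) == TOTAL_CLASSES
assert sum(SPLIT_COUNTS.values()) == TOTAL_IMAGES

fractions = split_fractions(SPLIT_COUNTS)
for name, frac in fractions.items():
    print(f"{name}: {SPLIT_COUNTS[name]} images ({frac:.1%})")
```

Under these figures the split works out to roughly 92.4% training and 7.6% testing images.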

Cite

Gautam, N., Chai, S. S., & Gautam, M. (2020). The Dataset for Printed Brahmi Word Recognition. In Lecture Notes in Networks and Systems (Vol. 106, pp. 125–133). Springer. https://doi.org/10.1007/978-981-15-2329-8_13
