Digitization of ancient palm leaf manuscripts is gaining momentum, yet it is hampered by limited datasets and the complex features of palm leaf text images. Previous studies have not deeply analyzed how current techniques apply to palm leaf manuscripts: deep learning approaches require large datasets, and some isolated glyphs contain more than one character with complex grammatical components. This paper therefore explores practical methods for improving isolated glyph classification, focusing on both the front-end and the back-end processes of the image classification task. For the front end, we present multi-task preprocessing techniques, including data augmentation, extraction of new datasets, and image enhancement, to increase the quality and quantity of the data. For the back end, we study the visual backbones of deep learning models, especially CNNs (including VGG, ResNet, and EfficientNet) and attention-based models (including ViT, DeiT, and CvT). We further analyze and evaluate how data augmentation and preprocessing interact with the amount of data used in training. We experimented on three palm leaf scripts (Balinese, Sundanese, and Khmer) using the ICFHR contest 2018, SluekRith, AMDI LontarSet, and Sunda datasets. The experiments deliver an effective way of training on palm leaf datasets for the document analysis community.
CITATION
Thuon, N., Du, J., & Zhang, J. (2022). Improving Isolated Glyph Classification Task for Palm Leaf Manuscripts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13639 LNCS, pp. 65–79). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21648-0_5