Research on text line segmentation of historical Tibetan documents based on the connected component analysis

8Citations
Citations of this article
2Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Text line segmentation is one of the critical content in handwriting documents recognition especially in the historical documents’ analysis and recognition. Because of the low quality and the complexity of these documents (background noise, scattered character, touching components between consecutive lines), automatic text line segmentation remains to be a hot spot for researching. In this paper we propose a new method to segment the text line from the historical Tibetan scripture “kangjur” of the Beijing version on the paper by means of woodcut. This method first performs document image skew detection and correction, using projection profiles to get the baseline of text line, then the connected component is allocated to text line according to the location relationship. For some connected components, analyzing their location and sharp to assign these connected components correctly. This method using connected component instead of pixels, avoiding the noise generated by splitting characters. Experiments show that this method is effective in copes with touching text lines and promising in text line segmentation from historical Tibetan document.

Cite

CITATION STYLE

APA

Wang, Y., Wang, W., Li, Z., Han, Y., & Wang, X. (2018). Research on text line segmentation of historical Tibetan documents based on the connected component analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11258 LNCS, pp. 74–87). Springer Verlag. https://doi.org/10.1007/978-3-030-03338-5_7

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free