Chinese historic image threshold using adaptive K-means cluster and Bradley’s

Abstract

There is a growing need for text extraction techniques for Chinese heritage documents. Historic documents such as Chinese calligraphy were usually handwritten or scanned at low contrast, so automatic optical character recognition is difficult to apply to the document images. In this paper, we present a historic document image thresholding method based on a combination of Bradley’s algorithm and K-means. Adaptive K-means clustering is used as a pre-processing step to automatically group the pixels of a document image into homogeneous regions. In Bradley’s method, each pixel is set to black if its brightness is T percent lower than the average brightness of the surrounding pixels in a window of the specified size; otherwise it is set to white. Finally, text bounding boxes are generated by concatenating neighboring word clusters using mathematical morphology. Experimental results show that this algorithm is robust, in terms of both accuracy and efficiency, when dealing with non-uniformly illuminated, low-contrast historic document images.
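The Bradley rule described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `bradley_threshold`, the default window size, and the default value of T are assumptions chosen for illustration, and the K-means pre-processing and morphology steps are omitted. The sketch uses an integral image (summed-area table) so that each window mean costs a constant number of lookups.

```python
def bradley_threshold(image, window=15, t=15):
    """Binarize a grayscale image given as a list of rows of 0..255 ints.

    A pixel becomes black (0) if it is more than t percent darker than the
    mean of the window x window neighbourhood centred on it; otherwise it
    becomes white (255). `window` and `t` defaults are illustrative only.
    """
    height = len(image)
    width = len(image[0]) if height else 0

    # Integral image with a zero border:
    # integral[y][x] holds the sum of all pixels above and to the left of (y, x).
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = window // 2
    result = [[255] * width for _ in range(height)]
    for y in range(height):
        # Clip the window to the image borders.
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            count = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # pixel < mean * (100 - t) / 100, rearranged to avoid division.
            if image[y][x] * count * 100 < total * (100 - t):
                result[y][x] = 0
    return result


# A bright 5x5 patch with one dark pixel in the centre.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 50
binary = bradley_threshold(page, window=3, t=15)
```

Because the threshold is relative to the local mean rather than a single global value, a dark stroke on a dim background can still be separated from it, which is the property the abstract relies on for non-uniformly illuminated scans.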

Cite

Huang, Z. K., Ma, Y. L., Lu, L., Rao, F. X., & Hou, L. Y. (2016). Chinese historic image threshold using adaptive K-means cluster and Bradley’s. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9773, pp. 171–179). Springer Verlag. https://doi.org/10.1007/978-3-319-42297-8_17
