In this paper we tackle the specific problem of recovering old documents. Spots, print-through, underlines and other ageing features are undesirable not only because they harm the visual appearance of the document, but also because they degrade subsequent Optical Character Recognition (OCR). This paper proposes a new method that integrates fuzzy clustering of the color properties of the original images with mathematical morphology. We show that this technique yields higher-quality recovered images and, at the same time, delivers cleaned binary text for OCR applications. The proposed method was applied to books from the 19th century, which were cleaned very effectively. © Springer-Verlag Berlin Heidelberg 2005.
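To make the combination of fuzzy color clustering and morphology concrete, the following is a minimal sketch and not the authors' implementation. It assumes a textbook fuzzy c-means with two clusters (ink versus everything else), a hypothetical 8x8 synthetic page, and a 3x3 binary opening as the morphological step. All function names, colors, and parameters are illustrative assumptions; the abstract does not specify them.

```python
# Illustrative sketch only: fuzzy c-means on RGB pixels, then a binary opening.
# Cluster count, fuzzifier m, structuring element and test page are assumptions.

def memberships(p, centers, m):
    """Fuzzy membership of pixel p in each cluster (standard FCM formula)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(p, v)) for v in centers]
    if 0.0 in d2:  # pixel coincides with a centre: crisp membership
        hit = d2.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(centers))]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [x / total for x in inv]

def update_center(pixels, weights, fallback):
    """Weighted mean of the pixels; keep the old centre if all weights vanish."""
    total = sum(weights)
    if total == 0.0:
        return fallback
    dims = range(len(pixels[0]))
    return tuple(sum(w * p[d] for w, p in zip(weights, pixels)) / total for d in dims)

def fuzzy_c_means(pixels, c=2, m=2.0, iters=30):
    """Plain fuzzy c-means on color tuples; returns (centers, memberships)."""
    distinct = sorted(set(pixels), key=sum)
    # Simplistic deterministic start (an assumption): spread the initial
    # centres evenly over the distinct colors ordered by brightness.
    centers = [distinct[i * (len(distinct) - 1) // (c - 1)] for i in range(c)]
    for _ in range(iters):
        u = [memberships(p, centers, m) for p in pixels]
        centers = [update_center(pixels, [row[i] ** m for row in u], centers[i])
                   for i in range(c)]
    return centers, [memberships(p, centers, m) for p in pixels]

def _probe(img, y, x, combine):
    """Apply all/any over the 3x3 neighbourhood; outside the image is background."""
    h, w = len(img), len(img[0])
    return int(combine(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                       for dy in (-1, 0, 1) for dx in (-1, 0, 1)))

def erode(img):
    return [[_probe(img, y, x, all) for x in range(len(img[0]))] for y in range(len(img))]

def dilate(img):
    return [[_probe(img, y, x, any) for x in range(len(img[0]))] for y in range(len(img))]

def opening(img):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(img))

# Hypothetical page: paper background, a 4x4 ink block, a stain, one ink speck.
PAPER, INK, STAIN = (230, 220, 200), (30, 30, 30), (170, 140, 100)
image = [[PAPER] * 8 for _ in range(8)]
for y in range(1, 5):
    for x in range(1, 5):
        image[y][x] = INK
for y in range(5, 8):
    for x in range(0, 3):
        image[y][x] = STAIN
image[6][6] = INK

pixels = [p for row in image for p in row]
centers, u = fuzzy_c_means(pixels)
ink = min(range(len(centers)), key=lambda i: sum(centers[i]))  # darkest cluster
mask = [[int(u[y * 8 + x][ink] > 0.5) for x in range(8)] for y in range(8)]
cleaned = opening(mask)
```

In this toy setting the stain receives only partial membership in the ink cluster and is thresholded out of `mask`, while the opening removes the isolated speck and leaves the ink block intact. A real page would need a structuring element chosen relative to stroke width, since a 3x3 opening would also erase thin strokes.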
Caldas Pinto, J. R., Bandeira, L., Sousa, J. M. C., & Pina, P. (2005). Combining fuzzy clustering and morphological methods for old documents recovery. In Lecture Notes in Computer Science (Vol. 3523, pp. 387–394). Springer Verlag. https://doi.org/10.1007/11492542_48