On OCR of degraded documents using fuzzy multifactorial analysis

2Citations
Citations of this article
6Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Optical Character Recognition (OCR) systems show poor performance while processing documents like old books or newspapers, Xerox materials, faxed documents, etc. Such documents are considered as degraded documents. One of the important reasons for poor recognition rate for degraded documents is existence of touching or connected characters, which create a major problem for designing an effective character segmentation procedure. In this paper, a new technique is proposed for segmentation of touching characters. The technique is based on fuzzy multifactorial analysis. A predictive algorithm is developed for effectively selecting cut-points to segment touching characters. Initially, our proposed method has been applied for segmenting touching characters that appear in Devnagari (Hindi) and Bangla, two major scripts in Indian sub-continent. The results obtained from a test-set of considerable size show that a high recognition rate can be achieved with a reasonable amount of computations.

Cite

CITATION STYLE

APA

Garain, U., & Chaudhuri, B. B. (2002). On OCR of degraded documents using fuzzy multifactorial analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2275, pp. 388–394). Springer Verlag. https://doi.org/10.1007/3-540-45631-7_52

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free