Text extraction using component analysis and neuro-fuzzy classification on complex backgrounds

Michael Makridis; Nikolaos E. Mitrakis; Nikolaos Nikolaou; Nikolaos Papamarkos

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6688 LNCS 742-751

DOI: 10.1007/978-3-642-21227-7_69

Abstract

This paper proposes a new technique for text extraction from complex color documents and book covers. Its novelty is that, contrary to many existing techniques, it has been designed to deal successfully with documents that have complex backgrounds, character size variations and different fonts. The number of colors in each document image is reduced automatically to a relatively small number (usually below ten), and each document is divided into binary images. Then, connected component analysis is performed and homogeneous groups of connected components (CCs) are created. A set of features is extracted for each group of CCs. Finally, each group is classified as text or non-text using a neuro-fuzzy classifier. The proposed technique can be summarized in four consecutive stages. In the first stage, a pre-processing algorithm filters noisy CCs. Afterwards, CC grouping is performed. Then, a set of nine local and global features is extracted for each group, and finally a classification procedure detects the document's text regions. Experimental results demonstrate the efficiency of the proposed technique, which can be further extended to deal with even more complex text extraction problems. © 2011 Springer-Verlag.
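
The early stages described in the abstract (splitting a color-reduced image into binary images, running connected component analysis, and filtering noisy CCs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the color reduction method, the noise criterion, the grouping rule, the nine features, or the neuro-fuzzy classifier, so none of those are reproduced here. The function names, the use of 8-connectivity, and the `min_pixels` size threshold are all assumptions made for illustration, and the input is assumed to be already reduced to a few color indices.

```python
from collections import deque


def split_into_binary_layers(indexed):
    """Return {color_index: binary image} for an already color-reduced image.

    `indexed` is a list of rows of small integer color indices (assumed input).
    """
    colors = sorted({c for row in indexed for c in row})
    return {
        c: [[1 if px == c else 0 for px in row] for row in indexed]
        for c in colors
    }


def connected_components(binary):
    """Collect 8-connected foreground regions as lists of (row, col) pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def filter_noise(components, min_pixels=2):
    """Drop components smaller than `min_pixels` (a hypothetical noise rule)."""
    return [c for c in components if len(c) >= min_pixels]


# A tiny color-indexed image: 0 = background, 1 and 2 = two foreground colors.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]

layers = split_into_binary_layers(image)
for color, layer in layers.items():
    kept = filter_noise(connected_components(layer))
    print(color, [len(c) for c in kept])
```

In a fuller pipeline the surviving components of each layer would then be grouped and described by features before classification; those later stages are left out because the abstract gives no detail on them.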

Cite

Makridis, M., Mitrakis, N. E., Nikolaou, N., & Papamarkos, N. (2011). Text extraction using component analysis and neuro-fuzzy classification on complex backgrounds. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6688 LNCS, pp. 742–751). https://doi.org/10.1007/978-3-642-21227-7_69
