Automatic indexing of newspaper microfilm images

Qing Hong Liu; Chew Lim Tan

Journal Article, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2002), vol. 2423, pp. 365-375

DOI: 10.1007/3-540-45869-7_41


Abstract

This paper describes a proposed document analysis system for automatically indexing digitized images of old newspaper microfilms. The system extracts news headlines from the microfilm images and converts them to machine-readable text by OCR, so that the headlines serve as indices to their respective news articles. A major challenge is the poor image quality of the microfilm: most images are inadequately illuminated and considerably dirty. Because conventional threshold selection techniques are inadequate for such images, we propose a new, effective method for separating characters from the noisy background. A Run Length Smearing Algorithm (RLSA) is then applied to extract the headlines. Experimental results confirm the validity of the approach. © 2002 Springer-Verlag Berlin Heidelberg.
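
The abstract names run-length smearing but gives no details, so the following is only a minimal sketch of the textbook form of RLSA, not the authors' implementation. It assumes a binary image stored as a list of rows with 1 for ink and 0 for background. The function names and the threshold values are hypothetical, and the choice to leave background runs that touch the image border unfilled is an assumption of this sketch.

```python
def rlsa_1d(line, threshold):
    """Smear one row or column of a binary image.

    1 = ink, 0 = background. A run of 0s that lies between two 1s and is
    no longer than `threshold` is filled with 1s. Runs touching either
    end of the line are left unchanged (an assumption of this sketch).
    """
    out = list(line)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(line):
        if pixel == 1:
            if last_ink is not None:
                gap = i - last_ink - 1
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply the 1-D smearing to every row."""
    return [rlsa_1d(row, threshold) for row in image]


def rlsa_vertical(image, threshold):
    """Apply the 1-D smearing to every column (via transposition)."""
    columns = [list(col) for col in zip(*image)]
    smeared = [rlsa_1d(col, threshold) for col in columns]
    return [list(row) for row in zip(*smeared)]


def rlsa(image, h_threshold, v_threshold):
    """Combine horizontal and vertical smearing with a pixel-wise AND."""
    horizontal = rlsa_horizontal(image, h_threshold)
    vertical = rlsa_vertical(image, v_threshold)
    return [[a & b for a, b in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]


# A gap of 2 is filled, a gap of 4 is not, with threshold 2.
print(rlsa_1d([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], 2))
# → [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

With a suitably large horizontal threshold, the characters of a text line merge into a single block. A headline detector could then, for example, select blocks by height, although the selection criteria used in the paper are not stated in this record.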

Citation (APA)

Liu, Q. H., & Tan, C. L. (2002). Automatic indexing of newspaper microfilm images. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2423, 365–375. https://doi.org/10.1007/3-540-45869-7_41
