Retrieval from document image collections

A. Balasubramanian; Million Meshesha; C. V. Jawahar

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 3872 LNCS 1-12

DOI: 10.1007/11669487_1

Abstract

This paper presents a system for retrieving relevant documents from large collections of printed document images. Search and retrieval are achieved by matching image features at the word level. Words are represented with profile-based and shape-based features. A novel DTW-based partial matching scheme handles morphologically variant words, which is useful for grouping similar words together during indexing. The system supports cross-lingual search using OM-Trans transliteration and a dictionary-based approach. System-level issues for retrieval (e.g., scalability and effective delivery) are also addressed. © Springer-Verlag Berlin Heidelberg 2006.
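
The abstract names profile features and DTW-based matching but does not give the details. As a rough illustration only, the sketch below computes one common profile feature (a per-column ink count) and compares two feature sequences with textbook dynamic time warping. The function names `vertical_projection` and `dtw_distance`, the absolute-difference local cost, and the length normalisation are assumptions made for this example. It is plain DTW, not the paper's partial matching scheme.

```python
from math import inf


def vertical_projection(word_image):
    """Count ink pixels (value 1) in each column of a binary word image.

    `word_image` is a list of equal-length rows of 0/1 values
    (an assumed representation, chosen for this sketch).
    """
    height = len(word_image)
    width = len(word_image[0])
    return [sum(word_image[r][c] for r in range(height)) for c in range(width)]


def dtw_distance(seq_a, seq_b):
    """Textbook DTW cost between two 1-D sequences, divided by len(a) + len(b).

    The normalisation is an illustrative choice so that words of
    different widths give comparable scores.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the best alignment cost of seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m] / (n + m)


# Two tiny hypothetical word images: the second is the first with one column repeated.
word_1 = [[1, 0, 1],
          [1, 1, 1]]
word_2 = [[1, 0, 0, 1],
          [1, 1, 1, 1]]
score = dtw_distance(vertical_projection(word_1), vertical_projection(word_2))
```

Because DTW may stretch one sequence against the other, the repeated column in `word_2` adds no cost here. A word-level index could group words whose score falls below a chosen threshold.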

Citation (APA)

Balasubramanian, A., Meshesha, M., & Jawahar, C. V. (2006). Retrieval from document image collections. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3872 LNCS, pp. 1–12). https://doi.org/10.1007/11669487_1
