Revisiting N-gram based models for retrieval in degraded large collections

Abstract

Traditional retrieval models based on term matching are not effective in collections of degraded documents (for instance, the output of OCR or ASR systems). This paper presents an n-gram based distributed model for retrieval in large collections of degraded text. Evaluation was carried out with both the TREC Confusion Track and Legal Track collections, showing that the presented approach outperforms, in terms of effectiveness, the classical term-centred approach and most of the participating systems in the TREC Confusion Track. © Springer-Verlag Berlin Heidelberg 2009.
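
To illustrate why exact term matching can fail on degraded text while character n-grams can still match, here is a minimal sketch. It is not the model from the paper: the function names, the trigram size, the overlap score, and the sample OCR error are all assumptions chosen for illustration.

```python
def char_ngrams(text, n=3):
    """Collect the overlapping character n-grams of every whitespace-separated token."""
    grams = set()
    for token in text.lower().split():
        if len(token) < n:
            # Tokens shorter than n are kept whole (an illustrative choice).
            grams.add(token)
        else:
            grams.update(token[i:i + n] for i in range(len(token) - n + 1))
    return grams


def term_overlap(query, document):
    """Fraction of query terms that appear verbatim in the document."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0


def ngram_overlap(query, document, n=3):
    """Fraction of query character n-grams that appear in the document."""
    q = char_ngrams(query, n)
    d = char_ngrams(document, n)
    return len(q & d) / len(q) if q else 0.0


# Hypothetical OCR error: the final 'l' of "retrieval" was read as the digit '1'.
query = "retrieval"
ocr_document = "information retrieva1 in degraded collections"

print(term_overlap(query, ocr_document))   # exact term matching finds nothing
print(ngram_overlap(query, ocr_document))  # most trigrams still match
```

In this toy case a single misrecognised character removes the only exact term match, while six of the seven query trigrams survive. A real system would index n-grams and weight them with a ranking function rather than use a raw set overlap.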

Cite

Parapar, J., Freire, A., & Barreiro, Á. (2009). Revisiting N-gram based models for retrieval in degraded large collections. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5478 LNCS, pp. 680–684). https://doi.org/10.1007/978-3-642-00958-7_66