Back to our roots for retrieving very short passages


Abstract

This article tackles the task of retrieving very short documents via even shorter queries. The problem at hand arises in the retrieval of tweets, image and table captions, short text messages (SMS), and sponsored retrieval, among others. In such cases, document and/or query expansion using thesauri and other external resources (e.g., Wikipedia), usually available on the World Wide Web (WWW), has proven to be an effective approach. However, the focus of this paper is on documents written in lesser-known languages, for which the WWW is of limited use. Our experiments are based on two main corpora extracted from historical manuscripts written in Latin and Middle High German. We found that when very short documents of quite similar length are retrieved via short queries and no external enrichment resources are available, the classical tf-idf model performs as satisfactorily as the more complex models, and sometimes better.

Cite

Naji, N., & Savoy, J. (2013). Back to our roots for retrieving very short passages. In Proceedings of the ASIST Annual Meeting (Vol. 50). John Wiley and Sons Inc. https://doi.org/10.1002/meet.14505001035
