Using explicit word co-occurrences to improve term-based text retrieval

Abstract

Achieving high precision and recall in term-based queries over text collections is becoming increasingly crucial as the number of available documents grows and their quality tends to decrease. In particular, retrieval techniques based on strict correspondence between query terms and document terms miss relevant documents whose authors happened to choose terms slightly different from those used by the end user issuing the query. Our proposal is to explicitly consider term co-occurrences when building the vector space. Indeed, the presence in a document of terms that differ from, but are related to, those in the query should strengthen the confidence that the document is relevant. Likewise, when a query term is missing from a document but several terms closely related to it are present, this should still support the hypothesis that the document is relevant. Computationally, this relatedness is embedded through matrix operations that capture direct or indirect term co-occurrence in the collection. We propose two different approaches to implement this perspective and run preliminary experiments on a prototype implementation; the results suggest that the technique is potentially profitable. © 2010 Springer-Verlag.
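The abstract does not spell out the two approaches, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a binary term-by-document matrix A, takes A·Aᵀ as the co-occurrence matrix, and spreads part of each query term's weight onto the terms it co-occurs with before scoring documents by cosine similarity. The function names and the `alpha` and `hops` parameters are hypothetical, introduced here for illustration.

```python
import math


def build_term_doc(docs):
    """Return (vocabulary, binary term-by-document matrix as a list of rows)."""
    tokenized = [set(d.split()) for d in docs]
    vocab = sorted(set().union(*tokenized))
    matrix = [[1.0 if t in doc else 0.0 for doc in tokenized] for t in vocab]
    return vocab, matrix


def cooccurrence(matrix):
    """C = A * A^T: C[i][j] counts the documents containing both term i and term j."""
    n = len(matrix)
    return [[sum(a * b for a, b in zip(matrix[i], matrix[j])) for j in range(n)]
            for i in range(n)]


def expand_query(query_vec, cooc, alpha):
    """Give each term a share (scaled by alpha) of the weight of terms it co-occurs with."""
    n = len(query_vec)
    expanded = list(query_vec)
    for i in range(n):
        row_total = sum(cooc[i][j] for j in range(n) if j != i)
        if row_total == 0:
            continue  # term i never co-occurs with any other term
        related = sum(cooc[i][j] * query_vec[j] for j in range(n) if j != i)
        expanded[i] += alpha * related / row_total
    return expanded


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(docs, query, alpha=0.5, hops=1):
    """Score every document against the query after co-occurrence expansion.

    hops=1 uses direct co-occurrence only; hops>1 repeats the expansion,
    which is one assumed way to let indirect co-occurrence contribute.
    """
    vocab, matrix = build_term_doc(docs)
    query_terms = set(query.split())
    q = [1.0 if t in query_terms else 0.0 for t in vocab]
    cooc = cooccurrence(matrix)
    for _ in range(hops):
        q = expand_query(q, cooc, alpha)
    doc_vecs = [[row[d] for row in matrix] for d in range(len(docs))]
    return [cosine(q, dv) for dv in doc_vecs]


docs = ["car engine repair", "automobile engine repair", "banana bread recipe"]
strict_scores = rank(docs, "car", alpha=0.0)    # plain term matching
expanded_scores = rank(docs, "car", alpha=0.5)  # with co-occurrence expansion
```

With `alpha=0.0` the second document scores zero because it never mentions "car"; with expansion it receives a positive score through the shared terms "engine" and "repair", while the unrelated third document stays at zero. The weighting scheme and the normalization by row total are design choices of this sketch and would need tuning on a real collection.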

Cite

Ferilli, S., Biba, M., Basile, T. M. A., & Esposito, F. (2010). Using explicit word co-occurrences to improve term-based text retrieval. In Communications in Computer and Information Science (Vol. 91 CCIS, pp. 125–136). Springer Verlag. https://doi.org/10.1007/978-3-642-15850-6_13
