Query length, number of classes and routes through clusters: Experiments with a clustering method for information retrieval

Abstract

A classical information retrieval system ranks documents according to the distance between each text and the user query. The answer list is often so long that users cannot examine every retrieved document, while some relevant documents are ranked so poorly that users never reach them. To address this problem, the retrieved documents are clustered automatically. We describe an algorithm, based on hierarchical and clustering methods, that classifies the set of documents retrieved by any IR system. The method is evaluated on the TREC-7 corpora and queries. We show that it improves retrieval results by providing users with at least one high-precision cluster. We also examine the impact of the number of clusters and of the way the clusters are browsed to build a reordered list. On the TREC corpora and queries, we show that choosing the number of clusters according to query length improves results compared with a number fixed in advance.
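To make the pipeline in the abstract concrete, here is a minimal Python sketch of the general idea: cluster the retrieved list, then browse the clusters in order of similarity to the query to build a reordered list. It is not the authors' algorithm. The abstract does not specify the clustering procedure, the distance, or the rule linking query length to the number of clusters, so the k-means-style loop, the cosine similarity, and the hypothetical `num_clusters` rule (one cluster per query term, clamped to a range) are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def vec(text):
    # Bag-of-words term counts (assumed representation).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def num_clusters(query, lo=2, hi=8):
    # Hypothetical rule: one cluster per query term, clamped to [lo, hi].
    return max(lo, min(hi, len(query.split())))

def cluster(docs, k, iters=10):
    # docs are texts in their original rank order.
    # Seeds are the k top-ranked documents (an assumption, keeps it deterministic).
    vecs = [vec(d) for d in docs]
    k = min(k, len(docs))
    centroids = [Counter(v) for v in vecs[:k]]
    assign = []
    for _ in range(iters):
        new = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vecs]
        if new == assign:
            break
        assign = new
        for c in range(k):
            members = [vecs[i] for i, a in enumerate(assign) if a == c]
            if members:
                # Summed counts serve as the centroid; cosine ignores scale.
                centroids[c] = sum(members, Counter())
    return assign, centroids

def reorder(query, docs):
    # Visit clusters from most to least similar to the query, keeping the
    # original rank order inside each cluster. Returns document indices.
    assign, centroids = cluster(docs, num_clusters(query))
    q = vec(query)
    order = sorted(range(len(centroids)), key=lambda c: -cosine(q, centroids[c]))
    return [i for c in order for i, a in enumerate(assign) if a == c]
```

Calling `reorder(query, docs)` with the retrieved texts in rank order returns the indices of those texts in the new order, so documents in the cluster closest to the query move to the top of the list even if they were ranked low originally.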

Cite

Bellot, P., & El-Bèze, M. (1999). Query length, number of classes and routes through clusters: Experiments with a clustering method for information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1749, pp. 196–205). Springer Verlag. https://doi.org/10.1007/978-3-540-46652-9_19
