Query length, number of classes and routes through clusters: Experiments with a clustering method for information retrieval

Patrice Bellot; Marc El-Bèze

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (1999) 1749 196-205

DOI: 10.1007/978-3-540-46652-9_19

4 Citations

3 Readers


Abstract

A classical information retrieval system ranks documents according to the distance between each text and the user query. The answer list is often so long that users cannot examine every retrieved document, and some relevant documents are ranked so low that they are never seen. To address this problem, the retrieved documents are clustered automatically. We describe an algorithm based on hierarchical and clustering methods. It classifies the set of documents retrieved by any IR system. The method is evaluated over the TREC-7 corpora and queries. We show that it improves retrieval results by providing users with at least one high-precision cluster. We also examine the impact of the number of clusters and of the order in which clusters are browsed to build a reordered list. Over the TREC corpora and queries, we show that choosing the number of clusters according to query length improves results compared with a number fixed in advance.
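The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea: cluster an already ranked result list, then walk the clusters in order of similarity to the query to build a reordered list. Everything in it is an assumption made for illustration. Plain k-means over bag-of-words vectors with cosine similarity stands in for the authors' clustering method, and `choose_k` is a hypothetical rule (one cluster per query term) standing in for whatever relation between query length and number of clusters the paper evaluates.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def choose_k(query, n_docs):
    """Hypothetical rule: one cluster per query term, at least 2, at most n_docs."""
    return max(1, min(n_docs, max(2, len(query.split()))))


def kmeans(vectors, k, iterations=10):
    """Plain k-means with cosine similarity, seeded with the top-k ranked documents."""
    centers = [Counter(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its most similar center.
        assign = [max(range(k), key=lambda c: cosine(v, centers[c])) for v in vectors]
        # Recompute each center as the sum of its members (cosine ignores scale).
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centers[c] = total
    return assign, centers


def rerank(query, ranked_docs, k=None):
    """Return document indices reordered cluster by cluster."""
    q = vectorize(query)
    vectors = [vectorize(d) for d in ranked_docs]
    k = k or choose_k(query, len(ranked_docs))
    assign, centers = kmeans(vectors, k)
    # Visit clusters from the most to the least query-similar centroid.
    order = sorted(range(k), key=lambda c: cosine(q, centers[c]), reverse=True)
    reordered = []
    for c in order:
        # Inside a cluster, keep the original ranking of the IR system.
        reordered.extend(i for i, a in enumerate(assign) if a == c)
    return reordered


docs = [
    "stock market prices fall",
    "jaguar car engine review",
    "jaguar cat habitat jungle",
    "jaguar car dealer prices",
]
print(rerank("jaguar car", docs))  # prints [1, 2, 3, 0]
```

In this toy run the off-topic document that the original ranking placed first drops to the end, because its cluster is the one least similar to the query. Visiting whole clusters in sequence is only one possible route through the clusters; interleaving documents from several clusters would be another.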

Cite

Bellot, P., & El-Bèze, M. (1999). Query length, number of classes and routes through clusters: Experiments with a clustering method for information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1749, pp. 196–205). Springer Verlag. https://doi.org/10.1007/978-3-540-46652-9_19
