BM25 Beyond Query-Document Similarity

Billel Aklouche; Ibrahim Bounhas; Yahya Slimani

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11811 LNCS, pp. 65-79 (2019)

DOI: 10.1007/978-3-030-32686-9_5

Abstract

The massive growth of information produced and shared online has made retrieving relevant documents a difficult task. Query Expansion (QE) based on term co-occurrence statistics has been widely applied to improve retrieval effectiveness. However, selecting good expansion terms using co-occurrence graphs is challenging. In this paper, we present an adapted version of the BM25 model that measures the similarity between terms. First, a context window-based approach is applied over the entire corpus to construct the term co-occurrence graph. Then, using the proposed adapted version of BM25, candidate expansion terms are selected according to their similarity with the whole query. This measure stands out for its ability to evaluate the discriminative power of terms and to select terms that are semantically related to the query. Experiments on two ad-hoc TREC collections (the standard Robust04 collection and the new TREC Washington Post collection) show that our proposal outperforms the baselines over three state-of-the-art IR models and leads to significant improvements in retrieval effectiveness.
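
The two steps named in the abstract (a context-window co-occurrence graph, then a BM25-style term-to-query score) can be illustrated with a rough sketch. The abstract does not give the adapted formula, so everything below is an assumption made for illustration and not the authors' method: each term's co-occurrence neighbourhood is treated as a pseudo-document, the standard BM25 weighting is applied to it, and the function names, window size, and the k1 and b defaults are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_cooccurrence(docs, window=2):
    """Count how often two distinct terms appear within `window` positions."""
    graph = defaultdict(Counter)
    for tokens in docs:
        for i, term in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                other = tokens[j]
                if other != term:
                    graph[term][other] += 1
                    graph[other][term] += 1
    return graph


def bm25_term_similarity(graph, query_terms, candidate, k1=1.2, b=0.75):
    """BM25-style score of `candidate` against the whole query.

    Assumption: the candidate's neighbourhood in the graph plays the role
    of the document, and co-occurrence counts play the role of term
    frequencies. This is a stand-in, not the formula from the paper.
    """
    if candidate not in graph:
        return 0.0
    n = len(graph)
    avg_len = sum(sum(ctx.values()) for ctx in graph.values()) / n
    ctx = graph[candidate]
    ctx_len = sum(ctx.values())
    score = 0.0
    for q in query_terms:
        tf = ctx.get(q, 0)
        if tf == 0:
            continue
        # "Document frequency" of q: number of distinct terms it co-occurs with.
        df = len(graph.get(q, ()))
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        norm = k1 * (1 - b + b * ctx_len / avg_len)
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score


def expand(graph, query_terms, top_k=3):
    """Rank non-query terms by similarity to the query (ties broken by name)."""
    candidates = [t for t in graph if t not in query_terms]
    candidates.sort(key=lambda t: (-bm25_term_similarity(graph, query_terms, t), t))
    return candidates[:top_k]


# Toy corpus, invented for the example.
docs = [
    ["solar", "panel", "energy"],
    ["solar", "energy", "grid"],
    ["cat", "dog", "pet"],
]
graph = build_cooccurrence(docs, window=2)
print(expand(graph, ["solar"], top_k=3))
```

The length normalisation is what lets a score of this shape penalise terms that co-occur with almost everything, which is one plausible reading of the "discriminative power" mentioned in the abstract.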

Aklouche, B., Bounhas, I., & Slimani, Y. (2019). BM25 Beyond Query-Document Similarity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11811 LNCS, pp. 65–79). Springer. https://doi.org/10.1007/978-3-030-32686-9_5