Score distributions in information retrieval

Avi Arampatzis; Stephen Robertson; Jaap Kamps

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5766 LNCS 139-151

DOI: 10.1007/978-3-642-04417-5_13

Abstract

We review the history of modeling score distributions, focusing on the normal-exponential mixture and examining the theoretical and empirical evidence supporting its use. We discuss previously suggested conditions that valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses concerning the component distributions under limiting conditions of the parameter values. Of all the mixtures suggested in the past, the current theoretical argument points to the two-gamma mixture as the most likely universal model, with the normal-exponential being a usable approximation. Beyond the theoretical contribution, we provide new experimental evidence showing that vector space or geometric models, and BM25, are "friendly" to the normal-exponential, and that the mixture's non-convexity problem is not severe in practice. © 2009 Springer Berlin Heidelberg.
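
To make the normal-exponential mixture and the convexity condition concrete, here is a minimal sketch, not the authors' code. It assumes relevant-document scores follow a normal distribution with mean `mu` and standard deviation `sigma`, and non-relevant scores follow an exponential distribution with rate `lam` on non-negative scores. All function names and the example parameter values are hypothetical. Under these assumptions the likelihood ratio of relevant to non-relevant density peaks at `mu + lam * sigma**2` and falls above that score, which is one way to see where the recall-fallout curve of this mixture stops being convex.

```python
import math

def normal_pdf(s, mu, sigma):
    """Density of the assumed relevant-score distribution."""
    z = (s - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def exponential_pdf(s, lam):
    """Density of the assumed non-relevant-score distribution (s >= 0)."""
    return lam * math.exp(-lam * s) if s >= 0 else 0.0

def mixture_pdf(s, mu, sigma, lam, g):
    """Total score density; g is the assumed fraction of relevant documents."""
    return g * normal_pdf(s, mu, sigma) + (1.0 - g) * exponential_pdf(s, lam)

def recall(t, mu, sigma):
    """P(score > t | relevant): upper tail of the normal component."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))

def fallout(t, lam):
    """P(score > t | non-relevant): upper tail of the exponential component."""
    return math.exp(-lam * t) if t >= 0 else 1.0

def likelihood_ratio(s, mu, sigma, lam):
    """Relevant density over non-relevant density at score s >= 0."""
    return normal_pdf(s, mu, sigma) / exponential_pdf(s, lam)

def convexity_breakpoint(mu, sigma, lam):
    """Score above which the likelihood ratio starts to decrease.

    Obtained by setting the derivative of the log likelihood ratio,
    -(s - mu) / sigma**2 + lam, to zero.
    """
    return mu + lam * sigma ** 2

if __name__ == "__main__":
    mu, sigma, lam = 2.0, 1.0, 1.0  # illustrative values, not from the paper
    s_star = convexity_breakpoint(mu, sigma, lam)
    print("breakpoint score:", s_star)
    print("recall above breakpoint:", recall(s_star, mu, sigma))
    print("fallout above breakpoint:", fallout(s_star, lam))
```

The recall at the breakpoint gives a rough sense of how much of the relevant mass sits in the non-convex region for a given parameter setting; whether that region matters in practice depends on the fitted parameters.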

Citation (APA)

Arampatzis, A., Robertson, S., & Kamps, J. (2009). Score distributions in information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5766 LNCS, pp. 139–151). https://doi.org/10.1007/978-3-642-04417-5_13
