We introduce the hypergeometric models KL, DLH and DLLH using the DFR approach, and we compare these models to other relevant models of IR. The hypergeometric models are based on the probability of observing two probabilities: the relative within-document term frequency and the entire collection term frequency. Hypergeometric models are parameter-free models of IR. Experiments show that these models perform very well on both small and very large collections. We provide their foundations from the same IR probability space as language modelling (LM). We finally discuss the difference between DFR and LM. Briefly, DFR is a frequentist (Type I), or combinatorial, approach, whilst language models use a Bayesian (Type II) approach for mixing the two probabilities and are thus inherently parametric in nature. © Springer-Verlag Berlin Heidelberg 2006.
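The contrast drawn in the abstract between a parameter-free divergence and a parametric mixture of the two probabilities can be illustrated with a minimal sketch. It is not the paper's KL, DLH or DLLH formula: the function names are hypothetical, the first function is a generic Kullback-Leibler style term weight, and the second assumes simple linear interpolation as one common way a language model might mix the document and collection probabilities.

```python
import math


def kl_term_weight(tf, doc_len, coll_tf, coll_len):
    """Parameter-free weight: divergence of the within-document relative
    frequency from the collection relative frequency (illustrative only)."""
    p_doc = tf / doc_len        # relative within-document term frequency
    p_coll = coll_tf / coll_len  # relative frequency in the whole collection
    if p_doc == 0.0:
        return 0.0              # a term absent from the document contributes nothing
    return p_doc * math.log2(p_doc / p_coll)


def mixed_lm_prob(tf, doc_len, coll_tf, coll_len, lam):
    """Parametric mixture: linear interpolation of the same two
    probabilities, where lam is a smoothing weight that must be chosen
    or tuned (assumed here for illustration)."""
    return (1.0 - lam) * (tf / doc_len) + lam * (coll_tf / coll_len)
```

The first function needs only the four counts, whereas the second cannot be evaluated until a value for `lam` is fixed, which is the sense in which the mixing approach is parametric.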
CITATION STYLE
Amati, G. (2006). Frequentist and Bayesian approach to information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3936 LNCS, pp. 13–24). https://doi.org/10.1007/11735106_3