Bridging language modeling and divergence from randomness models: A log-logistic model for IR

Stéphane Clinchant; Eric Gaussier

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5766 LNCS 54-65

DOI: 10.1007/978-3-642-04417-5_6

12 Citations

10 Readers

Abstract

In this paper, we revisit the Divergence from Randomness (DFR) approach to Information Retrieval (IR) in order to better understand the different contributions it relies on, and thus to simplify it. To do so, we first introduce an analytical characterization of heuristic retrieval constraints and review several DFR models with respect to this characterization. This review shows that the first normalization principle of DFR is necessary to make the model compliant with retrieval constraints. We then show that the log-logistic distribution can be used to derive a simplified DFR model. Interestingly, this simplified model contains Language Models (LM) with Jelinek-Mercer smoothing. The relation we propose here is, to our knowledge, the first connection between the DFR and LM approaches. Lastly, we present experimental results obtained on several standard collections, which validate the simplification and the model we propose. © 2009 Springer Berlin Heidelberg.
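
The abstract states that the simplified log-logistic model contains language models with Jelinek-Mercer smoothing, without giving formulas. The sketch below illustrates one way such a correspondence can arise; it is not taken from the paper. It assumes a log-logistic distribution with shape 1, whose survival function is P(X > x) = scale / (scale + x), and uses the standard rank-equivalent form of the Jelinek-Mercer query-likelihood term. The function names, the choice of tf / doc_len as the normalized frequency, and the scale parameter are all illustrative assumptions.

```python
import math


def log_logistic_information(x, scale):
    """Return -log P(X > x) for a log-logistic variable with shape 1.

    With shape 1 the survival function is scale / (scale + x), so the
    information is log((scale + x) / scale) = log(1 + x / scale).
    """
    return math.log((scale + x) / scale)


def jelinek_mercer_term(tf, doc_len, p_collection, lam):
    """Rank-equivalent per-term score under Jelinek-Mercer smoothing.

    lam is the weight on the collection model. The document-independent
    constant log(lam * p_collection) has been dropped from the score.
    """
    p_doc = tf / doc_len  # maximum-likelihood document model
    return math.log(1.0 + ((1.0 - lam) / lam) * p_doc / p_collection)


def log_logistic_term(tf, doc_len, p_collection, lam):
    """Log-logistic information with an assumed parameterization.

    Assumptions (not from the paper): the normalized frequency is
    tf / doc_len and the scale is (lam / (1 - lam)) * p_collection.
    """
    x = tf / doc_len
    scale = (lam / (1.0 - lam)) * p_collection
    return log_logistic_information(x, scale)


if __name__ == "__main__":
    # Hypothetical statistics for a single query term in a single document.
    args = dict(tf=3, doc_len=120, p_collection=0.002, lam=0.4)
    print(jelinek_mercer_term(**args))
    print(log_logistic_term(**args))
```

Under these assumptions the two functions compute the same quantity, since log((scale + x) / scale) = log(1 + x / scale). The parameterization actually used by the authors may differ and should be taken from the full text.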

Citation (APA)

Clinchant, S., & Gaussier, E. (2009). Bridging language modeling and divergence from randomness models: A log-logistic model for IR. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5766 LNCS, pp. 54–65). https://doi.org/10.1007/978-3-642-04417-5_6
