Probabilistic retrieval models usually rank documents by a scalar quantity. However, such models provide no estimate of the uncertainty associated with a document's rank. Further, they seldom optimize an explicit utility (or cost) when ranking documents. To address these issues, we take a Bayesian perspective that explicitly considers the uncertainty in estimating the probability of relevance, and propose an asymmetric cost function for document ranking. Our cost function has the advantage of adjusting the risk in document retrieval via a single parameter for any probabilistic retrieval model. We use the logit model to transform the document posterior distribution with probability space [0,1] into a normal distribution with variable space (-∞,+∞). We apply our risk adjustment approach to a language modelling framework for risk-adjustable document ranking. Our experimental results show that our risk-aware model can significantly improve the performance of language models, both with and without background smoothing. When our method is applied to a language model without background smoothing, it can perform as well as the Dirichlet smoothing approach.
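As a rough illustration of the idea, the sketch below maps a relevance probability onto the real line with the logit function and then ranks documents by a score that trades the estimated mean against its variance through a single risk parameter. The abstract does not give the form of the asymmetric cost function, so the mean-variance score used here is an assumption made for illustration only, and the names `risk_adjusted_score`, `rank`, and the parameter `b` are hypothetical.

```python
import math


def logit(p):
    """Map a probability strictly inside (0, 1) to the real line."""
    # The logit is undefined at the endpoints 0 and 1, so reject them.
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def risk_adjusted_score(mean, variance, b):
    """Hypothetical mean-variance score (an assumption, not the paper's formula).

    b > 0 penalizes uncertain documents (risk-averse),
    b < 0 rewards them (risk-seeking), and b = 0 ranks by the mean alone.
    """
    return mean - b * variance


def rank(docs, b):
    """Sort (doc_id, mean, variance) triples by descending risk-adjusted score."""
    return sorted(
        docs,
        key=lambda d: risk_adjusted_score(d[1], d[2], b),
        reverse=True,
    )


# Illustrative values only: means are logit-transformed probabilities,
# variances are made-up uncertainties in logit space.
docs = [
    ("d1", logit(0.7), 0.9),  # higher mean, high uncertainty
    ("d2", logit(0.6), 0.1),  # lower mean, low uncertainty
]

print([d[0] for d in rank(docs, 0.0)])
print([d[0] for d in rank(docs, 1.0)])
```

With `b = 0` the ranking follows the mean alone, so `d1` comes first. With `b = 1` the high variance of `d1` outweighs its higher mean and `d2` moves to the top, which is the kind of single-parameter risk adjustment the abstract describes.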
CITATION STYLE
Zhu, J., Wang, J., Taylor, M., & Cox, I. J. (2009). Risk-Aware information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5478 LNCS, pp. 17–28). https://doi.org/10.1007/978-3-642-00958-7_5