Information retrieval (IR) ranking models in production systems continually evolve in response to user feedback, insights from research, and new developments. Rather than investing all engineering resources in producing a single challenger to the existing system, a commercial provider might choose to explore multiple new ranking models simultaneously. However, even small changes to a complex model can have unintended consequences. In particular, the per-topic effectiveness profile is likely to change, and even when an overall improvement is achieved, gains are rarely observed for every query, introducing the risk that some users or queries may be negatively impacted by the new model if it is deployed into production. Risk adjustments that re-weight losses relative to gains and mitigate such behavior are available when making one-to-one system comparisons, but not for one-to-many or many-to-one comparisons. Moreover, no existing IR evaluation methodology integrates priors from previous or alternative rankers in a homogeneous inferential framework. In this work, we propose a Bayesian approach in which multiple challengers are compared to a single champion. We also show that risk can be incorporated, and demonstrate the benefits of doing so. Finally, we consider the alternative scenario commonly encountered in academic research, where a single challenger is compared against several previous champions.
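The risk adjustments referred to in the abstract follow the general pattern of measures such as URisk (Wang et al., SIGIR 2012), in which per-topic losses relative to a baseline are weighted more heavily than gains. The following is a minimal illustrative sketch of that idea, not the paper's method; the function name, the default α, and the toy effectiveness scores are assumptions made for illustration.

```python
import numpy as np

def urisk(champion: np.ndarray, challenger: np.ndarray, alpha: float = 1.0) -> float:
    """Risk-adjusted mean difference between a challenger and a champion.

    Per-topic deltas where the challenger wins count once; deltas where it
    loses are penalized by a factor of (1 + alpha). With alpha = 0 this
    reduces to the plain mean difference; larger alpha is more risk-averse.
    """
    delta = challenger - champion           # per-topic effectiveness differences
    gains = delta[delta > 0].sum()          # total improvement on topics that gained
    losses = -delta[delta < 0].sum()        # total degradation on topics that lost
    return (gains - (1.0 + alpha) * losses) / len(delta)

# Hypothetical per-topic AP scores for five topics under each system.
champion   = np.array([0.30, 0.55, 0.20, 0.60, 0.45])
challenger = np.array([0.35, 0.50, 0.40, 0.58, 0.45])
print(urisk(champion, challenger, alpha=1.0))   # positive despite two losing topics
```

Note that such a measure is defined for a single champion-versus-challenger pair, which is precisely the one-to-one limitation the paper's Bayesian framework is designed to relax.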
Benham, R., Carterette, B., Culpepper, J. S., & Moffat, A. (2020). Bayesian Inferential Risk Evaluation on Multiple IR Systems. In SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 339–348). Association for Computing Machinery. https://doi.org/10.1145/3397271.3401033