sMARE: a new paradigm to evaluate and understand query performance prediction methods

Abstract

Query performance prediction (QPP) has been studied extensively in the IR community over the last two decades. A by-product of this research is a methodology to evaluate the effectiveness of QPP techniques. In this paper, we re-examine the existing evaluation methodology commonly used for QPP, and propose a new approach. Our key idea is to model QPP performance as a distribution instead of relying on point estimates. To obtain such a distribution, we exploit the scaled Absolute Ranking Error (sARE) measure and its mean, the scaled Mean Absolute Ranking Error (sMARE). Our work demonstrates important statistical implications and overcomes key limitations imposed by the correlation-based point-estimate evaluation approaches currently in use. We also explore the potential benefits of using multiple query formulations and ANalysis Of VAriance (ANOVA) modeling to measure interactions between multiple factors. The resulting statistical analysis, combined with a novel evaluation framework, demonstrates the merits of modeling QPP performance as distributions and enables the creation of detailed statistical ANOVA models for comparative analyses.
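
The abstract names sARE and sMARE without giving their formulas. The sketch below is a minimal illustration under an assumed definition: the sARE of a query is the absolute difference between its rank under the predictor's scores and its rank under the actual effectiveness scores, divided by the number of queries, and sMARE is the mean of those per-query values. The function names (`average_ranks`, `sare`, `smare`), the tie handling (tied queries share their average rank), and the sample scores are all hypothetical choices made for this example, not taken from the article.

```python
from statistics import mean


def average_ranks(scores):
    """Rank scores in descending order (1 = best); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend the window [pos, end] over all queries tied with the one at pos.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def sare(predicted, actual):
    """Per-query scaled absolute ranking error (assumed definition, see lead-in)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(predicted)
    predicted_ranks = average_ranks(predicted)
    actual_ranks = average_ranks(actual)
    return [abs(p - a) / n for p, a in zip(predicted_ranks, actual_ranks)]


def smare(predicted, actual):
    """Mean of the per-query sARE values: one summary number per predictor."""
    return mean(sare(predicted, actual))


# Hypothetical scores for four queries: predictor output vs. measured effectiveness.
predictor_scores = [0.9, 0.1, 0.5, 0.3]
effectiveness = [0.8, 0.2, 0.1, 0.4]

per_query_errors = sare(predictor_scores, effectiveness)
summary = smare(predictor_scores, effectiveness)
```

Keeping the per-query list rather than only the mean is what makes a distribution-based analysis possible: each predictor contributes one error value per query, which can then feed significance tests or an ANOVA model, whereas a correlation coefficient yields a single number per predictor.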

Cite

Faggioli, G., Zendel, O., Culpepper, J. S., Ferro, N., & Scholer, F. (2022). sMARE: a new paradigm to evaluate and understand query performance prediction methods. Information Retrieval Journal, 25(2), 94–122. https://doi.org/10.1007/s10791-022-09407-w
