Evaluating the Predictivity of IR Experiments

Abstract

Experimental evaluation is regarded as a critical element of any research activity in Information Retrieval, and is typically used to support assertions of the form "Technique A provides better retrieval effectiveness than does Technique B". Implicit in such claims are the characteristics of the data to which the results apply, in terms of both the queries used and the documents they were applied to. Here we explore the role of evaluation on a collection as a prediction of relative performance on collections that have different characteristics. In particular, by synthesizing new collections that vary from each other in a controlled way, we show that it is possible to explore the reliability of an IR evaluation pipeline, and to better understand the complex interrelationship between documents, queries, and metrics that is an important part of any experimental validation. Our results show that predictivity declines as the collection is varied, even in simple ways such as shifting in focus from one document source to another similar source.
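The abstract frames predictivity as the extent to which relative system orderings observed on one collection carry over to a collection with different characteristics. As a concrete illustration only, the following minimal sketch shows one plausible way such agreement could be quantified, using Kendall's tau rank correlation between per-system effectiveness scores on an original collection and on a synthesized variant. The system names and scores here are invented for illustration, and the paper's actual measures and methodology may differ.

```python
# Hypothetical sketch: measure how well the relative ordering of systems on
# one collection predicts their ordering on a varied collection, using
# Kendall's tau. Systems, scores, and the choice of tau are illustrative
# assumptions, not the paper's confirmed methodology.
from scipy.stats import kendalltau

# Mean effectiveness scores (e.g., AP) for the same five systems, measured
# on an original collection and on a synthesized variant of it.
scores_original = {"sysA": 0.312, "sysB": 0.298, "sysC": 0.341,
                   "sysD": 0.275, "sysE": 0.329}
scores_variant  = {"sysA": 0.305, "sysB": 0.310, "sysC": 0.322,
                   "sysD": 0.268, "sysE": 0.331}

systems = sorted(scores_original)
tau, p_value = kendalltau([scores_original[s] for s in systems],
                          [scores_variant[s] for s in systems])

# A tau near 1.0 means the relative ordering of systems is preserved on the
# varied collection; lower values indicate declining predictivity.
print(f"Kendall's tau = {tau:.3f} (p = {p_value:.3f})")
```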

Citation (APA)

Rashidi, L., Zobel, J., & Moffat, A. (2021). Evaluating the Predictivity of IR Experiments. In SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1667–1671). Association for Computing Machinery, Inc. https://doi.org/10.1145/3404835.3463040
