Evaluating the Predictivity of IR Experiments

Abstract

Experimental evaluation is regarded as a critical element of any research activity in Information Retrieval, and is typically used to support assertions of the form "Technique A provides better retrieval effectiveness than does Technique B". Implicit in such claims are the characteristics of the data to which the results apply, in terms of both the queries used and the documents they were applied to. Here we explore the role of evaluation on a collection as a prediction of relative performance on collections that have different characteristics. In particular, by synthesizing new collections that vary from each other in a controlled way, we show that it is possible to explore the reliability of an IR evaluation pipeline, and to better understand the complex interrelationship between documents, queries, and metrics that is an important part of any experimental validation. Our results show that predictivity declines as the collection is varied, even in simple ways such as shifting in focus from one document source to another similar source.
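The abstract frames predictivity as the extent to which relative system orderings observed on one collection carry over to a collection with different characteristics. As a concrete illustration only, the following minimal sketch shows one plausible way such agreement could be quantified, using Kendall's tau rank correlation between per-system effectiveness scores on an original collection and on a synthesized variant. The system names and scores here are invented for illustration, and the paper's actual measures and methodology may differ.

```python
# Hypothetical sketch: measure how well the relative ordering of systems on
# one collection predicts their ordering on a varied collection, using
# Kendall's tau. Systems, scores, and the choice of tau are illustrative
# assumptions, not the paper's confirmed methodology.
from scipy.stats import kendalltau

# Mean effectiveness scores (e.g., AP) for the same five systems, measured
# on an original collection and on a synthesized variant of it.
scores_original = {"sysA": 0.312, "sysB": 0.298, "sysC": 0.341,
                   "sysD": 0.275, "sysE": 0.329}
scores_variant  = {"sysA": 0.305, "sysB": 0.310, "sysC": 0.322,
                   "sysD": 0.268, "sysE": 0.331}

systems = sorted(scores_original)
tau, p_value = kendalltau([scores_original[s] for s in systems],
                          [scores_variant[s] for s in systems])

# A tau near 1.0 means the relative ordering of systems is preserved on the
# varied collection; lower values indicate declining predictivity.
print(f"Kendall's tau = {tau:.3f} (p = {p_value:.3f})")
```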

Citation (APA)

Rashidi, L., Zobel, J., & Moffat, A. (2021). Evaluating the Predictivity of IR Experiments. In SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1667–1671). Association for Computing Machinery, Inc. https://doi.org/10.1145/3404835.3463040
