Abstract
We introduce a new technique for analyzing combination models. The technique allows us to draw qualitative conclusions about which IR systems should be combined. We achieve this by using a linear regression to accurately (r² = 0.98) predict the performance of the combined system from quantitative measurements of the individual component systems taken from TREC5. When applied to a linear model (a weighted sum of relevance scores), the technique supports several previously suggested hypotheses: one should maximize both the individual systems' performances and the overlap of relevant documents between systems, while minimizing the overlap of nonrelevant documents. It also suggests new conclusions: both systems should distribute scores similarly, but should not rank relevant documents similarly. It furthermore suggests that the linear model can exploit only a fraction of the benefit possible from combination. The technique is general in nature and capable of pointing out the strengths and weaknesses of any given combination approach.
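As an illustration of the linear model mentioned in the abstract (a weighted sum of relevance scores), the following is a minimal sketch and not the authors' implementation. The function names, the toy scores, the equal weights, and the choice to treat a document missing from one system as having score 0.0 are all assumptions made for this example. Average precision is used here only as a familiar performance measure; the abstract does not say which measure the paper uses.

```python
from typing import Dict, List, Set


def combine_linear(scores_a: Dict[str, float],
                   scores_b: Dict[str, float],
                   weight_a: float = 0.5,
                   weight_b: float = 0.5) -> List[str]:
    """Rank documents by a weighted sum of two systems' relevance scores.

    Assumption: a document not returned by a system gets score 0.0 from it.
    """
    docs = set(scores_a) | set(scores_b)
    combined = {
        d: weight_a * scores_a.get(d, 0.0) + weight_b * scores_b.get(d, 0.0)
        for d in docs
    }
    # Sort by descending combined score; break ties by document id.
    return sorted(docs, key=lambda d: (-combined[d], d))


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at the rank of each relevant document."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical toy data: two systems scoring overlapping document sets.
scores_a = {"d1": 0.9, "d2": 0.4, "d3": 0.7}
scores_b = {"d2": 0.8, "d3": 0.6, "d4": 0.5}
relevant = {"d2", "d3"}

ranking = combine_linear(scores_a, scores_b)
print(ranking, average_precision(ranking, relevant))
```

In this toy case both relevant documents are scored by both systems, while each nonrelevant document is scored by only one, so the weighted sum pushes the relevant documents to the top of the combined ranking. This matches the intuition in the abstract about overlap of relevant versus nonrelevant documents, but it is a constructed example and not evidence for the paper's findings.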
Cite
Vogt, C. C., & Cottrell, G. W. (1998). Predicting the Performance of Linearly Combined IR Systems. In SIGIR 1998 - Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 190–196). Association for Computing Machinery, Inc. https://doi.org/10.1145/290941.290991