Retrieval evaluation is a central concern in information retrieval (e.g., web search), and the choice of metric needs to be considered carefully. In this paper, we propose a new method of measuring the stability and discrimination power of a metric, a problem first investigated by Buckley and Voorhees. The advantage of the proposed method is that it measures both aspects together in a systematic manner. Five metrics are tested in the study: average precision over all relevant documents, recall-level precision, normalized discounted cumulative gain, precision at 10 documents, and reciprocal rank. Experimental results show that normalized discounted cumulative gain performs best, followed by average precision over all relevant documents, recall-level precision, and precision at 10 documents, while reciprocal rank performs worst. © 2013 Springer-Verlag.
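The five metrics named in the abstract are all standard and can be computed from a single ranked list plus binary relevance judgments. The sketch below is illustrative only (the function name and toy data are not from the paper); it assumes binary relevance, which is a simplification for NDCG, since graded relevance is common in practice.

```python
import math

def ir_metrics(ranking, relevant, k=10):
    """Compute AP, R-precision, NDCG, P@k, and RR for one query.

    ranking  -- list of document ids in ranked order
    relevant -- set of relevant document ids (binary relevance)
    """
    R = len(relevant)
    hits = 0
    ap_sum = 0.0   # running sum of precision at each relevant rank
    rr = 0.0       # reciprocal rank of the first relevant document
    dcg = 0.0      # discounted cumulative gain with log2 discount
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            ap_sum += hits / i
            if rr == 0.0:
                rr = 1.0 / i
            dcg += 1.0 / math.log2(i + 1)
    # Ideal DCG: all R relevant documents ranked at the top.
    idcg = sum(1.0 / math.log2(i + 1) for i in range(1, R + 1))
    return {
        "AP": ap_sum / R if R else 0.0,
        "R-prec": sum(1 for d in ranking[:R] if d in relevant) / R if R else 0.0,
        "NDCG": dcg / idcg if idcg else 0.0,
        f"P@{k}": sum(1 for d in ranking[:k] if d in relevant) / k,
        "RR": rr,
    }

# Toy example: four retrieved documents, two of them relevant.
print(ir_metrics([1, 2, 3, 4], {1, 3}))
```

Each metric rewards early-ranked relevant documents differently, which is exactly why their stability and discrimination power can diverge in the experiments the abstract summarizes.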
CITATION STYLE
Shi, H., Tan, Y., Zhu, X., & Wu, S. (2013). Measuring stability and discrimination power of metrics in information retrieval evaluation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8206 LNCS, pp. 8–15). https://doi.org/10.1007/978-3-642-41278-3_2