This paper presents a statistical model that interprets the evaluation of ranking methods as a random experiment. This model predicts the variability of evaluation results, so that appropriate significance tests for the results can be derived. The paper concludes with an empirical validation of the model on a collocation extraction task.
CITATION STYLE
Evert, S. (2004). Significance tests for the evaluation of ranking methods. In COLING 2004 - Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220355.1220491
Mendeley helps you to discover research relevant for your work.