The NVI clustering evaluation measure

Abstract

Clustering is crucial for many NLP tasks and applications. However, evaluating the results of a clustering algorithm is hard. In this paper we focus on the evaluation setting in which a gold standard solution is available. We discuss two existing information-theoretic measures, V and VI, and show that both are hard to use when comparing performance across different algorithms and different datasets. The V measure favors solutions having a large number of clusters, while the range of scores given by VI depends on the size of the dataset. We present a new measure, NVI, which normalizes VI to address the latter problem. We demonstrate the superiority of NVI in a large experiment involving an important NLP application, grammar induction, using real corpus data in English, German and Chinese. © 2009 Association for Computational Linguistics.
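As a rough illustration of the measures named in the abstract, the sketch below computes VI (variation of information) and a normalized variant for two flat label sequences. The abstract does not give the NVI formula. Dividing VI by the entropy of the gold standard classes, with a fallback to the entropy of the induced clustering when the gold entropy is zero, is an assumption made here for illustration and should be checked against the paper. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def variation_of_information(gold, pred):
    """VI = H(C|K) + H(K|C), computed as 2*H(C,K) - H(C) - H(K)."""
    joint = entropy(list(zip(gold, pred)))
    return 2 * joint - entropy(gold) - entropy(pred)


def nvi(gold, pred):
    """Normalized VI. ASSUMPTION: normalize by the gold entropy H(C);
    if H(C) is zero, fall back to H(K) of the induced clustering."""
    h_gold = entropy(gold)
    if h_gold == 0:
        return entropy(pred)
    return variation_of_information(gold, pred) / h_gold


gold = ["a", "a", "b", "b"]
perfect = [1, 1, 0, 0]      # same partition, different cluster names
one_cluster = [0, 0, 0, 0]  # everything placed in a single cluster

print(nvi(gold, perfect))
print(nvi(gold, one_cluster))
```

Under this assumed normalization, a clustering identical to the gold standard scores 0 and the single-cluster solution scores 1, regardless of dataset size, whereas the raw VI value of the single-cluster solution grows with the entropy of the gold classes.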

Cite

Reichart, R., & Rappoport, A. (2009). The NVI clustering evaluation measure. In CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning (pp. 165–173). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1596374.1596401
