We have found that the nearest neighbor (NN) test is an insufficient measure of the cluster hypothesis. The NN test is a local measure of the cluster hypothesis. Designers of new document-to-document similarity measures may incorrectly report effective clustering of relevant documents if they use the NN test alone. Utilizing a measure from network analysis, we present a new, global measure of the cluster hypothesis: normalized mean reciprocal distance. When used together with a local measure, such as the NN test, this new global measure allows researchers to better measure the cluster hypothesis. © 2009 Springer Berlin Heidelberg.
CITATION STYLE
Smucker, M. D., & Allan, J. (2009). A new measure of the cluster hypothesis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5766 LNCS, pp. 281–288). https://doi.org/10.1007/978-3-642-04417-5_27
Mendeley helps you to discover research relevant for your work.