An automatic validation of hierarchical clustering based on resampling techniques is recommended; it can be considered a three-level assessment of stability. The first and most general level is the decision about the appropriate number of clusters, which is based on measures of correspondence between partitions such as the adjusted Rand index. At the second level, the stability of each individual cluster is assessed using measures of similarity between sets, such as the Jaccard coefficient. At the third and most detailed level, the reliability of the cluster membership of each individual observation can be assessed. The built-in validation is demonstrated on the wine data set from the UCI repository, for which both the number of clusters and the class membership are known beforehand.
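The three levels can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the abstract does not give the exact procedure, so the function names (`partition_agreement`, `cluster_stability`, `membership_reliability`), the rule of matching each reference cluster to its best Jaccard counterpart, and the definition of reliability as the fraction of resamples in which an observation lands in that counterpart are all assumptions made for illustration. The partitions are assumed to come from running a hierarchical clustering on the full data (`reference`) and on each resample; the clustering step itself is left out.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same observations."""
    n = len(a)
    if n < 2:
        return 1.0
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # both partitions are trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def jaccard(s, t):
    """Jaccard coefficient between two sets of observation ids."""
    union = s | t
    return len(s & t) / len(union) if union else 1.0


def clusters_of(labels):
    """Group observation ids by cluster label: {label: set(ids)}."""
    groups = {}
    for obs, lab in labels.items():
        groups.setdefault(lab, set()).add(obs)
    return groups


def restrict(reference, resampled):
    """Reference labels limited to the observations present in the resample."""
    return {obs: lab for obs, lab in reference.items() if obs in resampled}


def partition_agreement(reference, resampled):
    """Level 1: ARI between the reference and a resampled partition."""
    obs = sorted(resampled)
    return adjusted_rand_index([reference[o] for o in obs],
                               [resampled[o] for o in obs])


def cluster_stability(reference, resampled):
    """Level 2: best Jaccard match of each reference cluster (assumed rule)."""
    res_groups = clusters_of(resampled).values()
    return {lab: max(jaccard(members, g) for g in res_groups)
            for lab, members in clusters_of(restrict(reference, resampled)).items()}


def membership_reliability(reference, resamples):
    """Level 3: share of resamples in which an observation falls into the
    best-matching counterpart of its reference cluster (assumed definition)."""
    hits, seen = Counter(), Counter()
    for resampled in resamples:
        res_groups = list(clusters_of(resampled).values())
        for members in clusters_of(restrict(reference, resampled)).values():
            best = max(res_groups, key=lambda g: jaccard(members, g))
            for obs in members:
                seen[obs] += 1
                hits[obs] += obs in best
    return {obs: hits[obs] / seen[obs] for obs in seen}


# Toy data (hypothetical): observation id -> cluster label.
reference = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
resamples = [
    {0: 1, 1: 1, 2: 1, 3: 2, 4: 2},        # observation 5 not drawn
    {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 2},  # observation 2 switches cluster
]

for r in resamples:
    print(partition_agreement(reference, r), cluster_stability(reference, r))
print(membership_reliability(reference, resamples))
```

In a full run, level 1 would be repeated for several candidate numbers of clusters and the agreement values averaged over many resamples, so that the number of clusters with the most stable partitions can be chosen before the cluster-level and observation-level results are inspected.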
Mucha, H. J. (2007). On validation of hierarchical clustering. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 115–122). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-540-70981-7_14