Abstract
We improve instability-based methods for selecting the number of clusters k in cluster analysis by developing a corrected clustering distance that removes the unwanted influence of the cluster-size distribution on cluster instability. We show that our corrected instability measure outperforms current instability-based measures across the whole sequence of possible k, overcoming limitations of current instability-based methods for large k. We also compare, for the first time, model-based and model-free approaches to determining cluster instability and find their performance to be comparable. We make our method available in the R package cstab.
Haslbeck, J. M. B., & Wulff, D. U. (2020). Estimating the number of clusters via a corrected clustering instability. Computational Statistics, 35(4), 1879–1894. https://doi.org/10.1007/s00180-020-00981-5