How can the investigation of the hidden structure in a proximity matrix be condensed and aligned more closely with domain expert knowledge? A data matrix that combines different statistical measures of the proximity matrix under investigation is proposed and evaluated. To validate the outcome and to measure how well this requirement is met, the original matrix and the compound matrix have to be compared. The comparison applies the algorithms for determining cluster stability introduced in Mucha and Haimerl (2005): a simulation algorithm finds the best number of clusters, calculates the stability of the clusters found by hierarchical cluster analysis, and, at the most detailed level, calculates the rate of recovery with which an element is reassigned to the same cluster in successive classifications of bootstrap samples. Both the cluster stability and the consistency of the clustering results with expert expectations demonstrate the advantage of the compound matrix over the commonly used proximity matrix. © Springer-Verlag Berlin Heidelberg 2010.
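The bootstrap-based stability check described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the linkage method, the similarity measure, or the resampling details, so average linkage, the Jaccard index, and plain resampling with replacement are assumptions made here for illustration, and the names `average_linkage`, `jaccard`, and `bootstrap_stability` are hypothetical.

```python
import random


def average_linkage(dist, items, k):
    """Agglomerative clustering (average linkage, an assumed choice) cut at k clusters.

    dist is a full symmetric distance matrix; items is a list of row indices
    and may contain repeats, as produced by bootstrap resampling.
    """
    clusters = [[i] for i in items]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [set(c) for c in clusters]


def jaccard(a, b):
    """Jaccard index of two sets (assumed here as the similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bootstrap_stability(dist, k, n_boot=50, seed=0):
    """Return reference clusters, per-cluster stability, per-element recovery rate."""
    rng = random.Random(seed)
    n = len(dist)
    reference = average_linkage(dist, list(range(n)), k)
    cluster_scores = [[] for _ in reference]
    hits = [0] * n   # times an element landed in the cluster matching its own
    seen = [0] * n   # times an element appeared in a bootstrap sample
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        present = set(sample)
        boot = average_linkage(dist, sample, k)
        for c, ref in enumerate(reference):
            ref_in_sample = ref & present
            if not ref_in_sample:
                continue  # this reference cluster is absent from the sample
            match = max(boot, key=lambda b: jaccard(ref_in_sample, b))
            cluster_scores[c].append(jaccard(ref_in_sample, match))
            for e in ref_in_sample:
                seen[e] += 1
                hits[e] += e in match
    stability = [sum(s) / len(s) if s else float("nan") for s in cluster_scores]
    recovery = [hits[e] / seen[e] if seen[e] else float("nan") for e in range(n)]
    return reference, stability, recovery


# Toy data (invented for illustration): two well-separated groups on a line.
points = [0, 1, 2, 3, 20, 21, 22, 23]
dist = [[abs(p - q) for q in points] for p in points]
reference, stability, recovery = bootstrap_stability(dist, k=2)
print(reference, stability, recovery)
```

Under these assumptions, running the same procedure for several values of `k` and comparing the average stability is one plausible way to pick the number of clusters, and running it on two different matrices for the same elements gives a like-for-like comparison of their clustering stability.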
CITATION STYLE
Haimerl, E., & Mucha, H. J. (2010). Comparing the stability of clustering results of dialect data based on several distance matrices. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 665–672). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-642-10745-0_73