Comparing the stability of clustering results of dialect data based on several distance matrices

Edgar Haimerl; Hans Joachim Mucha

Conference Proceedings

Studies in Classification, Data Analysis, and Knowledge Organization (2010) 665-672

DOI: 10.1007/978-3-642-10745-0_73

Abstract

How can the investigation of the hidden structure in a proximity matrix be condensed and targeted more towards domain expert knowledge? A data matrix that combines different statistical measures of the proximity matrix under investigation is proposed and evaluated. To validate the outcome and to measure how well this requirement is met, the original and the compound matrix have to be compared. This is done by applying the algorithms for determining cluster stability introduced in Mucha and Haimerl (2005): a simulation algorithm finds the best number of clusters, calculates the stability of the clusters found in hierarchical cluster analysis, and, at the most detailed level, calculates the rate of recovery with which an element is reassigned to the same cluster in successive classifications of bootstrap samples. Both the cluster stability and the consistency of the clustering results with expert expectations demonstrate the advantage of the compound matrix over the generally used proximity matrix. © Springer-Verlag Berlin Heidelberg 2010.
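The three levels of stability named in the abstract (number of clusters, whole clusters, single elements) can be illustrated with a small sketch. It is not the authors' implementation: average linkage, the Jaccard coefficient as the agreement measure, the skipping of bootstrap samples that miss a reference cluster, and all function names (`average_linkage`, `bootstrap_stability`, `best_number_of_clusters`) are assumptions chosen to keep the example short and dependency-free.

```python
import random

def average_linkage(dist, items, k):
    """Merge the two closest clusters (average linkage) until k remain."""
    clusters = [[i] for i in items]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return [set(c) for c in clusters]

def jaccard(a, b):
    """Jaccard coefficient of two non-empty sets."""
    return len(a & b) / len(a | b)

def bootstrap_stability(dist, k, n_boot=100, seed=0):
    """Return (reference clusters, per-cluster stability, per-element recovery)."""
    rng = random.Random(seed)
    n = len(dist)
    reference = average_linkage(dist, range(n), k)
    jac_sum, used = [0.0] * k, 0
    hits, draws = [0] * n, [0] * n
    for _ in range(n_boot):
        # Assumption: duplicates in the bootstrap sample are collapsed.
        sample = {rng.randrange(n) for _ in range(n)}
        restricted = [ref & sample for ref in reference]
        if not all(restricted):
            continue  # assumption: skip samples that miss a reference cluster
        boot = average_linkage(dist, sorted(sample), k)
        used += 1
        for c, ref in enumerate(restricted):
            match = max(boot, key=lambda cl: jaccard(ref, cl))
            jac_sum[c] += jaccard(ref, match)
            for i in ref:
                draws[i] += 1
                hits[i] += int(i in match)  # element recovered in its cluster
    stability = [s / used if used else 0.0 for s in jac_sum]
    recovery = [h / d if d else 0.0 for h, d in zip(hits, draws)]
    return reference, stability, recovery

def best_number_of_clusters(dist, k_max, **kwargs):
    """Pick the k in 2..k_max with the highest mean cluster stability."""
    def mean_stability(k):
        _, stability, _ = bootstrap_stability(dist, k, **kwargs)
        return sum(stability) / k
    return max(range(2, k_max + 1), key=mean_stability)

# Toy data (invented for illustration): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
reference, stability, recovery = bootstrap_stability(dist, k=2)
best_k = best_number_of_clusters(dist, k_max=3)
```

Running the same functions on two distance matrices over the same elements, for example an original proximity matrix and distances derived from a compound matrix, would give comparable stability and recovery figures, which is the kind of comparison the abstract describes.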

Cite (APA)

Haimerl, E., & Mucha, H. J. (2010). Comparing the stability of clustering results of dialect data based on several distance matrices. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 665–672). Springer. https://doi.org/10.1007/978-3-642-10745-0_73