Abstract
Hierarchical Clustering (HC) is a widely studied problem in exploratory data analysis, usually tackled by simple agglomerative procedures such as average-linkage, single-linkage, or complete-linkage. In this paper we focus on two objectives, introduced recently to give insight into the performance of average-linkage clustering: a similarity-based HC objective proposed by [21] and a dissimilarity-based HC objective proposed by [9]. In both cases, we present tight counterexamples showing that average-linkage cannot obtain better than 1/3- and 2/3-approximations, respectively (in the worst case), settling an open question raised in [21]. This matches the approximation ratio of a random solution, raising a natural question: can we beat average-linkage for these objectives? We answer this in the affirmative, giving two new algorithms based on semidefinite programming with provably better guarantees.
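For readers unfamiliar with the baseline the paper analyzes, the following is a minimal sketch of average-linkage agglomerative clustering on a similarity matrix. It is illustrative only: the function name and representation are our own, and it is the greedy baseline, not the paper's SDP-based algorithms.

```python
# Minimal sketch of average-linkage agglomerative clustering (illustrative;
# this is the greedy baseline, not the paper's SDP-based algorithms).
import itertools

def average_linkage(sim):
    """Greedily merge the pair of clusters with the highest average pairwise
    similarity until one cluster remains; returns the merge tree as nested
    tuples over the leaf indices."""
    n = len(sim)
    clusters = {i: [i] for i in range(n)}   # cluster id -> member points
    tree = {i: i for i in range(n)}         # leaves; internal nodes are tuples
    next_id = n
    while len(clusters) > 1:
        # average similarity between two current clusters
        def avg(a, b):
            pa, pb = clusters[a], clusters[b]
            return sum(sim[i][j] for i in pa for j in pb) / (len(pa) * len(pb))
        a, b = max(itertools.combinations(clusters, 2), key=lambda p: avg(*p))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        tree[next_id] = (tree.pop(a), tree.pop(b))
        next_id += 1
    return tree[next_id - 1]
```

The paper's counterexamples show that this greedy rule can be a 1/3 (similarity) or 2/3 (dissimilarity) approximation in the worst case, no better than a uniformly random binary tree.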
Citation
Charikar, M., Chatziafratis, V., & Niazadeh, R. (2019). Hierarchical clustering better than average-linkage. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 2291–2304). Association for Computing Machinery. https://doi.org/10.1137/1.9781611975482.139