Explainable k-means: don't be greedy, plant bigger trees!

Abstract

We provide a new bi-criteria Õ(log² k)-competitive algorithm for explainable k-means clustering. Explainable k-means was recently introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). It is described by an easy to interpret and understand (threshold) decision tree or diagram. The cost of an explainable k-means clustering equals the sum of the costs of its clusters, and the cost of each cluster equals the sum of squared distances from the points in the cluster to the center of that cluster. The best non-bi-criteria algorithm for explainable clustering is Õ(k)-competitive, and this bound is tight. Our randomized bi-criteria algorithm constructs a threshold decision tree that partitions the data set into (1+δ)k clusters (where δ ∈ (0,1) is a parameter of the algorithm). The cost of this clustering is at most Õ(1/δ · log² k) times the cost of the optimal unconstrained k-means clustering. We show that this bound is almost optimal.
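To make the cost definition in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it only shows how a given threshold decision tree assigns points to leaves and how the resulting k-means cost is computed, taking each cluster's center to be the mean of its points. The tuple-based tree encoding, the function names `assign_leaf` and `kmeans_cost`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def assign_leaf(point, tree):
    """Walk a threshold tree and return the leaf label for a point.

    Hypothetical encoding: a node is either a leaf label (an int) or a tuple
    (feature_index, threshold, left_subtree, right_subtree).
    """
    node = tree
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        # Each internal node compares a single coordinate with a threshold.
        node = left if point[feature] <= threshold else right
    return node


def kmeans_cost(points, labels):
    """Sum over clusters of squared distances to the cluster mean."""
    clusters = defaultdict(list)
    for p, label in zip(points, labels):
        clusters[label].append(p)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Assumption: the center of a cluster is the mean of its points.
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - center[d]) ** 2 for p in members for d in range(dim))
    return total


# Toy data: four points in the plane, split once on coordinate 0 at 2.0.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
tree = (0, 2.0, 0, 1)

labels = [assign_leaf(p, tree) for p in points]
cost = kmeans_cost(points, labels)
```

In this toy example the single threshold cut yields two clusters with means (0, 1) and (4, 1), so each cluster contributes a cost of 2 and the total is 4. A competitive ratio in the abstract's sense compares such a tree-induced cost with the cost of the best unconstrained k-means clustering.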

Makarychev, K., & Shan, L. (2022). Explainable k-means: don’t be greedy, plant bigger trees! In Proceedings of the Annual ACM Symposium on Theory of Computing (pp. 1629–1642). Association for Computing Machinery. https://doi.org/10.1145/3519935.3520056
