Abstract
Recent years have seen a renewed interest in interpretable machine learning, which seeks insight into how a model achieves a prediction. Here, we focus on the relatively unexplored case of interpretable clustering. In our approach, the cluster assignments of the training instances are constrained to be the output of a decision tree. This has two advantages: 1) it makes it possible to understand globally how an instance is mapped to a cluster, in particular to see which features are used for which cluster; 2) it forces the clusters to respect a hierarchical structure while optimizing the original clustering objective function. Rather than traditional axis-aligned trees, we use sparse oblique trees, which have far more modelling power, particularly with high-dimensional data, while remaining interpretable. Our approach applies to any clustering method that is defined by optimizing a cost function, and we demonstrate it with two k-means variants.
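To make the idea concrete, here is a minimal sketch of the naive two-step baseline (cluster first, then fit a tree to the resulting labels) using scikit-learn. This is not the paper's method: the paper jointly optimizes the tree against the clustering objective and uses sparse oblique splits, whereas this snippet uses an ordinary axis-aligned tree fitted after the fact. All data and names below are illustrative.

```python
# Rough illustration only (NOT the paper's algorithm): approximate
# tree-constrained clustering with the naive two-step baseline --
# run k-means, then fit a decision tree to predict the cluster labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))  # toy data: 300 instances, 5 features

k = 3
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Axis-aligned tree as a stand-in for the paper's sparse oblique tree.
# Each leaf corresponds to one cluster, so the assignment is the tree output.
tree = DecisionTreeClassifier(max_leaf_nodes=k, random_state=0).fit(X, labels)

# The tree is the interpretable artifact: each root-to-leaf path states
# which features and thresholds define a cluster.
print(export_text(tree, feature_names=[f"x{i}" for i in range(5)]))
```

In the paper's setting the tree itself defines the cluster assignments during optimization, so the clustering objective and the tree are fitted together rather than sequentially as above.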
Citation
Gabidolla, M., & Carreira-Perpiñán, M. A. (2022). Optimal Interpretable Clustering Using Oblique Decision Trees. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 400–410). Association for Computing Machinery. https://doi.org/10.1145/3534678.3539361