Abstract
Recent years have seen a renewed interest in interpretable machine learning, which seeks insight into how a model achieves a prediction. Here, we focus on the relatively unexplored case of interpretable clustering. In our approach, the cluster assignments of the training instances are constrained to be the output of a decision tree. This has two advantages: 1) it makes it possible to understand globally how an instance is mapped to a cluster, in particular to see which features are used for which cluster; 2) it forces the clusters to respect a hierarchical structure while optimizing the original clustering objective function. Rather than traditional axis-aligned trees, we use sparse oblique trees, which have far more modelling power, particularly with high-dimensional data, while remaining interpretable. Our approach applies to any clustering method that is defined by optimizing a cost function, and we demonstrate it with two k-means variants.
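To make the idea concrete, here is a minimal sketch of the naive two-step baseline (cluster first, then fit a tree to the resulting labels) using scikit-learn. This is not the paper's method: the paper jointly optimizes the tree against the clustering objective and uses sparse oblique splits, whereas this snippet uses an ordinary axis-aligned tree fitted after the fact. All data and names below are illustrative.

```python
# Rough illustration only (NOT the paper's algorithm): approximate
# tree-constrained clustering with the naive two-step baseline --
# run k-means, then fit a decision tree to predict the cluster labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))  # toy data: 300 instances, 5 features

k = 3
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Axis-aligned tree as a stand-in for the paper's sparse oblique tree.
# Each leaf corresponds to one cluster, so the assignment is the tree output.
tree = DecisionTreeClassifier(max_leaf_nodes=k, random_state=0).fit(X, labels)

# The tree is the interpretable artifact: each root-to-leaf path states
# which features and thresholds define a cluster.
print(export_text(tree, feature_names=[f"x{i}" for i in range(5)]))
```

In the paper's setting the tree itself defines the cluster assignments during optimization, so the clustering objective and the tree are fitted together rather than sequentially as above.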
Citation
Gabidolla, M., & Carreira-Perpiñán, M. A. (2022). Optimal Interpretable Clustering Using Oblique Decision Trees. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 400–410). Association for Computing Machinery. https://doi.org/10.1145/3534678.3539361