Constraint-based clustering selection

Toon Van Craenendonck; Hendrik Blockeel

Journal Article (Open Access)

Machine Learning (2017) 106(9-10) 1497-1521

DOI: 10.1007/s10994-017-5643-7

23 Citations

55 Readers

Abstract

Clustering requires the user to define a distance metric, select a clustering algorithm, and set the hyperparameters of that algorithm. Getting these right, so that the resulting clustering meets the user's subjective criteria, can be difficult and tedious. Semi-supervised clustering methods make this easier by letting the user provide must-link or cannot-link constraints. These are then used to automatically tune the similarity measure and/or the optimization criterion. In this paper, we investigate a complementary way of using the constraints: they are used to select an unsupervised clustering method and tune its hyperparameters. It turns out that this very simple approach outperforms all existing semi-supervised methods. This implies that choosing the right algorithm and hyperparameter values is more important than modifying an individual algorithm to take constraints into account. In addition, the proposed approach allows for active constraint selection in a more effective manner than other methods.
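
The selection idea described in the abstract can be illustrated with a minimal sketch. It assumes that several candidate clusterings have already been produced by unsupervised algorithms under different hyperparameter settings, and it picks the candidate that satisfies the most must-link and cannot-link constraints. The scoring rule (a plain count of satisfied constraints), the function names, and the candidate labels below are illustrative assumptions; the abstract does not specify the exact procedure used in the paper.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data points


def satisfied_constraints(
    labels: Sequence[int],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> int:
    """Count the pairwise constraints that a clustering satisfies."""
    # A must-link constraint holds when both points share a cluster label.
    ml = sum(1 for i, j in must_link if labels[i] == labels[j])
    # A cannot-link constraint holds when the points have different labels.
    cl = sum(1 for i, j in cannot_link if labels[i] != labels[j])
    return ml + cl


def select_clustering(
    candidates: Dict[str, Sequence[int]],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> Tuple[str, int]:
    """Return the name and score of the candidate satisfying most constraints.

    Ties are broken in favour of the candidate listed first.
    """
    best = max(
        candidates,
        key=lambda name: satisfied_constraints(
            candidates[name], must_link, cannot_link
        ),
    )
    return best, satisfied_constraints(candidates[best], must_link, cannot_link)


# Hypothetical candidates: cluster labels for six points, as they might be
# produced by different algorithm/hyperparameter combinations.
candidates = {
    "kmeans_k2": [0, 0, 0, 1, 1, 1],
    "kmeans_k3": [0, 0, 1, 1, 2, 2],
    "single_cluster": [0, 0, 0, 0, 0, 0],
}
must_link = [(0, 1), (3, 4)]
cannot_link = [(0, 5), (2, 3)]

best_name, best_score = select_clustering(candidates, must_link, cannot_link)
print(best_name, best_score)  # kmeans_k2 4
```

Because the candidates are scored only after clustering, any unsupervised algorithm can be included in the pool without modification, which is what distinguishes this use of constraints from tuning a similarity measure or optimization criterion.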

Cite

Van Craenendonck, T., & Blockeel, H. (2017). Constraint-based clustering selection. Machine Learning, 106(9–10), 1497–1521. https://doi.org/10.1007/s10994-017-5643-7
