Constraint-based clustering in large databases


Abstract

Constrained clustering, that is, finding clusters that satisfy user-specified constraints, is highly desirable in many applications. In this paper, we introduce the constrained clustering problem and show that traditional clustering algorithms (e.g., k-means) cannot handle it. A scalable constraint-clustering algorithm is developed in this study which starts by finding an initial solution that satisfies user-specified constraints and then refines the solution by performing confined object movements under constraints. Our algorithm consists of two phases: pivot movement and deadlock resolution. For both phases, we show that finding the optimal solution is NP-hard. We then propose several heuristics and show how our algorithm can scale up for large data sets using the heuristic of micro-cluster sharing. By experiments, we show the effectiveness and efficiency of the heuristics.
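The two-step idea in the abstract (start from a solution that satisfies the constraints, then refine it with moves that keep it feasible) can be illustrated with a small sketch. This is not the authors' algorithm. It assumes, purely for illustration, that the constraint is a minimum cluster size, and the names `constrained_cluster`, `min_size`, and `iters` are hypothetical. It simply skips any move that would break the constraint, and does not attempt pivot movement, deadlock resolution, or micro-cluster sharing.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def constrained_cluster(points, k, min_size, iters=20, seed=0):
    """Illustrative only: k clusters, each keeping at least min_size points.

    The minimum-size constraint is an assumption made for this sketch;
    the abstract does not say which constraints the paper handles.
    """
    assert min_size >= 1 and len(points) >= k * min_size
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)

    # Step 1: feasible initial solution. Round-robin assignment gives every
    # cluster at least floor(n / k) >= min_size points.
    labels = [0] * len(points)
    for rank, idx in enumerate(order):
        labels[idx] = rank % k
    sizes = [labels.count(c) for c in range(k)]

    # Step 2: refinement. Move a point to its nearest center only if the
    # cluster it leaves still satisfies the constraint afterwards.
    for _ in range(iters):
        centers = [
            centroid([p for p, lab in zip(points, labels) if lab == c])
            for c in range(k)
        ]
        moved = False
        for i, p in enumerate(points):
            src = labels[i]
            dst = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            if dst != src and sizes[src] > min_size:
                labels[i] = dst
                sizes[src] -= 1
                sizes[dst] += 1
                moved = True
        if not moved:
            break
    return labels
```

Plain k-means has no equivalent of the `sizes[src] > min_size` check, so its reassignment step can shrink a cluster below any required size. Blocked moves of this kind are where a more careful method would need something like the deadlock resolution the abstract mentions.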

Cite


Tung, A. K. H., Han, J., Lakshmanan, L. V. S., & Ng, R. T. (2001). Constraint-based clustering in large databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1973, pp. 405–419). Springer Verlag. https://doi.org/10.1007/3-540-44503-x_26
