Constrained clustering-finding clusters that satisfy userspecified constraints|is highly desirable in many applications. In this paper, we introduce the constrained clustering problem and show that traditional clustering algorithms (e.g., k-means) cannot handle it. A scalable constraint-clustering algorithm is developed in this study which starts by finding an initial solution that satisfies user-specified constraints and then refines the solution by performing confined object movements under constraints. Our algorithm consists of two phases: pivot movement and deadlock resolution. For both phases, we show that finding the optimal solution is NP-hard. We then propose several heuristics and show how our algorithm can scale up for large data sets using the heuristic of micro-cluster sharing. By experiments, we show the effectiveness and effciency of the heuristics.
CITATION STYLE
Tung, A. K. H., Han, J., Lakshmanan, L. V. S., & Ng, R. T. (2001). Constraint-based clustering in large databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1973, pp. 405–419). Springer Verlag. https://doi.org/10.1007/3-540-44503-x_26
Mendeley helps you to discover research relevant for your work.