We classify clustering algorithms into sequence-based techniques, which transform the object net into a linear sequence, and partition-based clustering algorithms. Tsangaris and Naughton [TN91, TN92] have shown that the partition-based techniques are superior. However, their work is based on a single partitioning algorithm, the Kernighan and Lin heuristic, which is not applicable to realistically large object bases because of its high running-time complexity. The contribution of this paper is two-fold: (1) We devise a new class of greedy object graph partitioning algorithms (GGP) whose running-time complexity is moderate while still yielding good-quality results. (2) Our extensive quantitative analysis of all well-known partitioning algorithms indicates that no single algorithm performs best for all object net characteristics. Therefore, we propose an adaptable clustering strategy based on a multi-dimensional grid: the dimensions correspond to particular characteristics of the object base (given by, e.g., number and size of objects and degree of object sharing), and the grid entries indicate the most suitable clustering algorithm for the particular configuration.
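The abstract does not spell out how the greedy partitioning works, so the following is only a minimal sketch of one plausible greedy scheme, not the authors' GGP algorithm. It assumes that objects have sizes, that weighted edges express how strongly two objects are related, and that a partition must fit on a page of fixed capacity. The function name `greedy_partition` and the parameters `sizes`, `edges`, and `page_size` are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable, float]


def greedy_partition(
    sizes: Dict[Hashable, int],
    edges: Iterable[Edge],
    page_size: int,
) -> List[Set[Hashable]]:
    """Greedily merge partitions along the heaviest edges (illustrative only).

    Assumption: every object starts in its own partition, and two partitions
    are merged only if their combined size still fits on one page.
    """
    part_of = {obj: obj for obj in sizes}      # object -> partition id
    members = {obj: {obj} for obj in sizes}    # partition id -> objects
    load = dict(sizes)                         # partition id -> total size

    # Visit edges from heaviest to lightest.
    for u, v, _weight in sorted(edges, key=lambda e: e[2], reverse=True):
        pu, pv = part_of[u], part_of[v]
        if pu == pv:
            continue  # already clustered together
        if load[pu] + load[pv] > page_size:
            continue  # merged partition would overflow the page
        # Merge the smaller partition into the larger one.
        if len(members[pu]) < len(members[pv]):
            pu, pv = pv, pu
        for obj in members[pv]:
            part_of[obj] = pu
        members[pu] |= members.pop(pv)
        load[pu] += load.pop(pv)

    return list(members.values())
```

For example, with four objects of size 40, a page size of 100, and edges `("a", "b", 10)`, `("c", "d", 9)`, `("b", "c", 8)`, the sketch groups `a` with `b` and `c` with `d`; the `b`-`c` edge is skipped because the merged partition would exceed the page. Sorting the edges dominates the cost, which is in line with the moderate running-time complexity the abstract claims for greedy approaches.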
CITATION STYLE
Gerlhof, C., Kemper, A., Kilger, C., & Moerkotte, G. (1993). Partition-based clustering in object bases: From theory to practice. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 730 LNCS, pp. 301–316). Springer Verlag. https://doi.org/10.1007/3-540-57301-1_20