TI-DBSCAN: Clustering with DBSCAN by means of the triangle inequality

Marzena Kryszkiewicz; Piotr Lasek

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6086 LNAI 60-69

DOI: 10.1007/978-3-642-13529-3_8

47 Citations

14 Readers

Abstract

Grouping data into meaningful clusters is an important data mining task. DBSCAN is recognized as a high quality density-based algorithm for clustering data. It enables both the determination of clusters of any shape and the identification of noise in data. The most time-consuming operation in DBSCAN is the calculation of a neighborhood for each data point. In order to speed up this operation in DBSCAN, the neighborhood calculation is expected to be supported by spatial access methods. DBSCAN, nevertheless, is not efficient in the case of high dimensional data. In this paper, we propose a new efficient TI-DBSCAN algorithm and its variant TI-DBSCAN-REF that apply the same clustering methodology as DBSCAN. Unlike DBSCAN, TI-DBSCAN and TI-DBSCAN-REF do not use spatial indices; instead they use the triangle inequality property to quickly reduce the neighborhood search space. The experimental results prove that the new algorithms are up to three orders of magnitude faster than DBSCAN, and efficiently cluster both low and high dimensional data. © 2010 Springer-Verlag Berlin Heidelberg.
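The pruning idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Euclidean distance and a single reference point, and the names `build_index` and `eps_neighborhood` are hypothetical. By the triangle inequality, |dist(p, r) - dist(q, r)| <= dist(p, q), so any point whose distance to the reference point r differs from that of q by more than eps cannot lie in the eps-neighborhood of q. Sorting points by their distance to r therefore narrows the candidates to a contiguous window, and only those candidates need an exact distance check.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, ref):
    """Sort point indices by distance to the reference point `ref`.

    Returns the sorted distances and the matching point indices.
    `ref` is an assumption of this sketch; any fixed point works.
    """
    keyed = sorted((dist(p, ref), i) for i, p in enumerate(points))
    ref_dists = [d for d, _ in keyed]
    order = [i for _, i in keyed]
    return ref_dists, order


def eps_neighborhood(points, ref_dists, order, ref, query_idx, eps):
    """Return sorted indices of points within `eps` of points[query_idx].

    The query point itself is included in its own neighborhood.
    """
    q = points[query_idx]
    dq = dist(q, ref)
    # Triangle inequality: only points whose distance to `ref` lies in
    # [dq - eps, dq + eps] can be within eps of q.
    lo = bisect.bisect_left(ref_dists, dq - eps)
    hi = bisect.bisect_right(ref_dists, dq + eps)
    # Verify the surviving candidates with an exact distance computation.
    return sorted(i for i in order[lo:hi] if dist(points[i], q) <= eps)


points = [(0, 0), (1, 0), (0, 1), (5, 5), (5, 6), (10, 10)]
ref = (0, 0)
ref_dists, order = build_index(points, ref)
print(eps_neighborhood(points, ref_dists, order, ref, 3, 1.5))
```

The window only filters candidates; the exact check in the last step is still required, because points at a similar distance from the reference point can be far apart from each other.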

Cite (APA style)

Kryszkiewicz, M., & Lasek, P. (2010). TI-DBSCAN: Clustering with DBSCAN by means of the triangle inequality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6086 LNAI, pp. 60–69). https://doi.org/10.1007/978-3-642-13529-3_8
