Fast Density-Peaks Clustering: Multicore-based Parallelization Approach

Daichi Amagata; Takahiro Hara

Conference Proceedings, Open Access

Proceedings of the ACM SIGMOD International Conference on Management of Data (2021) 49-61

DOI: 10.1145/3448016.3452781

22 Citations

7 Readers

Abstract

Clustering multi-dimensional points is a fundamental task in many fields, and density-based clustering supports many applications because it can discover clusters of arbitrary shapes. This paper addresses the problem of Density-Peaks Clustering (DPC), a recently proposed density-based clustering framework. Although DPC already has many applications, its straightforward implementation takes time quadratic in the number of points in a given dataset and therefore does not scale to large datasets. To enable DPC on large datasets, we propose efficient algorithms for DPC: an exact algorithm, Ex-DPC, and two approximation algorithms, Approx-DPC and S-Approx-DPC. Under a reasonable assumption about a DPC parameter, our algorithms run in sub-quadratic time, i.e., they break the quadratic barrier. In addition, Approx-DPC requires no additional parameters and can return the same cluster centers as Ex-DPC, yielding an accurate clustering result. S-Approx-DPC requires an approximation parameter but can further reduce computation time. We further show that these algorithms can be accelerated by leveraging multicore processing. We conduct extensive experiments on synthetic and real datasets, and the results demonstrate that our algorithms are efficient, scalable, and accurate.
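To make the quadratic baseline concrete, the following is a minimal sketch of a straightforward DPC implementation as the framework is commonly described: each point gets a local density (the number of other points within a cutoff distance) and a dependent distance (the distance to its nearest point of higher density); points with high values of both become cluster centers, and every other point inherits the label of its nearest higher-density point. This is not Ex-DPC, Approx-DPC, or S-Approx-DPC, whose details the abstract does not give. The function name `naive_dpc`, the parameter names `d_cut`, `rho_min`, and `delta_min`, and the tie-breaking rule are assumptions made for illustration.

```python
import math

def naive_dpc(points, d_cut, rho_min, delta_min):
    """Illustrative quadratic-time DPC baseline (not the paper's algorithms)."""
    n = len(points)
    # All pairwise distances: this step alone is O(n^2) time and space.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points strictly within d_cut.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_cut)
           for i in range(n)]

    # Visit points by decreasing density; ties are broken by index
    # (an assumption, so that "higher density" is a strict total order).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # Dependent distance: distance to the nearest point earlier in the order.
    delta = [math.inf] * n
    parent = [-1] * n
    for pos, i in enumerate(order):
        for j in order[:pos]:
            if dist[i][j] < delta[i]:
                delta[i] = dist[i][j]
                parent[i] = j

    # Centers have high density and high dependent distance; other points
    # copy the label of their parent, which was already processed.
    labels = [-1] * n
    next_label = 0
    for i in order:
        if rho[i] >= rho_min and delta[i] >= delta_min:
            labels[i] = next_label
            next_label += 1
        elif parent[i] != -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels
```

Both the density pass and the dependent-distance pass compare every pair of points, which is the quadratic cost the abstract refers to. A label of -1 marks a point that could not be attached to any center under the chosen thresholds.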

Cite

Amagata, D., & Hara, T. (2021). Fast Density-Peaks Clustering: Multicore-based Parallelization Approach. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 49–61). Association for Computing Machinery. https://doi.org/10.1145/3448016.3452781
