Many applications require clustering data using an unsupervised approach. One such clustering algorithm is Dbscan, which is inherently sequential, thus limiting parallelization opportunities. Consequently, several recent works have proposed novel shared- and distributed-memory approaches for scaling Dbscan. We propose BPS-HDbscan, a shared-memory CPU/GPU approach that clusters on the billion-point scale. The major pillars of BPS-HDbscan are as follows: (i) distance calculation avoidance in dense data regions; (ii) efficient merging of subclusters; (iii) obviating limited GPU memory capacity by both batching the result set and partitioning the input dataset; and, (iv) computing data partitions in parallel, which effectively exploits both CPU and GPU resources. BPS-HDbscan is highly efficient, and to our knowledge, is the first shared-memory Dbscan algorithm to cluster on the billion point scale.
CITATION STYLE
Gowanlock, M. (2019). Hybrid CPU/GPU clustering in shared memory on the billion point scale. In Proceedings of the International Conference on Supercomputing (pp. 35–45). Association for Computing Machinery. https://doi.org/10.1145/3330345.3330349
Mendeley helps you to discover research relevant for your work.