In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU that requires only O(Nk) space, where the parameter k trades off space against time. We also propose a set of novel and reusable building blocks from which cuSLINK is composed. These building blocks include highly optimized computational patterns for k-NN graph construction, spanning trees, and dendrogram cluster extraction. We show how we used these primitives to implement cuSLINK end-to-end on the GPU, enabling a wide range of real-world data mining and machine learning applications that were once intractable. Beyond accelerating a primary computational bottleneck of the popular HDBSCAN algorithm, our end-to-end cuSLINK algorithm has impact across a large range of important applications, including cluster analysis in social and computer networks, natural language processing, and computer vision.
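The three building blocks named above (k-NN graph construction, spanning trees, and dendrogram cluster extraction) can be illustrated with a minimal CPU sketch in pure Python. This is not the cuSLINK implementation: it assumes a brute-force k-NN search, Kruskal's algorithm with a union-find structure, and a simple cut into a requested number of clusters, and it omits any step that would reconnect a disconnected k-NN graph. All function and class names here (`knn_edges`, `DisjointSet`, `single_linkage`) are hypothetical and chosen only for illustration.

```python
import math


def knn_edges(points, k):
    """Brute-force k-NN graph: at most N*k weighted edges, deduplicated."""
    edges = set()
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)  # ascending by distance


class DisjointSet:
    """Union-find with path halving (illustrative helper)."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[rb] = ra
        return True


def single_linkage(points, k, n_clusters):
    """Return (labels, merges) from a k-NN graph via a minimum spanning forest."""
    n = len(points)

    # Spanning tree step: Kruskal over the k-NN edges. The accepted edges,
    # in ascending order, are the merge order of the single-linkage dendrogram.
    forest = DisjointSet(n)
    merges = [(d, i, j) for d, i, j in knn_edges(points, k) if forest.union(i, j)]

    # Cluster extraction step: replay only the cheapest n - n_clusters merges.
    # If the k-NN graph is disconnected, fewer merges exist and the result
    # may contain more than n_clusters clusters.
    cut = DisjointSet(n)
    for _, i, j in merges[: n - n_clusters]:
        cut.union(i, j)

    # Relabel union-find roots as consecutive integers in order of appearance.
    roots = {}
    labels = [roots.setdefault(cut.find(i), len(roots)) for i in range(n)]
    return labels, merges


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, merges = single_linkage(pts, k=2, n_clusters=2)
    print(labels)
```

In this sketch, a small k keeps the edge list at O(Nk) entries but can leave the graph disconnected, while k = N - 1 recovers the full pairwise graph at quadratic cost, which is one way to read the space and time trade-off the abstract describes.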
Nolet, C. J., Gala, D., Fender, A., Doijade, M., Eaton, J., Raff, E., … Oates, T. (2023). cuSLINK: Single-Linkage Agglomerative Clustering on the GPU. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14169 LNAI, pp. 711–726). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43412-9_42