A distributed shared nearest neighbors clustering algorithm

N/ACitations
Citations of this article
6Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Current data processing tasks require efficient approaches capable of dealing with large databases. A promising strategy consists in distributing the data along several computers that partially solves the undertaken problem. Then, these partial answers are integrated in order to obtain a final solution. We introduce the Distributed Shared Nearest Neighbor based clustering algorithm (D-SNN) which is able to work with disjoint partitions of data producing a global clustering solution that achieves a competitive performance regarding centralized approaches. Our algorithm is suited for large scale problems (e.g, text clustering) where data cannot be handled by a single machine due to memory size constraints. Experimental results over five data sets show that our proposal is competitive in terms of standard clustering quality performance measures.

Cite

CITATION STYLE

APA

Zamora, J., Allende-Cid, H., & Mendoza, M. (2018). A distributed shared nearest neighbors clustering algorithm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10657 LNCS, pp. 710–718). Springer Verlag. https://doi.org/10.1007/978-3-319-75193-1_85

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free