VDMR-DBSCAN: Varied Density MapReduce DBSCAN

Surbhi Bhardwaj; Subrat Kumar Dash

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9498 134-150

DOI: 10.1007/978-3-319-27057-9_10

Abstract

DBSCAN is a well-known density-based clustering algorithm that can discover clusters of different shapes and sizes along with outliers. However, it suffers from major drawbacks: high computational cost, an inability to find clusters of varied density, and a dependency on user-provided input density parameters. To address these issues, we propose VDMR-DBSCAN (Varied Density MapReduce DBSCAN), a novel, scalable density-based clustering algorithm built on MapReduce that detects varied-density clusters and computes the input density parameters automatically. VDMR-DBSCAN divides the data into small partitions, which are processed in parallel on the Hadoop platform. Density variations within each partition are then analyzed statistically to divide the data into groups of similar density, called density level sets (DLS). Input density parameters are estimated for each DLS, and DBSCAN is then applied to each DLS using its corresponding parameters. Most importantly, we propose a novel merging technique that merges clusters of similar density lying in different partitions and produces meaningful, compact clusters of varied density. Experiments on large and small synthetic datasets confirm the efficacy of our algorithm in terms of scalability and its ability to find varied-density clusters.
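The per-partition steps named in the abstract (group points by density, estimate parameters per group, run DBSCAN per group) can be illustrated with a small single-machine Python sketch. This is not the authors' implementation. The k-dist grouping rule, the `jump` threshold, the choice of eps as the largest k-dist in a group, and all function names are assumptions made for illustration. The MapReduce partitioning and the cross-partition merging are omitted because the abstract does not specify them.

```python
import math

def kdist(points, k):
    """Distance from each point to its k-th nearest neighbour (brute force)."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return out

def density_level_sets(kd, jump=2.0):
    """Group point indices by similar k-dist.

    Assumed rule (not from the paper): walk the points in ascending k-dist
    order and start a new group when a k-dist exceeds `jump` times the
    smallest k-dist of the current group.
    """
    order = sorted(range(len(kd)), key=kd.__getitem__)
    levels = [[order[0]]]
    for i in order[1:]:
        if kd[i] > jump * kd[levels[-1][0]]:
            levels.append([i])
        else:
            levels[-1].append(i)
    return levels

def dbscan(points, idxs, eps, min_pts, labels, next_id):
    """Plain DBSCAN restricted to the indices in idxs; fills labels in place."""
    def neighbours(i):
        return [j for j in idxs if math.dist(points[i], points[j]) <= eps]

    for i in idxs:
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            continue  # not a core point; stays unlabelled unless reached later
        labels[i] = next_id
        while seeds:
            j = seeds.pop()
            if labels[j] is not None:
                continue
            labels[j] = next_id
            reach = neighbours(j)
            if len(reach) >= min_pts:  # only core points expand the cluster
                seeds.extend(reach)
        next_id += 1
    return next_id

def varied_density_dbscan(points, k=3, jump=2.0):
    """Run DBSCAN once per density group, densest group first."""
    kd = kdist(points, k)
    labels = [None] * len(points)  # None marks noise
    next_id = 0
    for level in density_level_sets(kd, jump):
        eps = max(kd[i] for i in level)  # assumed per-group eps estimate
        next_id = dbscan(points, level, eps, k + 1, labels, next_id)
    return labels

# A dense 3x3 grid, a sparse 3x3 grid, and one isolated point.
dense = [(x, y) for x in range(3) for y in range(3)]
sparse = [(1000 + 10 * x, 1000 + 10 * y) for x in range(3) for y in range(3)]
points = dense + sparse + [(500, 500)]
labels = varied_density_dbscan(points)
```

With a single global eps, a value loose enough to connect the sparse grid would be far too coarse for the dense one. Estimating eps per density group lets each grid form its own cluster while the isolated point stays unlabelled.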

Cite

Bhardwaj, S., & Dash, S. K. (2015). VDMR-DBSCAN: Varied density mapreduce DBSCAN. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9498, pp. 134–150). Springer Verlag. https://doi.org/10.1007/978-3-319-27057-9_10
