Parallel bat algorithm-based clustering using mapreduce

Tripathi Ashish; Sharma Kapil; Bala Manju

Book Chapter

Springer Science and Business Media Deutschland GmbH, (2018), 73-82

DOI: 10.1007/978-981-10-4600-1_7

40 Citations

18 Readers

Abstract

In the era of big data, data volumes are growing so rapidly that traditional clustering methods fail on such massive data sets. If the size of the data exceeds the storage capacity or memory of the system, the task of clustering becomes more complex and time intensive. To overcome this problem, this paper proposes a fast and efficient parallel bat algorithm (PBA) for data clustering using the MapReduce architecture. It is efficient because it uses an evolutionary approach to clustering rather than a traditional algorithm such as k-means, and fast because it is parallelized using Hadoop and the MapReduce architecture. The PBA algorithm works by dividing the large data set into small blocks and clustering these smaller data blocks in parallel. The proposed algorithm inherits the features of the bat algorithm to cluster the data set. It is validated on five benchmark data sets against particle swarm optimization with different numbers of nodes. Experimental results show that the PBA algorithm gives results competitive with particle swarm optimization and provides a significant speedup as the number of nodes increases.
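The block-wise evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each bat encodes a set of candidate centroids and that fitness is the total distance from every point to its nearest centroid, neither of which the abstract states. The function names (`split_into_blocks`, `map_block`, `reduce_costs`, `evaluate`) are hypothetical, the map and reduce steps run sequentially in plain Python rather than on Hadoop, and the bat position, frequency, and loudness updates are omitted.

```python
import math
from collections import defaultdict


def split_into_blocks(points, block_size):
    """Divide the data set into consecutive blocks of at most block_size points."""
    return [points[i:i + block_size] for i in range(0, len(points), block_size)]


def map_block(block, bats):
    """Map step (assumed form): for one data block, emit (bat_id, partial_cost).

    Each bat is a list of candidate centroids; the partial cost is the sum of
    distances from the block's points to their nearest centroid.
    """
    pairs = []
    for bat_id, centroids in enumerate(bats):
        cost = sum(min(math.dist(p, c) for c in centroids) for p in block)
        pairs.append((bat_id, cost))
    return pairs


def reduce_costs(pairs):
    """Reduce step (assumed form): sum the partial costs for each bat."""
    totals = defaultdict(float)
    for bat_id, cost in pairs:
        totals[bat_id] += cost
    return dict(totals)


def evaluate(points, bats, block_size):
    """Run map over every block, then reduce; on a cluster the maps run in parallel."""
    pairs = []
    for block in split_into_blocks(points, block_size):
        pairs.extend(map_block(block, bats))
    return reduce_costs(pairs)


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    bats = [
        [(0.0, 0.0), (10.0, 10.0)],  # centroids placed on the two groups
        [(5.0, 5.0), (5.0, 6.0)],    # centroids placed between the groups
    ]
    costs = evaluate(points, bats, block_size=2)
    best = min(costs, key=costs.get)
    print("cost per bat:", costs, "best bat:", best)
```

Because each block contributes an independent partial sum, the total cost per bat does not depend on how the data is split, which is what allows the blocks to be processed on separate nodes.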

Citation (APA)

Tripathi, A., Sharma, K., & Bala, M. (2018). Parallel bat algorithm-based clustering using mapreduce. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 4, pp. 73–82). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-10-4600-1_7
