Big data are often characterized by huge volume and a mix of attribute types, namely numerical and categorical. To cluster such data efficiently, this paper proposes an accelerated MapReduce-based k-prototypes method. The proposed method uses a pruning strategy to speed up clustering by avoiding unnecessary distance computations between cluster centers and data points. Experiments on huge synthetic and real data sets show that the proposed method is scalable and more efficient than the existing MapReduce-based k-prototypes method.
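The abstract does not say which pruning rule is used, so the following is only an illustrative sketch of one common way to skip point-to-center distance computations: a triangle-inequality test. It assumes a mixed distance defined as Euclidean distance on numerical attributes plus a `gamma`-weighted count of categorical mismatches, which is a metric. Standard k-prototypes uses squared Euclidean distance, for which this bound does not directly apply. The function names, the `gamma` parameter, and the data layout are hypothetical, and the MapReduce layer is omitted.

```python
import math

def mixed_distance(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Euclidean distance on numerical attributes plus gamma times the
    number of categorical mismatches (assumed form; a metric)."""
    numeric = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_num, c_num)))
    mismatches = sum(1 for a, b in zip(x_cat, c_cat) if a != b)
    return numeric + gamma * mismatches

def center_distances(centers, gamma=1.0):
    """Pairwise center-to-center distances, computed once per iteration."""
    k = len(centers)
    return [[mixed_distance(*centers[i], *centers[j], gamma) for j in range(k)]
            for i in range(k)]

def assign_with_pruning(point, centers, cc, gamma=1.0):
    """Return (index of nearest center, number of distances computed)."""
    x_num, x_cat = point
    best = 0
    best_dist = mixed_distance(x_num, x_cat, *centers[0], gamma)
    computed = 1
    for j in range(1, len(centers)):
        # Triangle inequality: if d(c_best, c_j) >= 2 * d(x, c_best),
        # then d(x, c_j) >= d(x, c_best), so center j cannot be closer.
        if cc[best][j] >= 2 * best_dist:
            continue
        d = mixed_distance(x_num, x_cat, *centers[j], gamma)
        computed += 1
        if d < best_dist:
            best, best_dist = j, d
    return best, computed

# Each center and point is a (numerical tuple, categorical tuple) pair.
centers = [((0.0, 0.0), ("a",)), ((10.0, 0.0), ("b",)), ((0.0, 10.0), ("a",))]
cc = center_distances(centers)
near_first = assign_with_pruning(((1.0, 0.0), ("a",)), centers, cc)
near_second = assign_with_pruning(((9.0, 0.0), ("b",)), centers, cc)
```

In this toy example the first point is assigned after a single distance computation, because both remaining centers are pruned by the test. The saving comes from reusing the center-to-center distances across all points in an iteration.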
CITATION STYLE
HajKacem, M. A. B., Ben N’cir, C. E., & Essoussi, N. (2016). An accelerated mapreduce-based K-prototypes for big data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9946 LNCS, pp. 13–25). Springer Verlag. https://doi.org/10.1007/978-3-319-50230-4_2