A DP Canopy K-Means Algorithm for Privacy Preservation of Hadoop Platform

Abstract

The K-means algorithm for data mining has been combined with differential privacy (DP) preservation. Although this improves data security, the number of clusters and the initial center points are still chosen blindly and at random. In this paper, we integrate an optimized Canopy algorithm with the DP K-means algorithm and apply it to the Hadoop platform. First, we optimize the Canopy algorithm according to the minimum and maximum principle and implement it with the MapReduce framework. Second, we use the resulting number and set of center points to implement the DP K-means algorithm on MapReduce. As a result, the improved Canopy algorithm can optimize the selection of the number of clusters and of the initial centers on the Hadoop platform, so the proposed K-means algorithm can improve security, usability, and computational efficiency.
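The two stages described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' MapReduce implementation, and the abstract does not give their exact rules, so several details are assumptions: the function names `max_min_centers`, `laplace`, and `dp_kmeans` are hypothetical; the max-min step is approximated by farthest-first selection with a distance `threshold` as the stopping rule; points are assumed to be scaled to [0, 1] in every dimension; and the privacy budget `epsilon` is assumed to be split evenly across iterations, with Laplace noise of sensitivity `dim + 1` added to each cluster's sum and count.

```python
import math
import random


def max_min_centers(points, threshold):
    """Pick initial centers farthest-first (an assumed reading of the
    max-min principle). Stops when the farthest remaining point is
    closer than `threshold` to an already chosen center."""
    centers = [points[0]]
    while len(centers) < len(points):
        best_point, best_dist = None, -1.0
        for p in points:
            # Distance from p to its nearest chosen center.
            d = min(math.dist(p, c) for c in centers)
            if d > best_dist:
                best_point, best_dist = p, d
        if best_dist < threshold:
            break
        centers.append(best_point)
    return centers


def laplace(rng, scale):
    """Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_kmeans(points, centers, epsilon, iterations, rng=None):
    """K-means with Laplace noise on per-cluster sums and counts.
    Assumes every coordinate lies in [0, 1]."""
    rng = rng or random.Random()
    dim = len(points[0])
    # Assumed budget split: epsilon / iterations per round, sensitivity dim + 1.
    scale = (dim + 1) / (epsilon / iterations)
    for _ in range(iterations):
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        for p in points:
            j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            counts[j] += 1
            for a in range(dim):
                sums[j][a] += p[a]
        updated = []
        for j, old in enumerate(centers):
            noisy_count = counts[j] + laplace(rng, scale)
            if noisy_count < 1.0:
                # Too little (noisy) mass to estimate a mean; keep the old center.
                updated.append(old)
                continue
            # Noisy mean, clamped back into the assumed [0, 1] data range.
            updated.append(tuple(
                min(1.0, max(0.0, (sums[j][a] + laplace(rng, scale)) / noisy_count))
                for a in range(dim)
            ))
        centers = updated
    return centers
```

Calling `max_min_centers(points, threshold)` first and passing its result to `dp_kmeans` mirrors the order in the abstract: the Canopy stage fixes both the number of clusters and the starting centers, so the noisy K-means stage no longer depends on a random initialization. On small datasets the noise dominates, so the output is only meaningful when clusters contain many points relative to `scale`.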

Cite

Shang, T., Zhao, Z., Guan, Z., & Liu, J. (2017). A DP Canopy K-means algorithm for privacy preservation of Hadoop platform. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10581 LNCS, pp. 189–198). Springer Verlag. https://doi.org/10.1007/978-3-319-69471-9_14
