Privacy-preserving K-means clustering upon negative databases

Abstract

Data mining has become very popular with the arrival of the big data era, but it also raises privacy issues. A negative database (NDB) is a new type of data representation that stores the negative image of the data; it can protect privacy while supporting basic data mining operations such as classification and clustering. However, the existing clustering algorithm upon NDBs is based on Hamming distance. For datasets with many categories per attribute, the encoded data become very long, which results in low computational efficiency. In this paper, we propose a privacy-preserving k-means clustering algorithm upon NDBs based on Euclidean distance. The main step of the k-means algorithm is to calculate the distance between each record and the cluster centers. To avoid privacy disclosure in this step, we transform each record in the database into an NDB and propose a method to estimate the Euclidean distance from a binary string and an NDB. Our work opens up new ideas for data mining upon negative databases.
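
To make the distance step concrete, here is a minimal sketch of plain k-means with Euclidean distance in Python. It works on unprotected numeric records and does not reproduce the paper's NDB transformation or its distance estimation method, neither of which the abstract specifies. The function names (`euclidean`, `assign`, `update`, `kmeans`) and the sample data are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(records, centers):
    """Return, for each record, the index of its nearest cluster center."""
    return [
        min(range(len(centers)), key=lambda j: euclidean(r, centers[j]))
        for r in records
    ]


def update(records, labels, centers):
    """Recompute each center as the mean of the records assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [r for r, label in zip(records, labels) if label == j]
        if not members:
            # Keep the old center if no record was assigned to it.
            new_centers.append(old)
            continue
        new_centers.append(
            [sum(m[d] for m in members) / len(members) for d in range(len(old))]
        )
    return new_centers


def kmeans(records, centers, iterations=10):
    """Run a fixed number of assign/update rounds (plain data, no NDB)."""
    labels = assign(records, centers)
    for _ in range(iterations):
        centers = update(records, labels, centers)
        labels = assign(records, centers)
    return labels, centers


# Hypothetical sample data: two well-separated groups in two dimensions.
records = [[0, 0], [0, 1], [10, 10], [10, 11]]
labels, centers = kmeans(records, centers=[[0, 0], [10, 10]])
print(labels, centers)
```

In this sketch the `euclidean` call inside `assign` reads every record in the clear, which is the step where the abstract says privacy disclosure can occur. In the proposed approach, that computation would instead be an estimate derived from a binary string and an NDB.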

Hu, X., Lu, L., Zhao, D., Xiang, J., Liu, X., Zhou, H., … Tian, J. (2018). Privacy-preserving K-means clustering upon negative databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11304 LNCS, pp. 191–204). Springer Verlag. https://doi.org/10.1007/978-3-030-04212-7_17
