Privacy-preserving K-means clustering upon negative databases

Xiaoyi Hu; Liping Lu; Dongdong Zhao; Jianwen Xiang; Xing Liu; Haiying Zhou; Shengwu Xiong; Jing Tian

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11304 LNCS 191-204

DOI: 10.1007/978-3-030-04212-7_17

Abstract

Data mining has become very popular with the arrival of the big data era, but it also raises privacy issues. The negative database (NDB) is a new type of data representation that stores the negative image of data and can protect privacy while supporting basic data mining operations such as classification and clustering. However, the existing clustering algorithm upon NDBs is based on Hamming distance. For datasets with many categories per attribute, the encoded data become very long, resulting in low computational efficiency. In this paper, we propose a privacy-preserving k-means clustering algorithm upon NDBs that is based on Euclidean distance. The main step of the k-means algorithm is to calculate the distance between each record and the cluster centers. To prevent privacy disclosure in this step, we transform each record in the database into an NDB and propose a method to estimate the Euclidean distance between a binary string and an NDB. Our work opens up new ideas for data mining upon negative databases.
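For orientation, the distance step that the abstract identifies as the main step of k-means can be sketched on plaintext records as follows. This is a minimal illustration of ordinary k-means with exact Euclidean distance, not the paper's method: the abstract does not describe how the distance is estimated from an NDB, so that part is not reproduced here. The function names (`euclidean`, `assign`, `update`, `kmeans`) and the sample data are hypothetical.

```python
import math


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(records, centers):
    """Assignment step: give each record the index of its nearest center."""
    labels = []
    for record in records:
        distances = [euclidean(record, center) for center in centers]
        labels.append(distances.index(min(distances)))
    return labels


def update(records, labels, centers):
    """Update step: move each center to the mean of its assigned records."""
    new_centers = []
    for k, old_center in enumerate(centers):
        members = [r for r, label in zip(records, labels) if label == k]
        if not members:
            # Keep the old center if no record was assigned to it.
            new_centers.append(old_center)
            continue
        dim = len(old_center)
        mean = tuple(sum(m[i] for m in members) / len(members) for i in range(dim))
        new_centers.append(mean)
    return new_centers


def kmeans(records, centers, iterations=10):
    """Plain k-means on plaintext data (assumption: no NDB is involved)."""
    centers = [tuple(c) for c in centers]
    labels = assign(records, centers)
    for _ in range(iterations):
        new_centers = update(records, labels, centers)
        if new_centers == centers:
            break  # Centers stopped moving, so the assignment is stable.
        centers = new_centers
        labels = assign(records, centers)
    return labels, centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

In a privacy-preserving variant of the kind the abstract describes, the call to `euclidean` inside `assign` is the place where an exact distance on a raw record would be replaced by an estimate computed from that record's NDB.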

Cite

Hu, X., Lu, L., Zhao, D., Xiang, J., Liu, X., Zhou, H., … Tian, J. (2018). Privacy-preserving K-means clustering upon negative databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11304 LNCS, pp. 191–204). Springer Verlag. https://doi.org/10.1007/978-3-030-04212-7_17
