Differential identifiability clustering algorithms for big data analysis

Tao Shang; Zheng Zhao; Xujie Ren; Jianwei Liu

Journal Article, Open Access

Science China Information Sciences (2021) 64(5)

DOI: 10.1007/s11432-020-2910-1

Abstract

Individual privacy preservation has become an important issue with the development of big data technology. The definition of ρ-differential identifiability (DI) precisely matches the legal definitions of privacy, and it gives practitioners an easy parameterization approach: they can set privacy parameters based on the privacy concept of individual identifiability. However, differential identifiability is currently applied only to some simple queries and achieved by the Laplace mechanism, which cannot handle complex privacy preservation issues in big data analysis. In this paper, we propose a new exponential mechanism and composition properties of differential identifiability, and then apply differential identifiability to the k-means and k-prototypes algorithms on the MapReduce framework. The DI k-means algorithm uses the usual Laplace mechanism and composition properties for numerical databases, while the DI k-prototypes algorithm uses the new exponential mechanism and composition properties for mixed databases. The experimental results show that both the DI k-means and DI k-prototypes algorithms satisfy differential identifiability.
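To make the idea of a Laplace-perturbed k-means update concrete, the following is a minimal sketch and not the authors' algorithm. It adds Laplace noise to each cluster's coordinate sums and point count before computing the new centroid. The function names (`laplace_noise`, `noisy_centroid`, `assign`, `noisy_kmeans`) and the `scale` parameter are hypothetical. In particular, the abstract does not state how ρ is converted into a noise scale or how the privacy budget is split across iterations, so `scale` is left as a free input here. The split into an assignment step and an aggregation step is only loosely analogous to a map and a reduce phase.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale); requires scale > 0."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_centroid(points, scale, rng):
    """Centroid computed from a Laplace-perturbed sum and count.

    `scale` is a placeholder: deriving it from rho is NOT shown here.
    """
    dim = len(points[0])
    # Perturb the count and keep it at least 1 to avoid dividing by ~0.
    noisy_count = max(len(points) + laplace_noise(scale, rng), 1.0)
    noisy_sums = [
        sum(p[d] for p in points) + laplace_noise(scale, rng)
        for d in range(dim)
    ]
    return [s / noisy_count for s in noisy_sums]


def assign(points, centroids):
    """Assign each point to its nearest centroid (squared Euclidean)."""
    clusters = [[] for _ in centroids]
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        clusters[dists.index(min(dists))].append(p)
    return clusters


def noisy_kmeans(points, centroids, scale, iterations, seed=0):
    """Run a fixed number of k-means rounds with noisy centroid updates."""
    rng = random.Random(seed)
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Empty clusters keep their previous centroid.
        centroids = [
            noisy_centroid(c, scale, rng) if c else list(old)
            for c, old in zip(clusters, centroids)
        ]
    return centroids
```

A larger `scale` gives stronger perturbation and less accurate centroids; with a very small `scale` the result approaches ordinary k-means.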

Cite

Shang, T., Zhao, Z., Ren, X., & Liu, J. (2021). Differential identifiability clustering algorithms for big data analysis. Science China Information Sciences, 64(5). https://doi.org/10.1007/s11432-020-2910-1
