Differential identifiability clustering algorithms for big data analysis

Abstract

Preserving individual privacy has become an important issue as big data technology has developed. The definition of ρ-differential identifiability (DI) precisely matches legal definitions of privacy, which gives practitioners a simple way to set privacy parameters based on the concept of individual identifiability. However, differential identifiability has so far been applied only to simple queries and implemented with the Laplace mechanism, which is not sufficient for the more complex privacy-preservation problems that arise in big data analysis. In this paper, we propose a new exponential mechanism and composition properties for differential identifiability, and then apply differential identifiability to the k-means and k-prototypes algorithms on the MapReduce framework. The DI k-means algorithm uses the usual Laplace mechanism and composition properties for numerical databases, while the DI k-prototypes algorithm uses the new exponential mechanism and composition properties for mixed databases. The experimental results show that both the DI k-means and DI k-prototypes algorithms satisfy differential identifiability.
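To make the Laplace-based step concrete, the following is a minimal Python sketch of a single k-means centroid update in which Laplace noise is added to each cluster's coordinate sums and member count. It is an illustration only, not the authors' algorithm: the function name `noisy_centroid_update` and its parameters are hypothetical, the Laplace scale `scale` is taken as an input because the abstract does not state how it is calibrated from ρ, and the MapReduce partitioning is omitted.

```python
import numpy as np


def noisy_centroid_update(points, labels, k, scale, rng):
    """One k-means centroid update with Laplace noise on sums and counts.

    Assumption: `scale` is the Laplace scale parameter chosen by the caller.
    Its derivation from the privacy parameter rho is not given in the
    abstract and is deliberately left out of this sketch.
    """
    dim = points.shape[1]
    centroids = np.zeros((k, dim))
    for j in range(k):
        members = points[labels == j]
        # Perturb the per-coordinate sum and the member count separately.
        noisy_sum = members.sum(axis=0) + rng.laplace(0.0, scale, size=dim)
        noisy_count = len(members) + rng.laplace(0.0, scale)
        # Guard against a tiny or negative noisy count.
        centroids[j] = noisy_sum / max(noisy_count, 1.0)
    return centroids


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = np.array([[0.0, 0.0], [2.0, 2.0], [10.0, 10.0], [12.0, 12.0]])
    lbl = np.array([0, 0, 1, 1])
    print(noisy_centroid_update(pts, lbl, k=2, scale=0.5, rng=rng))
```

In a full clustering run this update would be repeated over several iterations, so the noise added in each iteration has to be accounted for together; that is the role the abstract assigns to the composition properties.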

Citation (APA)

Shang, T., Zhao, Z., Ren, X., & Liu, J. (2021). Differential identifiability clustering algorithms for big data analysis. Science China Information Sciences, 64(5). https://doi.org/10.1007/s11432-020-2910-1
