Releasing raw data sets that contain sensitive personal information can compromise individual privacy. Various differential privacy methods have therefore been proposed to share data efficiently while preserving privacy. However, these methods add noise to all quasi-identifier attributes, which results in high time and space complexity and low data utility. In this paper, we propose a Differential Privacy Protection model that considers the Correlations between Attributes, denoted DPPCA. DPPCA first computes the degree of correlation between the quasi-identifier attributes and the sensitive attributes, and determines the pair of attributes with the maximal degree of correlation. Then, based on the attributes with the maximal degree of correlation, it uses microaggregation to partition the data set into clusters of size k (k ≥ 2) according to three types of attributes, i.e., numerical, non-numerical, and hybrid attributes, such that each cluster contains l (l < k) values of the sensitive attributes. Finally, noise is added to each cluster separately so that each cluster satisfies ϵ-differential privacy. Our experimental results demonstrate that, while keeping the same degree of privacy preservation, DPPCA substantially reduces the amount of added noise, to 11% for the Census data set and the Adult data set. DPPCA therefore greatly improves data utility while reaching the same degree of differential privacy.
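The three stages described above (correlation ranking, microaggregation, per-cluster noise) can be illustrated with a minimal sketch for a single numerical attribute. This is not the authors' implementation. Several choices are assumptions made only for illustration: Pearson correlation as the correlation measure, sorting-based fixed-size grouping as the microaggregation step, publishing a noisy cluster mean, and the function names `pearson`, `most_correlated`, `microaggregate`, and `publish`. The l-value condition on sensitive attributes and the non-numerical and hybrid cases are omitted. The sensitivity used here, (upper - lower) / cluster size, treats the clusters as fixed, so the sketch alone does not establish a formal ϵ-differential privacy guarantee.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def most_correlated(quasi_identifiers, sensitive):
    """Name of the quasi-identifier column with the largest |correlation| to `sensitive`.

    Assumption: Pearson correlation stands in for the paper's correlation measure.
    """
    return max(quasi_identifiers,
               key=lambda name: abs(pearson(quasi_identifiers[name], sensitive)))


def microaggregate(values, k):
    """Group record indices into clusters of at least k records with similar values.

    Assumption: a simple sort-and-cut heuristic; a short trailing group is merged
    into the previous one so that no cluster has fewer than k records.
    """
    order = sorted(range(len(values)), key=values.__getitem__)
    clusters = [order[i:i + k] for i in range(0, len(order), k)]
    if len(clusters) > 1 and len(clusters[-1]) < k:
        clusters[-2].extend(clusters.pop())
    return clusters


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish(values, k, epsilon, lower, upper, rng):
    """Replace each value by its cluster mean plus Laplace noise.

    Assumption: values lie in [lower, upper], so changing one record moves a
    fixed cluster's mean by at most (upper - lower) / cluster_size.
    """
    clusters = microaggregate(values, k)
    released = [0.0] * len(values)
    for cluster in clusters:
        centroid = sum(values[i] for i in cluster) / len(cluster)
        sensitivity = (upper - lower) / len(cluster)
        noisy = centroid + laplace_noise(sensitivity / epsilon, rng)
        for i in cluster:
            released[i] = noisy
    return released, clusters


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the Census or Adult data sets.
    qi = {"age": [23, 35, 47, 52, 61], "zip_digit": [4, 1, 3, 2, 5]}
    income = [20, 34, 45, 55, 60]
    best = most_correlated(qi, income)
    released, clusters = publish(qi[best], k=2, epsilon=1.0,
                                 lower=0, upper=100, rng=random.Random(0))
    print(best, clusters, released)
```

Because the noise scale shrinks with cluster size, larger clusters receive less noise per released mean, which is the intuition behind adding noise per cluster rather than per record.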
Yang, G., Ye, X., Fang, X., Wu, R., & Wang, L. (2020). Associated attribute-aware differentially private data publishing via microaggregation. IEEE Access, 8, 79158–79168. https://doi.org/10.1109/ACCESS.2020.2990296