Merely suppressing identifiers from released microdata is generally insufficient to protect privacy. It has been shown that the risk of re-identification increases with the dimensionality of the released records. Hence, sound anonymization procedures are needed for high-dimensional records. Unfortunately, most privacy models yield very poor utility when enforced on data sets with many attributes. In this paper, we propose a method based on principal component analysis (PCA) to mitigate the curse of dimensionality in anonymization. Our aim is to reduce dimensionality without incurring large utility losses. We instantiate our approach with anonymization based on differential privacy. Empirical results show that applying differential privacy to the PCA-transformed, dimensionality-reduced data set yields less information loss than applying differential privacy directly to the original data set.
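The general idea of reducing dimensionality before adding noise can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `pca_laplace_release`, the clipping bound `bound`, and the per-record Laplace perturbation are assumptions made here for illustration. The sketch also assumes that the PCA basis and the column means can be treated as non-sensitive, which a real deployment would have to justify or privatize separately.

```python
import numpy as np

def pca_laplace_release(X, k, epsilon, bound, rng):
    """Hypothetical helper: project records onto k principal components,
    perturb the projected scores with Laplace noise, and map them back.

    Assumption: the mean and the PCA basis are treated as public here;
    computing them from private data would itself consume privacy budget.
    """
    mean = X.mean(axis=0)
    Xc = X - mean

    # PCA via SVD: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                      # shape (d, k)

    # Scores in the reduced space, clipped so each coordinate has bounded range.
    Z = np.clip(Xc @ W, -bound, bound)

    # Each record contributes k coordinates, each in [-bound, bound],
    # so replacing a record changes its score vector by at most 2*bound*k in L1.
    scale = 2.0 * bound * k / epsilon
    Z_noisy = Z + rng.laplace(0.0, scale, size=Z.shape)

    # Back-project to the original attribute space.
    return Z_noisy @ W.T + mean

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
released = pca_laplace_release(X, k=2, epsilon=1.0, bound=3.0, rng=rng)
```

In this sketch the noise scale grows linearly with `k`, which is one way to see why keeping fewer components can reduce the noise needed for a given `epsilon`, at the cost of discarding the variance carried by the dropped components.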
Soria-Comas, J., & Domingo-Ferrer, J. (2019). Mitigating the Curse of Dimensionality in Data Anonymization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11676 LNAI, pp. 346–355). Springer Verlag. https://doi.org/10.1007/978-3-030-26773-5_30