k-anonymity is an approach for enabling privacy-preserving data publishing of personal, sensitive data. As a result of this anonymisation process, the utility of the sanitised data is generally lower than that of the original data. Quantifying this utility loss is therefore important for estimating the usefulness of the resulting datasets. In this paper, we analyse several of these utility aspects. Data utility can be measured as a direct property of the resulting anonymised dataset, or via the effectiveness that a statistical analysis, such as a machine learning model, achieves on this dataset compared to the original data. While the latter is more tailored to the specific dataset, it is also generally less efficient to compute. We therefore analyse whether there is a correlation between these two types of measures, and whether the measurement of effectiveness can be substituted by a measurement of the data properties. Further, we evaluate to what extent different solutions for the same level of k-anonymity differ with regard to effectiveness.
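To make the notion of a utility measure that is "a direct property of the anonymised dataset" concrete, the following is a minimal sketch in Python. It is an illustration only and not the authors' implementation: the function names, the toy records, and the choice of quasi-identifiers are all assumptions. It checks whether a table is k-anonymous (every combination of quasi-identifier values is shared by at least k records) and computes a simplified discernibility score, taken here as the sum of squared equivalence-class sizes with suppressed records ignored. The abstract does not state which data-property measures the paper uses, so this metric is only one possible example.

```python
from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(n >= k for n in sizes.values())

def discernibility(records, quasi_identifiers):
    """Simplified discernibility score: sum of squared class sizes.

    Lower values mean records are less 'blurred' together. Suppressed
    records are not modelled in this sketch.
    """
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return sum(n * n for n in sizes.values())

# Hypothetical, already generalised toy data (assumed for illustration).
QUASI_IDENTIFIERS = ["age", "zip"]
records = [
    {"age": "30-39", "zip": "101**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "101**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "102**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "102**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "102**", "diagnosis": "flu"},
]

two_anonymous = is_k_anonymous(records, QUASI_IDENTIFIERS, 2)
three_anonymous = is_k_anonymous(records, QUASI_IDENTIFIERS, 3)
score = discernibility(records, QUASI_IDENTIFIERS)
```

A score of this kind needs only one pass over the anonymised table. The effectiveness-based alternative would instead require training and evaluating a model on both the original and the anonymised data, which is why the abstract describes it as the less efficient option.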
CITATION STYLE
Šarčević, T., Molnar, D., & Mayer, R. (2020). An Analysis of Different Notions of Effectiveness in k-Anonymity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12276 LNCS, pp. 121–135). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-57521-2_9