K-anonymity is the most widely used technique in privacy preservation. It performs particularly well at protecting data privacy in data publication, location-based services, and social networks. In this paper, we propose a new algorithm that achieves k-anonymity through improved clustering, optimizing the clustering process by considering the overall distribution of quasi-identifier groups in a multidimensional space. Using locally optimal clustering, we aim to minimize intra-cluster distances and maximize inter-cluster distances, which greatly improves the quality of the anonymized data. Experimental comparisons with popular algorithms such as k-member, Mondrian, and one-time k-means show that our algorithm effectively reduces information loss while generating equivalence classes: the total information loss of the anonymized dataset is about 20% lower on average than that of the other algorithms. The algorithm also handles both numerical and categorical attributes well.
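To make the link between cluster tightness and information loss concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes numerical quasi-identifiers only, generalizes each cluster to per-attribute (min, max) ranges, and scores a clustering with a range-based loss (cluster size times the sum of normalized range widths). The function names, the toy records, and the loss formula are illustrative assumptions.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


def generalize_cluster(cluster, numeric_qis):
    """Replace each numeric quasi-identifier with the cluster's (min, max) range."""
    ranges = {
        q: (min(r[q] for r in cluster), max(r[q] for r in cluster))
        for q in numeric_qis
    }
    return [{**r, **ranges} for r in cluster]


def information_loss(clusters, numeric_qis, domain):
    """Assumed metric: sum over clusters of size * sum of normalized range widths."""
    total = 0.0
    for cluster in clusters:
        width = 0.0
        for q in numeric_qis:
            lo, hi = domain[q]
            values = [r[q] for r in cluster]
            width += (max(values) - min(values)) / (hi - lo)
        total += len(cluster) * width
    return total


# Hypothetical toy data: "age" and "zip" are quasi-identifiers, "disease" is sensitive.
QIS = ["age", "zip"]
records = [
    {"age": 25, "zip": 10001, "disease": "flu"},
    {"age": 27, "zip": 10003, "disease": "asthma"},
    {"age": 40, "zip": 10020, "disease": "flu"},
    {"age": 44, "zip": 10030, "disease": "diabetes"},
]
domain = {q: (min(r[q] for r in records), max(r[q] for r in records)) for q in QIS}

# Two ways to form clusters of size k = 2: nearby records vs. distant records.
tight = [records[0:2], records[2:4]]
loose = [[records[0], records[2]], [records[1], records[3]]]

anonymized = [row for c in tight for row in generalize_cluster(c, QIS)]
loss_tight = information_loss(tight, QIS, domain)
loss_loose = information_loss(loose, QIS, domain)
```

Both groupings satisfy 2-anonymity after generalization, but the grouping with smaller intra-cluster distances yields narrower ranges and therefore a lower loss under this assumed metric. Categorical attributes would need a different distance and generalization scheme, which this sketch omits.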
Zheng, W., Wang, Z., Lv, T., Ma, Y., & Jia, C. (2018). K-anonymity algorithm based on improved clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11335 LNCS, pp. 462–476). Springer Verlag. https://doi.org/10.1007/978-3-030-05054-2_36