A Python library to check the level of anonymity of a dataset

8Citations
Citations of this article
13Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Openly sharing data with sensitive attributes and privacy restrictions is a challenging task. In this document we present the implementation of pyCANON, a Python library and command line interface (CLI) to check and assess the level of anonymity of a dataset through some of the most common anonymization techniques: k-anonymity, (α,k)-anonymity, ℓ-diversity, entropy ℓ-diversity, recursive (c,ℓ)-diversity, t-closeness, basic β-likeness, enhanced β-likeness and δ-disclosure privacy. For the case of more than one sensitive attribute, two approaches are proposed for evaluating these techniques. The main strength of this library is to obtain a full report of the parameters that are fulfilled for each of the techniques mentioned above, with the unique requirement of the set of quasi-identifiers and sensitive attributes. The methods implemented are presented together with the attacks they prevent, the description of the library, examples of the different functions’ usage, as well as the impact and the possible applications that can be developed. Finally, some possible aspects to be incorporated in future updates are proposed.

Cite

CITATION STYLE

APA

Sáinz-Pardo Díaz, J., & López García, Á. (2022). A Python library to check the level of anonymity of a dataset. Scientific Data, 9(1). https://doi.org/10.1038/s41597-022-01894-2

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free