Semantic anonymisation of categorical datasets

Sergio Martínez; Aida Valls; David Sánchez

Journal Article

Studies in Computational Intelligence, vol. 567 (2015), pp. 111-128

DOI: 10.1007/978-3-319-09885-2_7

Abstract

The exploitation of microdata compiled by statistical agencies is of great interest for the data mining community. However, such data often include sensitive information that can be directly or indirectly related to individuals. Hence, an appropriate anonymisation process is needed to minimise the risk of disclosing identities and/or confidential data. In the past, many anonymisation methods have been developed to deal with numerical data, but approaches tackling the anonymisation of non-numerical values (e.g. categorical, textual) are scarce and shallow. Since the utility of this kind of information is closely related to the preservation of its meaning, in this work, the notion of semantic similarity is used to enable a semantically coherent anonymisation. The knowledge modelled in ontologies is used as the basic pillar to propose semantic operators that enable an accurate management and transformation of categorical attributes. These operators are then used in three anonymisation mechanisms: Semantic Recoding, Semantic and Adaptive Microaggregation and Semantic Resampling. The three algorithms are compared in terms of semantic utility, privacy disclosure risk and runtime, with encouraging results.
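
The abstract's central idea, measuring how close categorical values are through the knowledge in an ontology, can be illustrated with a minimal sketch. This is not the chapter's actual operators: the toy taxonomy, the edge-counting distance and the function names `ancestors`, `distance` and `centroid` are all assumptions made for illustration. It only shows how a group of categorical values might be replaced by the taxonomy term that is semantically closest to all of them.

```python
# Hypothetical toy taxonomy: child -> parent (None marks the root).
# A real setting would draw on a full ontology instead.
PARENT = {
    "flu": "infection",
    "pneumonia": "infection",
    "infection": "disease",
    "asthma": "respiratory_disorder",
    "respiratory_disorder": "disease",
    "disease": None,
}


def ancestors(term):
    """Return the path from `term` up to the root, `term` included."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def distance(a, b):
    """Edge-counting distance: edges from a and b to their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    steps_in_b = {t: i for i, t in enumerate(path_b)}
    for i, t in enumerate(path_a):
        if t in steps_in_b:
            return i + steps_in_b[t]
    raise ValueError("terms share no common ancestor")


def centroid(values):
    """Taxonomy term with the smallest summed distance to all values.

    Ties are broken alphabetically, an arbitrary choice made for this sketch.
    """
    return min(sorted(PARENT), key=lambda c: sum(distance(c, v) for v in values))


group = ["flu", "pneumonia", "asthma"]
representative = centroid(group)
```

Here `representative` is a single term that could stand in for every value in `group`, so the records become indistinguishable on that attribute while staying as close as possible to their original meaning under the assumed distance.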

Citation (APA)

Martínez, S., Valls, A., & Sánchez, D. (2015). Semantic anonymisation of categorical datasets. Studies in Computational Intelligence, 567, 111–128. https://doi.org/10.1007/978-3-319-09885-2_7
