Re-identification methods for masked microdata

William E. Winkler

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 3050 216-230

DOI: 10.1007/978-3-540-25955-8_17

Abstract

Statistical agencies often mask (or distort) microdata in public-use files so that the confidentiality of information associated with individual entities is preserved. The intent of many of the masking methods is to cause only minor distortions in some of the distributions of the data and possibly no distortion in a few aggregate or marginal statistics. In record linkage (as in nearest neighbor methods), metrics are used to determine how close the value of a variable in one record is to the value of the corresponding variable in another record. If a sufficient number of variables in one record have values that are close to values in another record, then the records may be a match and correspond to the same entity. This paper shows that it is possible to create metrics for which re-identification is straightforward in many situations where masking is currently done. We begin by demonstrating how to quickly construct metrics for continuous variables that have been micro-aggregated one at a time using conventional methods. We extend the methods to situations where rank swapping is performed and discuss the situation where several continuous variables are micro-aggregated simultaneously. We close by indicating how metrics might be created for situations of synthetic microdata satisfying several sets of analytic constraints. © Springer-Verlag 2004.
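
To make the idea in the abstract concrete, here is a minimal Python sketch of one-variable-at-a-time micro-aggregation followed by a nearest-neighbor re-identification attempt. It is not the paper's method: the function names (`microaggregate`, `reidentify`, `demo`), the range-scaled absolute-difference metric, the group size `k=3`, and the synthetic Gaussian data are all assumptions chosen for illustration.

```python
import random


def microaggregate(values, k=3):
    """Mask one variable: sort it, form groups of k, replace values by group means."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[s:s + k] for s in range(0, len(order), k)]
    # Assumption: a short trailing group is folded into the previous group.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    masked = [0.0] * len(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = mean
    return masked


def reidentify(original, masked):
    """For each masked record, return the index of the closest original record."""
    n_vars = len(original[0])
    # Scale each variable by its range so no single variable dominates the metric.
    spans = []
    for j in range(n_vars):
        column = [rec[j] for rec in original]
        spans.append((max(column) - min(column)) or 1.0)

    def distance(a, b):
        # Hypothetical metric: sum of range-scaled absolute differences.
        return sum(abs(a[j] - b[j]) / spans[j] for j in range(n_vars))

    return [
        min(range(len(original)), key=lambda i: distance(original[i], m))
        for m in masked
    ]


def demo(n=200, n_vars=4, k=3, seed=0):
    """Return the share of masked records whose nearest original is the true one."""
    rng = random.Random(seed)
    original = [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n)]
    # Each variable is micro-aggregated independently of the others.
    columns = [microaggregate([rec[j] for rec in original], k) for j in range(n_vars)]
    masked = [[columns[j][i] for j in range(n_vars)] for i in range(n)]
    guesses = reidentify(original, masked)
    return sum(1 for i, g in enumerate(guesses) if g == i) / n


if __name__ == "__main__":
    print(f"share of records re-identified: {demo():.2f}")
```

Because each variable is grouped separately, a record shares its group with different records on each variable, so the combination of group means can act much like a fingerprint. The sketch only illustrates that intuition on toy data; the share it reports depends on the assumed data, metric, and `k`.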

Winkler, W. E. (2004). Re-identification methods for masked microdata. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3050, 216–230. https://doi.org/10.1007/978-3-540-25955-8_17
