The release of survey microdata files requires a preliminary assessment of the disclosure risk of the data. Record-level risk measures can be useful for “local” protection (e.g. partially synthetic data [21], or local suppression [25]), and are also used in [22] and [16] to produce global risk measures [13] useful for assessing data release. Although several proposals for estimating such risk measures are available in the literature, so far only a few attempts have been made to evaluate the statistical properties of these estimators. In this paper we carry out a simulation study that aims to evaluate those properties. Besides presenting results on the Benedetti-Franconi individual risk estimator (see [11]), we also propose a strategy for producing improved risk estimates, and assess it by simulation. The problem of estimating per-record reidentification risk shares many similarities with that of small area estimation (see [19]); we therefore propose introducing external information, taken from a previous census, into risk estimation. To achieve this we consider a simple strategy, namely Structure Preserving Estimation (SPREE) of Purcell and Kish [18], and show by simulation that this procedure provides better estimates of the individual risk of reidentification disclosure, especially for records whose risk is high.
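As a rough illustration of the SPREE idea mentioned above, the following minimal sketch rakes a two-way table of census counts to current marginal totals by iterative proportional fitting, so that the census association structure is kept while the margins are updated. The function name `spree_ipf`, the toy counts, and the final `1 / F` risk proxy are all assumptions made for illustration; they are not taken from the paper, and the proxy is a deliberate simplification, not the Benedetti-Franconi estimator.

```python
def spree_ipf(census, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rake a two-way census table to current row and column totals.

    Hypothetical helper: scaling rows and columns in turn leaves the
    cross-product (odds) ratios of the census table unchanged.
    """
    table = [[float(v) for v in row] for row in census]
    for _ in range(max_iter):
        # Scale each row to match its current target total.
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                factor = row_targets[i] / total
                table[i] = [v * factor for v in row]
        # Scale each column to match its current target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = target / total
                for row in table:
                    row[j] *= factor
        # Columns are exact after the last step; stop once rows agree too.
        error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if error < tol:
            break
    return table


# Toy data (assumed): census counts by two key variables, plus
# current margins such as might be estimated from a survey.
census = [[40, 10], [20, 30]]
fitted = spree_ipf(census, row_targets=[60, 40], col_targets=[50, 50])

# Simplified risk proxy (assumption): reciprocal of the estimated
# population frequency of each combination of key variables.
naive_risk = [[1.0 / f for f in row] for row in fitted]
```

In this sketch, cells with small fitted population frequencies receive the largest proxy values, which is the region where the abstract reports the greatest gains from using census information.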
Consiglio, L. D., & Polettini, S. (2006). Improving individual risk estimators. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4302, pp. 243–256). Springer Verlag. https://doi.org/10.1007/11930242_21