Data perturbation with random noise signals has been shown to be useful for data hiding in privacy-preserving data mining. Perturbation methods based on additive randomization allow accurate estimation of the Probability Density Function (PDF) via the Expectation-Maximization (EM) algorithm, but it has been shown that noise-filtering techniques can be used to reconstruct the original data in many cases, leading to security breaches. In this paper, we propose a generic PDF reconstruction algorithm that can be used with non-additive (and additive) randomization techniques for the purpose of privacy-preserving data mining. This two-step reconstruction algorithm is based on Parzen-Window reconstruction and Quadratic Programming over a convex set, the probability simplex. Our algorithm eliminates the usual need for the iterative EM algorithm and is generic for most randomization models. The simplicity of our two-step reconstruction algorithm, without iteration, also makes it attractive for use when dealing with large datasets. © Springer-Verlag Berlin Heidelberg 2007.
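As a rough illustration of the two-step idea, the following Python sketch first estimates the density of the perturbed data with a Gaussian Parzen window and then fits a probability mass vector over candidate original values by least squares constrained to the probability simplex. It is not the authors' formulation: the function names, the discretization into `bin_centers`, the evaluation `grid`, the `cond_density(z, x)` noise model, and the bandwidth are all assumptions made for this example. The quadratic program is solved here with a plain projected-gradient loop only to keep the example free of dependencies; a dedicated QP solver would be the natural substitute.

```python
import math
import random


def parzen_density(samples, points, bandwidth):
    """Step 1: Gaussian-kernel Parzen-window density estimate at each point."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((z - s) / bandwidth) ** 2) for s in samples) / norm
        for z in points
    ]


def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1}."""
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def reconstruct_pmf(perturbed, bin_centers, cond_density, grid, bandwidth, steps=500):
    """Step 2: minimize ||A p - b||^2 over the probability simplex.

    b[i]    = Parzen estimate of the perturbed density at grid[i]
    A[i][j] = assumed density of a perturbed value grid[i] given original
              value bin_centers[j] (must not be zero everywhere on the grid)
    """
    b = parzen_density(perturbed, grid, bandwidth)
    A = [[cond_density(z, x) for x in bin_centers] for z in grid]
    n = len(bin_centers)
    # Step size 1/L, using 2 * ||A||_F^2 as an upper bound on the
    # Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(a * a for row in A for a in row))
    p = [1.0 / n] * n
    for _ in range(steps):
        residual = [sum(a * q for a, q in zip(row, p)) - b_i
                    for row, b_i in zip(A, b)]
        grad = [2.0 * sum(A[i][j] * residual[i] for i in range(len(grid)))
                for j in range(n)]
        p = project_to_simplex([q - step * g for q, g in zip(p, grad)])
    return p


# Hypothetical non-additive example: Z = X * E with E ~ Uniform(0.8, 1.2).
def multiplicative_uniform(z, x):
    return 1.0 / (0.4 * x) if 0.8 * x <= z <= 1.2 * x else 0.0


random.seed(1)
originals = random.choices([1.0, 2.0, 3.0], weights=[0.2, 0.5, 0.3], k=300)
perturbed = [x * random.uniform(0.8, 1.2) for x in originals]
grid = [0.5 + 0.1 * i for i in range(36)]
estimate = reconstruct_pmf(perturbed, [1.0, 2.0, 3.0],
                           multiplicative_uniform, grid, bandwidth=0.1)
print([round(q, 3) for q in estimate])
```

Because only `cond_density` encodes the randomization model, swapping in an additive or other noise model changes a single function, which is the sense in which such a reconstruction can be called generic.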
CITATION STYLE
Tan, V. Y. F., & Ng, S. K. (2007). Generic probability density function reconstruction for randomization in privacy-preserving data mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4571 LNAI, pp. 76–90). Springer Verlag. https://doi.org/10.1007/978-3-540-73499-4_7