A comparative analysis of data privacy and utility parameter adjustment, using machine learning classification as a gauge

Abstract

During the data privacy process, the utility of datasets diminishes as sensitive information such as personally identifiable information (PII) is removed, transformed, or distorted to achieve confidentiality. The difficulty of attaining an equilibrium between data privacy and utility needs is well documented: it requires trade-offs, and making such trade-offs itself remains problematic. Given this complexity, in this paper we empirically investigate which parameters could be fine-tuned to achieve an acceptable level of data privacy and utility during the data privacy process, while making reasonable trade-offs. We present the comparative classification error gauge (Comparative x-CEG) approach, a data utility quantification concept that employs machine learning classification techniques to gauge data utility based on the classification error. In this approach, privatized datasets are passed through a series of classifiers, each of which returns a classification error, and the classifier with the lowest classification error is chosen. If that classification error is lower than or equal to a set threshold, then better utility might be achieved; otherwise, the data privacy parameters are adjusted with respect to the chosen classifier. The process repeats x times until the desired threshold is reached. The goal is to generate empirical results over a range of parameter adjustments in the data privacy process, from which a threshold level might be chosen to make trade-offs. Our preliminary results show that, given a range of empirical results, it might be possible to choose a trade-off point and publish privacy-compliant data with an acceptable level of utility. © 2013 The Authors. Published by Elsevier B.V.
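The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the synthetic two-class data, the use of additive Gaussian noise as the privatization step, the two toy classifiers (nearest centroid and 1-nearest-neighbor), the fixed noise reduction per round, and every function and parameter name are assumptions made for the example. Re-evaluating every classifier in each round is also a simplification.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def make_data(n=200, seed=0):
    """Hypothetical dataset: two Gaussian clusters with alternating labels."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data

def privatize(data, noise_scale, seed=1):
    """Assumed privacy step: add zero-mean Gaussian noise to every feature."""
    rng = random.Random(seed)
    return [([v + rng.gauss(0.0, noise_scale) for v in x], y) for x, y in data]

def centroid_classifier(train):
    """Fit a nearest-centroid classifier and return its predict function."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))

def nearest_neighbor_classifier(train):
    """Return a 1-nearest-neighbor predict function over the training set."""
    return lambda x: min(train, key=lambda t: sq_dist(x, t[0]))[1]

def classification_error(fit, train, test):
    """Fraction of test records the fitted classifier labels incorrectly."""
    predict = fit(train)
    return sum(1 for x, y in test if predict(x) != y) / len(test)

def comparative_x_ceg(data, noise_scale, threshold, x_max, step=0.5):
    """Privatize, score all classifiers, keep the lowest error, and adjust
    the privacy parameter until the threshold is met or x_max rounds pass."""
    classifiers = {
        "centroid": centroid_classifier,
        "1nn": nearest_neighbor_classifier,
    }
    history = []
    for _ in range(x_max):
        private = privatize(data, noise_scale)
        half = len(private) // 2
        train, test = private[:half], private[half:]
        errors = {name: classification_error(fit, train, test)
                  for name, fit in classifiers.items()}
        best = min(errors, key=errors.get)
        history.append((noise_scale, best, errors[best]))
        if errors[best] <= threshold:
            break  # utility is acceptable at this privacy setting
        # Assumed adjustment rule: weaken the perturbation by a fixed step.
        noise_scale = max(0.0, noise_scale - step)
    return history

if __name__ == "__main__":
    rounds = comparative_x_ceg(make_data(), noise_scale=4.0,
                               threshold=0.1, x_max=10)
    for scale, name, err in rounds:
        print(f"noise={scale:.1f} best={name} error={err:.3f}")
```

Each entry in the returned history pairs a privacy setting with the lowest classification error observed at that setting, which is the kind of empirical range from which a trade-off point might be picked.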

Cite

Mivule, K., & Turner, C. (2013). A comparative analysis of data privacy and utility parameter adjustment, using machine learning classification as a gauge. In Procedia Computer Science (Vol. 20, pp. 414–419). Elsevier B.V. https://doi.org/10.1016/j.procs.2013.09.295