A comparative analysis of data privacy and utility parameter adjustment, using machine learning classification as a gauge

Kato Mivule; Claude Turner

Conference Proceedings, Open Access

Procedia Computer Science (2013), Vol. 20, pp. 414–419

DOI: 10.1016/j.procs.2013.09.295

15 Citations

33 Readers

Abstract

During the data privacy process, the utility of datasets diminishes as sensitive information such as personally identifiable information (PII) is removed, transformed, or distorted to achieve confidentiality. The intractability of attaining an equilibrium between data privacy and utility needs is well documented; it requires trade-offs, and is further complicated by the fact that making such trade-offs also remains problematic. Given this complexity, in this paper we empirically investigate which parameters could be fine-tuned to achieve an acceptable level of data privacy and utility during the data privacy process, while making reasonable trade-offs. To that end, we present the comparative classification error gauge (Comparative x-CEG) approach, a data utility quantification concept that employs machine learning classification techniques to gauge data utility based on the classification error. In this approach, privatized datasets are passed through a series of classifiers, each of which returns a classification error, and the classifier with the lowest classification error is chosen. If that classification error is lower than or equal to a set threshold, then better utility might be achieved; otherwise, the data privacy parameters are adjusted and the chosen classifier is applied again. The process repeats x times until the desired threshold is reached. The goal is to generate empirical results over a range of parameter adjustments in the data privacy process, from which a threshold level might be chosen to make trade-offs. Our preliminary results show that, given a range of empirical results, it might be possible to choose a trade-off point and publish privacy-compliant data with an acceptable level of utility. © 2013 The Authors. Published by Elsevier B.V.
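
The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the synthetic one-feature dataset, additive Gaussian noise as the privacy step, the two toy classifiers, the fixed noise reduction per round, and all function names (`make_data`, `privatize`, `comparative_x_ceg`). The abstract can also be read as re-running only the chosen classifier after each adjustment; this sketch re-runs every classifier each round and records the best one.

```python
import random
import statistics

def make_data(n=200, seed=0):
    """Synthetic two-class data with one numeric feature (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(3.0 * label, 1.0), label))
    return data

def privatize(data, noise_sd, seed=1):
    """Assumed privacy step: add Gaussian noise with standard deviation noise_sd."""
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, noise_sd), y) for x, y in data]

def centroid_classifier(train):
    """Predict the class whose training mean is closest to x."""
    means = {c: statistics.mean(x for x, y in train if y == c) for c in (0, 1)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def nearest_neighbor_classifier(train):
    """Predict the label of the single closest training point."""
    return lambda x: min(train, key=lambda p: abs(x - p[0]))[1]

def error_rate(build, train, test):
    """Fraction of test points the classifier built from train gets wrong."""
    predict = build(train)
    return sum(predict(x) != y for x, y in test) / len(test)

def comparative_x_ceg(data, noise_sd, threshold, x=10, step=0.5):
    """Repeat up to x times: privatize, classify, compare the best error to the threshold."""
    classifiers = {
        "centroid": centroid_classifier,
        "1-nn": nearest_neighbor_classifier,
    }
    history = []
    for _ in range(x):
        private = privatize(data, noise_sd)
        half = len(private) // 2
        train, test = private[:half], private[half:]
        errors = {name: error_rate(build, train, test)
                  for name, build in classifiers.items()}
        best = min(errors, key=errors.get)  # classifier with the lowest error
        history.append((noise_sd, best, errors[best]))
        if errors[best] <= threshold:
            break  # utility is acceptable at this privacy setting
        noise_sd = max(0.0, noise_sd - step)  # assumed adjustment: reduce the noise
    return history

if __name__ == "__main__":
    for noise, name, err in comparative_x_ceg(make_data(), noise_sd=3.0, threshold=0.15):
        print(f"noise_sd={noise:.1f}  best={name}  error={err:.3f}")
```

The returned history is the "range of empirical results" the abstract refers to: each row pairs a privacy parameter setting with the lowest classification error observed at that setting, so a trade-off point can be read off the list rather than fixed in advance.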

Cite

Mivule, K., & Turner, C. (2013). A comparative analysis of data privacy and utility parameter adjustment, using machine learning classification as a gauge. In Procedia Computer Science (Vol. 20, pp. 414–419). Elsevier B.V. https://doi.org/10.1016/j.procs.2013.09.295