Methods to mitigate risk of composition attack in independent data publications

Jiuyong Li; Sarowar A. Sattar; Muzammil M. Baig; Jixue Liu; Raymond Heatherly; Qiang Tang; Bradley Malin

Book Chapter

Springer International Publishing (2015), pp. 179-200

DOI: 10.1007/978-3-319-23633-9_8

Abstract

Data publication is a simple and cost-effective approach for data sharing across organizations. Data anonymization is a central technique in privacy-preserving data publication. Many methods have been proposed to anonymize individual datasets and multiple datasets of the same data publisher. In real life, a dataset is rarely isolated, and two datasets published by two organizations may contain the records of the same individuals. For example, patients might have visited two hospitals for follow-up or specialized treatment regarding a disease, and their records are independently anonymized and published. Although each published dataset poses a small privacy risk, the intersection of two datasets may severely compromise the privacy of the individuals. The attack using the intersection of datasets published by different organizations is called a composition attack. Some research has studied methods for anonymizing data to prevent composition attacks on independent data releases, where one data publisher has no knowledge of the records held by another data publisher. In this chapter, we discuss two exemplar methods, a randomization-based approach and a generalization-based approach, to mitigate the risk of composition attacks. In the randomization method, noise is added to the original values to make it difficult for an adversary to pinpoint an individual’s record in a published dataset. In the generalization method, the records in a group are generalized to the same values on their potentially identifying attributes so that the individuals in the group are indistinguishable. We discuss and experimentally demonstrate the strengths and weaknesses of both types of methods. We also present a mixed data publication framework, where a small proportion of the records are managed and published centrally and the other records are managed and published locally in different organizations, to reduce the risk of composition attacks and improve the overall utility of the data.
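The intersection step behind a composition attack can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the two toy releases, the attribute choices (an age range and a ZIP-code prefix as generalized identifying attributes), and the helper names `candidate_values` and `composition_attack` are all hypothetical. The sketch assumes an adversary who knows a target's age and ZIP code and knows that the target appears in both releases.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical release format: each generalized group is keyed by
# (age_low, age_high, zip_prefix) and maps to the set of sensitive
# values (diagnoses) published for that group.
Release = Dict[Tuple[int, int, str], Set[str]]

# Toy data, invented for illustration only.
hospital_a: Release = {
    (30, 39, "130"): {"flu", "diabetes", "asthma"},
    (40, 49, "130"): {"flu", "cancer", "hepatitis"},
}
hospital_b: Release = {
    (30, 34, "1305"): {"diabetes", "bronchitis", "migraine"},
    (35, 39, "1305"): {"flu", "ulcer", "gout"},
}


def candidate_values(release: Release, age: int, zipcode: str) -> Set[str]:
    """Collect sensitive values from every group that could contain the target."""
    values: Set[str] = set()
    for (age_low, age_high, zip_prefix), sensitive in release.items():
        if age_low <= age <= age_high and zipcode.startswith(zip_prefix):
            values |= sensitive
    return values


def composition_attack(releases: List[Release], age: int, zipcode: str) -> Set[str]:
    """Intersect the candidate sets obtained from each independent release."""
    per_release = [candidate_values(r, age, zipcode) for r in releases]
    if not per_release:
        return set()
    return set.intersection(*per_release)


# Each release alone leaves three candidate diagnoses for the target;
# intersecting the two releases narrows them down.
alone = candidate_values(hospital_a, 32, "13053")
combined = composition_attack([hospital_a, hospital_b], 32, "13053")
print(len(alone), combined)
```

In this toy setting each release on its own hides the target among three diagnoses, while the intersection leaves a single one, which is the kind of risk the randomization and generalization methods discussed in the chapter aim to reduce.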

Cite

Li, J., Sattar, S. A., Baig, M. M., Liu, J., Heatherly, R., Tang, Q., & Malin, B. (2015). Methods to mitigate risk of composition attack in independent data publications. In Medical Data Privacy Handbook (pp. 179–200). Springer International Publishing. https://doi.org/10.1007/978-3-319-23633-9_8
