Privacy preserving data publishing of categorical data through k-anonymity and feature selection

Abstract

In healthcare, there is a vast amount of patient data, which could lead to important discoveries if combined. Due to legal and ethical issues, such data cannot be shared, and hence the information it holds is underused. A new area of research has emerged, called privacy preserving data publishing (PPDP), which aims to share data in a way that preserves privacy while keeping information loss to a minimum. In this Letter, a new anonymisation algorithm for PPDP is proposed, which is based on k-anonymity through pattern-based multidimensional suppression (kPB-MS). The algorithm uses feature selection to reduce the data dimensionality and then combines attribute and record suppression to obtain k-anonymity. Five datasets from different areas of life sciences [retinopathy, Single Photon Emission Computed Tomography imaging, gene sequencing and drug discovery (two datasets)] were anonymised with kPB-MS. The anonymised datasets were evaluated using four different classifiers and, in 74% of the test cases, they yielded similar or better accuracies than the full datasets.
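
To make the idea of obtaining k-anonymity by suppression concrete, the following is a minimal sketch in Python. It is not the kPB-MS algorithm from the Letter: it does no feature selection or pattern-based search, and the attributes to drop are chosen by hand. The function names, the toy records and the attribute names (`age`, `sex`, `zip`, `label`) are all assumptions made for illustration. The sketch only shows the two suppression operations the abstract mentions: dropping whole attributes, then dropping records whose combination of remaining attribute values occurs fewer than k times.

```python
from collections import Counter


def class_sizes(records, quasi_identifiers):
    """Count the records sharing each combination of quasi-identifier values."""
    return Counter(tuple(r[a] for a in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    return all(n >= k for n in class_sizes(records, quasi_identifiers).values())


def suppress_attributes(records, keep):
    """Attribute suppression: keep only the listed attributes in every record."""
    return [{a: r[a] for a in keep} for r in records]


def suppress_records(records, quasi_identifiers, k):
    """Record suppression: drop records whose value combination occurs fewer than k times."""
    sizes = class_sizes(records, quasi_identifiers)
    return [r for r in records
            if sizes[tuple(r[a] for a in quasi_identifiers)] >= k]


# Toy categorical data (hypothetical, for illustration only).
records = [
    {"age": "30-39", "sex": "F", "zip": "1011", "label": "pos"},
    {"age": "30-39", "sex": "F", "zip": "1012", "label": "neg"},
    {"age": "30-39", "sex": "F", "zip": "1013", "label": "pos"},
    {"age": "40-49", "sex": "M", "zip": "1011", "label": "neg"},
    {"age": "40-49", "sex": "M", "zip": "1014", "label": "neg"},
    {"age": "50-59", "sex": "F", "zip": "1015", "label": "pos"},
]

k = 2
# Step 1: attribute suppression. Here "zip" is dropped by hand; the Letter
# instead relies on feature selection to decide which attributes to keep.
reduced = suppress_attributes(records, ["age", "sex", "label"])
# Step 2: record suppression on the remaining quasi-identifiers.
anonymised = suppress_records(reduced, ["age", "sex"], k)

print(is_k_anonymous(records, ["age", "sex", "zip"], k))  # original data
print(is_k_anonymous(anonymised, ["age", "sex"], k))      # after suppression
print(len(records), "->", len(anonymised), "records")
```

In this toy example, dropping one attribute means only a single record has to be suppressed, which illustrates why reducing dimensionality first can limit the number of records lost.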

Aristodimou, A., Antoniades, A., & Pattichis, C. S. (2016). Privacy preserving data publishing of categorical data through k-anonymity and feature selection. Healthcare Technology Letters, 3(1), 16–21. https://doi.org/10.1049/htl.2015.0050
