A quantitative analysis of the performance and scalability of de-identification tools for medical data

Zhiming Liu; Nafees Qamar; Jie Qian

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8315 274-289

DOI: 10.1007/978-3-642-53956-5_18


Abstract

Recent developments in data de-identification technologies offer sophisticated solutions for protecting medical data, especially when the data is to be provided for secondary purposes such as clinical or biomedical research. To determine to what degree an approach, along with its tool, is usable and effective, this paper considers a number of de-identification tools that aim to reduce the re-identification risk for published medical data while preserving its statistical meaning. We evaluate the residual risk of re-identification through an experimental evaluation of the most stable research-based tools, as applied to our Electronic Health Records (EHRs) database, to assess which tool performs better with different quasi-identifiers. Our evaluation criteria are quantitative, as opposed to other descriptive and qualitative assessments. Comparing the individual disclosure risk and information loss of each published dataset, we find that the μ-Argus tool performs better. Also, the generalization method is considerably better than the suppression method in terms of reducing risk and avoiding information loss. We also find that sdcMicro has the best scalability among its counterparts, as observed experimentally on a virtual dataset consisting of 33 variables and 10,000 records.

Citation (APA)

Liu, Z., Qamar, N., & Qian, J. (2014). A quantitative analysis of the performance and scalability of de-identification tools for medical data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8315, pp. 274–289). Springer Verlag. https://doi.org/10.1007/978-3-642-53956-5_18
