Pseudonymization for research data collection: Is the juice worth the squeeze?

Florian Kohlmayer; Ronald Lautenschläger; Fabian Prasser

Journal Article, Open Access

BMC Medical Informatics and Decision Making (2019) 19(1)

DOI: 10.1186/s12911-019-0905-x

Abstract

Background: The collection of data and biospecimens that characterize patients and probands in depth is a core element of modern biomedical research. Such data must be considered highly sensitive and needs to be protected from unauthorized use and re-identification. In this context, laws, regulations, guidelines and best practices often recommend or mandate pseudonymization, which means that directly identifying data of subjects (e.g. names and addresses) is stored separately from the data primarily needed for scientific analyses.

Discussion: When (authorized) re-identification of subjects is not an exceptional but a common procedure, e.g. due to longitudinal data collection, implementing pseudonymization can significantly increase the complexity of software solutions. For example, data stored in distributed databases needs to be dynamically combined, which requires additional interfaces for communication between the various subsystems. This increased complexity may open new attack vectors for intruders, which runs counter to the objective of improving data protection. What is lacking is a standardized process for evaluating and reporting risks, threats and countermeasures that can be used to test whether integrating pseudonymization methods into data collection systems actually improves upon the degree of protection provided by system designs that simply follow common IT security best practices and implement fine-grained role-based access control models. To demonstrate that the methods used to describe systems employing pseudonymized data management are currently heterogeneous and ad hoc, we examined the extent to which twelve recent studies address each of the six basic security properties defined by the International Organization for Standardization (ISO) standard 27000. We show inconsistencies across the studies, with most of them failing to mention one or more security properties.

Conclusion: We discuss the degree of privacy protection provided by integrating pseudonymization into research data collection processes. We conclude that (1) more research is needed on the interplay of pseudonymity, information security and data protection, (2) problem-specific guidelines for evaluating and reporting risks, threats and countermeasures should be developed, and (3) future work on pseudonymized research data collection should include the results of such structured and integrated analyses.
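The separation that the abstract describes, with directly identifying data held apart from analysis data and the two recombined on demand, can be illustrated with a minimal sketch. Nothing in it comes from the article: the class names, the field names, the use of a random token as pseudonym and the boolean authorization flag are all hypothetical, and a real system would place the two stores in separate subsystems behind proper access control.

```python
import secrets


class IdentityStore:
    """Hypothetical store for directly identifying data (names, addresses)."""

    def __init__(self):
        self._by_identity = {}   # (name, address) -> pseudonym
        self._by_pseudonym = {}  # pseudonym -> identifying data

    def pseudonym_for(self, name, address):
        # Return the same pseudonym for a returning subject, so that
        # longitudinal data can be linked without exposing the identity.
        key = (name, address)
        if key not in self._by_identity:
            pseudonym = secrets.token_hex(8)  # random, carries no identity
            self._by_identity[key] = pseudonym
            self._by_pseudonym[pseudonym] = {"name": name, "address": address}
        return self._by_identity[key]

    def reidentify(self, pseudonym, authorized):
        # Assumption: authorization is reduced to a flag for illustration.
        if not authorized:
            raise PermissionError("re-identification requires authorization")
        return self._by_pseudonym[pseudonym]


class ResearchStore:
    """Hypothetical store for analysis data, keyed by pseudonym only."""

    def __init__(self):
        self._records = {}

    def add(self, pseudonym, observation):
        self._records.setdefault(pseudonym, []).append(observation)

    def records(self, pseudonym):
        return list(self._records.get(pseudonym, []))


identities = IdentityStore()
research = ResearchStore()

# First visit: the subject is registered and receives a pseudonym.
p = identities.pseudonym_for("Jane Doe", "1 Example Street")
research.add(p, {"visit": 1, "value": 6.1})

# Follow-up visit: the identity store is consulted again to link the data.
p_again = identities.pseudonym_for("Jane Doe", "1 Example Street")
research.add(p_again, {"visit": 2, "value": 5.9})
```

Even in this toy form, every follow-up visit needs a call across the boundary between the two stores. That is the kind of additional interface the Discussion identifies as a source of complexity and of potential new attack vectors.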

Citation (APA)

Kohlmayer, F., Lautenschläger, R., & Prasser, F. (2019). Pseudonymization for research data collection: Is the juice worth the squeeze? BMC Medical Informatics and Decision Making, 19(1). https://doi.org/10.1186/s12911-019-0905-x