A Markov Chain Replacement Strategy for Surrogate Identifiers: Minimizing Re-Identification Risk While Preserving Text Reuse


This article is free to access.

Abstract

“Hiding in Plain Sight” (HIPS) strategies for Personal Health Information (PHI) replace PHI with surrogate values to hinder re-identification attempts. We evaluate three HIPS strategies for PHI replacement: a standard Consistent replacement strategy, a Random replacement strategy, and a novel Markov model strategy. We evaluate the privacy-preserving benefits of these strategies, and their relative utility for information extraction, on both a simulated PHI distribution and real clinical corpora from two institutions, using a range of false negative error rates (FNER). The Markov strategy consistently outperformed the Consistent and Random substitution strategies, both on real data and in statistical simulations. Across FNER values ranging from 0.1% to 5%, the Markov strategy reduced document-level PHI leakage relative to the standard Consistent substitution strategy from 27.1% to 0.1% and from 94.2% to 57.7%, at 0.1% and 0.5% FNER, respectively. Additionally, we assessed whether the generated corpora containing synthetic PHI could be reused, using a variety of information extraction methods. Results indicate that modern deep learning methods perform similarly under all strategies, but older machine learning techniques can suffer from the change in context. Overall, a Markov surrogate generation strategy substantially reduces the chance of inadvertent PHI release.

Cite

Osborne, J. D., Trotter, A., O’Leary, T., Coffee, C., Cochran, M. D., Mansilla-Gonzalez, L., … Kennedy, R. E. (2025). A Markov Chain Replacement Strategy for Surrogate Identifiers: Minimizing Re-Identification Risk While Preserving Text Reuse. Electronics (Switzerland), 14(19). https://doi.org/10.3390/electronics14193945
