De-identification of Emergency Medical Records in French: Survey and Comparison of State-of-the-Art Automated Systems

Abstract

In France, structured data from emergency room (ER) visits are aggregated at the national level to build a syndromic surveillance system for several health events. For visits motivated by a traumatic event, information on the causes is stored in free-text clinical notes. To exploit these data, an automated de-identification system that guarantees privacy protection is required. In this study, we review the tools available for de-identifying free-text clinical documents in French. A key point is how to overcome the resource barrier that hampers NLP applications in languages other than English. We compare rule-based, named entity recognition, new Transformer-based deep learning, and hybrid systems using, when required, a fine-tuning set of 30,000 unlabeled clinical notes. The evaluation is performed on a test set of 3,000 manually annotated notes. Hybrid systems, which combine capabilities in complementary tasks, show the best performance. This work is a first step toward a national surveillance system based on the exhaustive collection of ER visit reports for automated trauma monitoring.

Cite

Bourdois, L., Avalos-Fernandez, M., Chenais, G., Thiessard, F., Revel, P., Gil-Jardiné, C., & Lagarde, E. (2021). De-identification of Emergency Medical Records in French: Survey and Comparison of State-of-the-Art Automated Systems. In Proceedings of the International Florida Artificial Intelligence Research Society Conference, FLAIRS (Vol. 34). Florida Online Journals, University of Florida. https://doi.org/10.32473/flairs.v34i1.128480
