Before medical records can be shared outside of a hospital or medical group, all of the information that identifies the patient (called protected health information, or PHI) must be removed. In this paper, we examine different methodologies for performing de-identification annotation in order to determine which is most effective at ensuring that all identifying information is removed. We used serial (i.e., multiple annotators working in succession) and parallel (i.e., multiple annotators working independently) annotation paradigms on two different corpora, one unannotated and the other pre-annotated for PHI. Our evaluation revealed that neither annotation paradigm was superior to the other, regardless of whether the corpus was pre-annotated or unannotated.
Stubbs, A., & Uzuner, Ö. (2017). De-identification of Medical Records Through Annotation. In Handbook of Linguistic Annotation (pp. 1433–1459). Springer Netherlands. https://doi.org/10.1007/978-94-024-0881-2_55