Revealing the structure of medical dictations with conditional random fields

Jeremy Jancsary; Johannes Matiasek; Harald Trost

Conference Proceedings, Open Access

EMNLP 2008 - 2008 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: A Meeting of SIGDAT, a Special Interest Group of the ACL (2008) 1-10

DOI: 10.3115/1613715.1613717

13 Citations

83 Readers

Abstract

Automatic processing of medical dictations poses a significant challenge. We approach the problem by introducing a statistical framework capable of identifying types and boundaries of sections, lists and other structures occurring in a dictation, thereby gaining explicit knowledge about the function of such elements. Training data is created semi-automatically by aligning a parallel corpus of corrected medical reports and corresponding transcripts generated via automatic speech recognition. We highlight the properties of our statistical framework, which is based on conditional random fields (CRFs) and implemented as an efficient, publicly available toolkit. Finally, we show that our approach is effective both under ideal conditions and for real-life dictation involving speech recognition errors and speech-related phenomena such as hesitation and repetitions. © 2008 Association for Computational Linguistics.
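The semi-automatic creation of training data described above can be pictured as a label projection across an alignment. The following is a minimal sketch only, not the authors' procedure: it assumes the corrected report is already tokenized and annotated with structural labels, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner. The function name `project_labels`, the label names, and the sample tokens are all hypothetical.

```python
from difflib import SequenceMatcher


def project_labels(report_tokens, report_labels, asr_tokens, default="O"):
    """Copy each report token's label onto the ASR token it aligns with.

    ASR tokens with no aligned counterpart (hesitations, repetitions,
    recognition errors) keep the hypothetical `default` label.
    """
    asr_labels = [default] * len(asr_tokens)
    # autojunk=False disables difflib's popularity heuristic, which could
    # otherwise ignore very frequent tokens in long sequences.
    matcher = SequenceMatcher(a=report_tokens, b=asr_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            asr_labels[block.b + k] = report_labels[block.a + k]
    return asr_labels


# Hypothetical example: a corrected report and a noisier ASR transcript.
report_tokens = ["diagnosis", "pneumonia", "plan", "start", "antibiotics"]
report_labels = ["Heading", "Diagnosis", "Heading", "Plan", "Plan"]
asr_tokens = ["diagnosis", "uh", "pneumonia", "plan", "start", "start", "antibiotics"]

asr_labels = project_labels(report_tokens, report_labels, asr_tokens)
for token, label in zip(asr_tokens, asr_labels):
    print(f"{token}\t{label}")
```

The labeled transcript tokens produced this way could then serve as training sequences for a sequence labeler such as a CRF. An exact-match aligner like this one is a simplification, since it cannot pair a misrecognized word with its corrected form.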

Jancsary, J., Matiasek, J., & Trost, H. (2008). Revealing the structure of medical dictations with conditional random fields. In EMNLP 2008 - 2008 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: A Meeting of SIGDAT, a Special Interest Group of the ACL (pp. 1–10). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1613715.1613717
