Medical speech recognition: Reaching parity with humans

Erik Edwards; Wael Salloum; Greg P. Finley; James Fone; Greg Cardiff; Mark Miller; David Suendermann-Oeft

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10458 LNAI 512-524

DOI: 10.1007/978-3-319-66429-3_51

28 Citations

30 Readers

Abstract

We present a speech recognition system for the medical domain whose architecture is based on a state-of-the-art stack trained on over 270 h of medical speech data and 30 million tokens of text from clinical episodes. Despite the acoustic challenges and linguistic complexity of the domain, we were able to reduce the system’s word error rate to below 16% in a realistic clinical use case. To further benchmark our system, we determined the human word error rate on a corpus covering a wide variety of speakers, working with multiple medical transcriptionists, and found that our speech recognition system performs on a par with humans.
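The abstract reports results as word error rate (WER). As a minimal sketch of the standard definition (word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words), the computation might look like the following. The abstract does not describe the authors' scoring procedure, so the whitespace tokenization, the absence of text normalization, the function name `word_error_rate`, and the example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are obtained by plain whitespace splitting, with no
    case folding or punctuation handling.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the paper's corpus.
    reference = "the patient denies chest pain"
    hypothesis = "the patient denied chest pain"
    print(word_error_rate(reference, hypothesis))
```

Under this definition, a WER below 16% corresponds to fewer than 16 substitutions, insertions, and deletions per 100 reference words. The same function could score a human transcript against a reference in place of the system output, which is one way a human-versus-machine comparison could be set up.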

Citation (APA)

Edwards, E., Salloum, W., Finley, G. P., Fone, J., Cardiff, G., Miller, M., & Suendermann-Oeft, D. (2017). Medical speech recognition: Reaching parity with humans. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 512–524). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_51
