Automatic speech recognition (ASR) is a crucial step in many natural language processing (NLP) applications, since the available data often consists mainly of raw speech. Because the output of the ASR step is treated as meaningful, informative input to later steps in the NLP pipeline, it is important to understand the behavior and failure modes of this step. In this work, we analyze the quality of ASR in the psychotherapy domain, using motivational interviewing conversations between therapists and clients. We conduct domain-agnostic and domain-relevant evaluations using evaluation metrics, and we also identify domain-relevant keywords in the ASR output. Moreover, we empirically study the effect of mixing ASR and manual data when training a downstream NLP model, and we demonstrate how additional local context can help alleviate the errors introduced by noisy ASR transcripts.
Min, D. J., Pérez-Rosas, V., & Mihalcea, R. (2021). Evaluating Automatic Speech Recognition Quality and Its Impact on Counselor Utterance Coding. In Computational Linguistics and Clinical Psychology: Improving Access, CLPsych 2021 - Proceedings of the 7th Workshop, in conjunction with NAACL 2021 (pp. 159–168). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.clpsych-1.18