Comparing speech recognition services for HCI applications in behavioral health

Abstract

Behavioral health conditions such as depression and anxiety are a global concern, and there is growing interest in employing speech technology to screen and monitor patients remotely. Language modeling approaches require automatic speech recognition (ASR), and multiple privacy-compliant ASR services are commercially available. We use a corpus of over 60 hours of speech from a behavioral health task to compare ASR performance across four commercial vendors. We expected similar performance, but found large differences in word error rate (WER) between the top and next-best performer, for both mobile (48% relative WER increase) and laptop (67% relative WER increase) data. These results suggest the importance of benchmarking ASR systems in this domain. Additionally, we find that WER is not systematically related to depression itself. Performance is, however, affected by the diverse audio quality of users' personal devices, and possibly by the overall style of speech in this domain.
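
The abstract reports vendor differences as relative WER increases. As a minimal sketch of how such a figure can be computed, the Python below uses the standard word-level edit distance definition of WER and the usual relative-difference formula. The function names and the example values are hypothetical illustrations, not the authors' scoring pipeline or data, and the sketch assumes transcripts have already been normalized (case, punctuation) before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Computed as the word-level Levenshtein distance between the two
    whitespace-tokenized transcripts, divided by the reference length.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis is i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_increase(best_wer: float, other_wer: float) -> float:
    """Relative increase of other_wer over best_wer, as a fraction."""
    if best_wer <= 0:
        raise ValueError("best_wer must be positive")
    return (other_wer - best_wer) / best_wer


# Hypothetical transcripts: one substitution ("sat" -> "sit") and one
# deletion (the second "the") against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Hypothetical WERs for two vendors (not values from the paper).
increase = relative_wer_increase(best_wer=0.10, other_wer=0.15)
```

Under this definition, a relative increase of 0.5 means the second system makes half again as many word errors per reference word as the best one. In practice WER is usually pooled over a whole corpus (total errors divided by total reference words) rather than averaged per utterance, which is a choice this sketch leaves open.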

Cite

Chlebek, P., Shriberg, E., Lu, Y., Rutowski, T., Harati, A., & Oliveira, R. (2020). Comparing speech recognition services for HCI applications in behavioral health. In UbiComp/ISWC 2020 Adjunct - Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers (pp. 483–487). Association for Computing Machinery. https://doi.org/10.1145/3410530.3414372
