Generalization of Deep Acoustic and NLP Models for Large-Scale Depression Screening

Abstract

Depression is a costly and underdiagnosed global health concern, and there is a great need for improved patient screening. Speech technology offers promise for remote screening, but must perform robustly across patient and environmental variables. This chapter describes two deep learning models that achieve excellent performance in this regard. An acoustic model uses transfer learning from an automatic speech recognition (ASR) task. A natural language processing (NLP) model uses transfer learning from a language modeling task. Both models are studied using data from over 10,000 unique users who interacted with human-machine applications using conversational speech. Results for binary classification on a large test set show AUC performance of 0.79 and 0.83 for the acoustic and NLP models, respectively. RMSE for a regression task is 4.70 for the acoustic model and 4.27 for the NLP model. Further analysis of performance as a function of test subset characteristics indicates that the models are generally robust over speaker and session variables. It is concluded that both acoustic and NLP-based models have potential for use in generalized automated depression screening.
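
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUC (for binary classification) and RMSE (for regression) can be computed. It is not the authors' evaluation code: the function names `roc_auc` and `rmse` and all of the toy labels, scores, and severity values are hypothetical and chosen only for illustration.

```python
import math


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rmse(targets, predictions):
    """Root-mean-square error between true and predicted values."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data, not taken from the chapter.
labels = [1, 0, 1, 0, 1, 0]              # 1 = positive class, 0 = negative class
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]  # model scores for the positive class
severity_true = [10, 4, 15, 7]           # made-up regression targets
severity_pred = [12, 3, 11, 7]           # made-up model predictions

auc_value = roc_auc(labels, scores)
rmse_value = rmse(severity_true, severity_pred)
print(f"AUC = {auc_value:.3f}, RMSE = {rmse_value:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, while RMSE is expressed in the same units as the regression target, so lower values indicate closer predictions.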

Cite

Harati, A., Rutowski, T., Lu, Y., Chlebek, P., Oliveira, R., Shriberg, E., & Lin, D. (2022). Generalization of Deep Acoustic and NLP Models for Large-Scale Depression Screening. In Biomedical Sensing and Analysis: Signal Processing in Medicine and Biology (pp. 99–132). Springer International Publishing. https://doi.org/10.1007/978-3-030-99383-2_3
