We present preliminary work to determine whether patients' vocal acoustic, linguistic, and facial patterns can predict clinical ratings of depression severity, namely scores on the eight-item Patient Health Questionnaire depression scale (PHQ-8). We propose a multi-modal fusion model that combines features from three modalities: audio, video, and text. Trained on the AVEC2017 dataset, the proposed model outperforms each single-modality prediction model and surpasses the dataset baseline by a clear margin.
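The abstract does not say how the three modalities are fused, so the following is only an illustrative sketch of one common option, late fusion by weighted averaging of per-modality predictions. It is not the authors' method. The function name `fuse_predictions`, the weights, and the example values are all hypothetical; the only fact taken from the PHQ-8 itself is the score range of 0 to 24 (eight items scored 0 to 3 each).

```python
from typing import Dict

# PHQ-8 totals range from 0 to 24 (eight items, each scored 0-3).
PHQ8_MIN, PHQ8_MAX = 0.0, 24.0


def fuse_predictions(preds: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-modality PHQ-8 predictions, clipped to [0, 24].

    Hypothetical helper for illustration; not taken from the paper.
    """
    if set(preds) != set(weights):
        raise ValueError("preds and weights must cover the same modalities")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(preds[m] * weights[m] for m in preds) / total
    # Keep the fused score inside the valid PHQ-8 range.
    return min(PHQ8_MAX, max(PHQ8_MIN, fused))


# Illustrative (made-up) single-modality outputs for one participant.
example_preds = {"audio": 8.0, "video": 6.0, "text": 10.0}
# Assumed weights; in practice these might be tuned on a validation split.
example_weights = {"audio": 1.0, "video": 1.0, "text": 2.0}
fused_score = fuse_predictions(example_preds, example_weights)
```

In a setup like this, comparing the fused score against each single-modality prediction on held-out data is what would show whether fusion helps, which is the kind of comparison the abstract reports.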
Samareh, A., Jin, Y., Wang, Z., Chang, X., & Huang, S. (2018). Predicting depression severity by multi-modal feature engineering and fusion. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 8147–8148). AAAI Press. https://doi.org/10.1609/aaai.v32i1.12152