One of the major obstacles to naturalistic human affective computing is that emotions are complex constructs with fuzzy boundaries and substantial individual variation. An important issue in emotion analysis is therefore generating a person-specific representation of emotion in an unsupervised manner. This paper presents a fully unsupervised method combining an autoencoder with Principal Component Analysis to build an emotion representation from speech signals. As each person has a different way of expressing emotions, the method is applied at the subject level. We also investigate the relevance of the resulting representation. Experiments on the Emo-DB, IEMOCAP, and SEMAINE databases show that the proposed representation of emotion is invariant across subjects and similar to the representation built by psychologists, especially on the arousal dimension.
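The pipeline the abstract describes can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the actual architecture, speech features, and training details differ, and all names and dimensions below are assumptions. Per subject, an autoencoder compresses speech features into a latent code, then PCA orients that latent space along its principal axes:

```python
# Illustrative sketch only: a tiny linear autoencoder trained on one
# subject's speech features, followed by PCA on the latent codes.
# Features, dimensions, and hyperparameters are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, n_latent=2, lr=0.01, epochs=500):
    """Train a tiny linear autoencoder X -> z -> X_hat by gradient descent."""
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, n_latent))
    W_dec = rng.normal(scale=0.1, size=(n_latent, d))
    for _ in range(epochs):
        Z = X @ W_enc              # encode features into the latent space
        X_hat = Z @ W_dec          # decode back to feature space
        err = X_hat - X            # reconstruction error
        # gradients of the mean squared reconstruction loss
        g_dec = Z.T @ err / n
        g_enc = X.T @ (err @ W_dec.T) / n
        W_dec -= lr * g_dec
        W_enc -= lr * g_enc
    return W_enc

def pca_axes(Z):
    """Principal axes of the latent codes via SVD of the centered matrix."""
    Zc = Z - Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Vt                      # rows are orthonormal principal directions

# Toy stand-in for one subject's frame-level acoustic features.
X = rng.normal(size=(200, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each feature

W_enc = train_autoencoder(X)
Z = X @ W_enc                      # subject-specific latent emotion codes
axes = pca_axes(Z)                 # axes one could inspect against e.g. arousal
print(Z.shape, axes.shape)
```

In this setup the PCA step makes the learned latent space interpretable: each principal axis captures a dominant direction of variation in the subject's codes, which the paper compares against psychological dimensions such as arousal.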
Wang, S., Soladié, C., & Séguier, R. (2020). Learning an Unsupervised and Interpretable Representation of Emotion from Speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12335 LNAI, pp. 636–645). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60276-5_61