In the area of speech emotion recognition, hand-engineered features are traditionally used as input. However, this requires an additional feature-extraction step before prediction, as well as prior knowledge to select the feature set. Recent research has therefore focused on approaches that predict emotions directly from the speech signal, reducing the effort required for feature extraction and increasing the performance of the emotion recognition system. While this approach has been applied to the prediction of categorical emotions, studies on the prediction of continuous dimensional emotions are still rare. This paper presents a method for time-continuous prediction of emotions from speech using spectrograms. The proposed model comprises a convolutional neural network (CNN) and a recurrent neural network with long short-term memory (RNN-LSTM). The hyperparameters of the CNN are investigated to improve the performance of the model. After finding the optimal hyperparameters, the performance of the system with waveform and spectrogram inputs is compared in terms of the concordance correlation coefficient (CCC). The proposed method outperforms the end-to-end emotion recognition system based on the raw waveform and achieves a CCC of 0.722 when predicting arousal on the RECOLA database.
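The concordance correlation coefficient used as the evaluation metric above can be computed from the means, variances, and covariance of the predicted and reference time series. As a minimal sketch (using the standard definition of CCC, not code from the paper itself), it might look like:

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance correlation coefficient between two 1-D series.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    It penalizes both low correlation and a shift/scale mismatch,
    which is why it is preferred over plain Pearson correlation for
    time-continuous emotion prediction.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mt, mp = y_true.mean(), y_pred.mean()
    vt, vp = y_true.var(), y_pred.var()  # population variances (ddof=0)
    cov = np.mean((y_true - mt) * (y_pred - mp))
    return 2.0 * cov / (vt + vp + (mt - mp) ** 2)
```

A perfect prediction (identical series) gives a CCC of 1.0, while a constant offset between prediction and reference lowers the score even when the Pearson correlation is 1.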
CITATION STYLE
Fedotov, D., Kim, B., Karpov, A., & Minker, W. (2019). Time-continuous emotion recognition using spectrogram based CNN-RNN modelling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11658 LNAI, pp. 93–102). Springer Verlag. https://doi.org/10.1007/978-3-030-26061-3_10