The University of Passau open emotion recognition system for the multimodal emotion challenge


Abstract

This paper presents the University of Passau’s approaches for the Multimodal Emotion Recognition Challenge 2016. For audio signals, we exploit Bag-of-Audio-Words techniques combined with Extreme Learning Machines and Hierarchical Extreme Learning Machines. For video signals, we use not only the information from the cropped face in a video frame, but also the broader contextual information from the entire frame. This information is extracted via two Convolutional Neural Networks pre-trained for face detection and object classification. Moreover, we extract facial action units, which reflect facial muscle movements and are known to be important for emotion recognition. Long Short-Term Memory Recurrent Neural Networks are deployed to exploit temporal information in the video representation. Average late fusion of the audio and video systems is applied to make predictions for multimodal emotion recognition. Experimental results on the challenge database demonstrate the effectiveness of our proposed systems when compared to the baseline.
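The average late fusion mentioned above can be sketched as follows: each modality's model emits per-class scores, and the fused prediction is the class with the highest weighted average. This is a minimal illustration only; the class set, scores, and equal weights are assumptions, not values from the paper.

```python
def average_late_fusion(audio_probs, video_probs, w_audio=0.5, w_video=0.5):
    """Fuse per-class scores from two modality-specific models.

    Element-wise weighted average of the audio and video score vectors,
    returning the index of the winning class and the fused scores.
    """
    fused = [w_audio * a + w_video * v for a, v in zip(audio_probs, video_probs)]
    # Predicted emotion is the class with the highest fused score.
    return max(range(len(fused)), key=fused.__getitem__), fused

# Hypothetical per-class scores, e.g. over (angry, happy, sad, neutral).
audio = [0.1, 0.6, 0.2, 0.1]
video = [0.2, 0.3, 0.4, 0.1]
label, fused = average_late_fusion(audio, video)
# label == 1, fused == [0.15, 0.45, 0.3, 0.1]
```

With equal weights this reduces to a simple mean of the two score vectors; unequal weights let one modality dominate when it is known to be more reliable.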

Citation (APA)

Deng, J., Cummins, N., Han, J., Xu, X., Ren, Z., Pandit, V., … Schuller, B. (2016). The University of Passau open emotion recognition system for the multimodal emotion challenge. In Communications in Computer and Information Science (Vol. 663, pp. 652–666). Springer Verlag. https://doi.org/10.1007/978-981-10-3005-5_54
