Multiple classifier systems for the classification of audio-visual emotional states

Abstract

Research activities in the field of human-computer interaction increasingly address the integration of some form of emotional intelligence. Human emotions are expressed through different modalities such as speech, facial expressions, and hand or body gestures; the classification of human emotions should therefore be treated as a multimodal pattern recognition problem. The aim of this paper is to investigate multiple classifier systems that utilize audio and visual features to classify human emotional states. To this end, a variety of features has been derived: from the audio signal, the fundamental frequency, LPC and MFCC coefficients, and RASTA-PLP features have been used. In addition, two types of visual features have been computed, namely form and motion features of intermediate complexity. The numerical evaluation has been performed on the four emotional labels Arousal, Expectancy, Power, and Valence, as defined in the AVEC data set. As classifier architectures, multiple classifier systems are applied; these have been shown to be accurate and robust against missing and noisy data. © 2011 Springer-Verlag.
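
The abstract names two core ingredients: per-modality feature extraction (e.g. MFCCs from the audio signal) and the fusion of several base classifiers into a multiple classifier system. The sketch below illustrates that combination in miniature. It is not the authors' pipeline: the synthetic signal, the binarized label, and the choice of base classifiers and soft-voting fusion are illustrative assumptions, made here only to show the general technique using librosa and scikit-learn.

```python
# A minimal sketch, assuming librosa and scikit-learn; the synthetic signal,
# binary labels, and choice of base classifiers are illustrative assumptions,
# not the configuration used in the paper.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
sr = 16000

# Stand-in audio signal (2 s of noise); in practice this would be an
# utterance from the corpus. 13 MFCCs are averaged over time to obtain
# one fixed-length feature vector per utterance.
y = rng.standard_normal(2 * sr)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
x = mfcc.mean(axis=1)

# Toy training data standing in for utterance-level features with a
# binarized label (e.g. high/low Arousal).
X_train = rng.standard_normal((200, 13))
y_train = rng.integers(0, 2, size=200)

# Multiple classifier system: three heterogeneous base classifiers fused
# by soft voting, i.e. averaging their class-probability estimates.
mcs = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True)),
        ("rf", RandomForestClassifier(n_estimators=100)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",
)
mcs.fit(X_train, y_train)
print(mcs.predict(x[np.newaxis, :]))  # fused prediction for the utterance
```

Soft voting is only one of many fusion rules; the robustness to missing and noisy data claimed in the abstract stems from the general idea that errors of diverse base classifiers tend not to coincide.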

Citation (APA)

Glodek, M., Tschechne, S., Layher, G., Schels, M., Brosch, T., Scherer, S., … Schwenker, F. (2011). Multiple classifier systems for the classification of audio-visual emotional states. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6975 LNCS, pp. 359–368). https://doi.org/10.1007/978-3-642-24571-8_47
