This paper introduces Tandem Modelling (TM), a new model for emotion recognition from videos. The core of the proposed system is a hybrid neural network that joins two feed-forward neural networks through a bottleneck layer (BNL). Specifically, the appearance and motion of each video sequence are encoded using a hand-crafted spatio-temporal descriptor. The resulting features are propagated through a not-fully-connected network (NFCN), and new tandem features are generated from the BNL. At a second level, a fully connected network (FCN) is trained on these tandem features to classify each sequence as one of the six basic emotional states (anger, disgust, fear, happiness, sadness and surprise) or the neutral state. The classification results obtained with the proposed TM are superior to those of state-of-the-art approaches.
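The two-stage data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the random sparsity mask standing in for the "not fully connected" layer, the tanh activations, and all function names are assumptions made for illustration. Weights are random and untrained, so only the shapes and the order of operations are meaningful.

```python
import math
import random

EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "sadness", "surprise", "neutral"]


def make_layer(n_in, n_out, rng, keep_prob=1.0):
    """Create random weights; keep_prob < 1 zeroes some connections.

    The random mask is an assumed stand-in for a not-fully-connected layer.
    """
    weights = [[rng.gauss(0.0, 0.1) if rng.random() < keep_prob else 0.0
                for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layer, x, activation=math.tanh):
    """Apply one dense layer followed by an element-wise activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def build_tandem_model(n_features, n_hidden=32, n_bottleneck=8, seed=0):
    """Assemble both stages; all sizes here are hypothetical."""
    rng = random.Random(seed)
    return {
        # Stage 1: sparse network ending in the bottleneck layer (BNL).
        "nfcn_hidden": make_layer(n_features, n_hidden, rng, keep_prob=0.5),
        "bottleneck": make_layer(n_hidden, n_bottleneck, rng),
        # Stage 2: fully connected classifier fed with tandem features.
        "fcn_hidden": make_layer(n_bottleneck, n_hidden, rng),
        "fcn_out": make_layer(n_hidden, len(EMOTIONS), rng),
    }


def tandem_features(model, descriptor):
    """Stage 1: map a descriptor vector to bottleneck (tandem) features."""
    hidden = forward(model["nfcn_hidden"], descriptor)
    return forward(model["bottleneck"], hidden)


def classify(model, descriptor):
    """Stage 2: classify tandem features into one of seven states."""
    hidden = forward(model["fcn_hidden"], tandem_features(model, descriptor))
    scores = forward(model["fcn_out"], hidden, activation=lambda v: v)
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


# Placeholder 64-dimensional vector standing in for a spatio-temporal descriptor.
descriptor = [random.Random(1).random() for _ in range(64)]
model = build_tandem_model(n_features=64)
label, probs = classify(model, descriptor)
```

In a trained system, the first network would be fitted before its bottleneck activations are extracted, and the second network would then be trained on those activations; both training steps are omitted here.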
CITATION STYLE
Kasraoui, S., Lachiri, Z., & Madani, K. (2019). Tandem Modelling Based Emotion Recognition in Videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11507 LNCS, pp. 325–336). Springer Verlag. https://doi.org/10.1007/978-3-030-20518-8_28