Multi-Attention Module for Dynamic Facial Emotion Recognition

Abstract

Video-based dynamic facial emotion recognition (FER) is a challenging task, as one must capture and distinguish the subtle facial movements that signal emotional changes while ignoring identity-related facial differences between subjects. Recent state-of-the-art studies have usually adopted increasingly complex methods, such as large-scale deep learning models or multimodal analysis built from multiple sub-models. Given the characteristics of the FER task and the shortcomings of existing methods, in this paper we propose a lightweight method and design three attention modules that can be flexibly inserted into the backbone network. Key information along the spatial, channel, and temporal dimensions is extracted via convolution layers, pooling layers, a multi-layer perceptron (MLP), and related operations, from which attention weights are generated. By sharing parameters at the same level, the three modules add few network parameters while strengthening the focus on salient facial regions, on effective feature information in static images, and on key frames. Experimental results on the CK+ and eNTERFACE'05 datasets show that the method achieves higher accuracy.
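
The abstract does not specify the exact layer configuration, so the following is only a minimal PyTorch sketch of what spatial, channel, and temporal attention modules of this kind could look like (a CBAM-style channel/spatial pair plus frame-level temporal weighting). The class names and hyperparameters (reduction=16, kernel_size=7, hidden=128) are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Channel attention: global avg/max pooling followed by a shared MLP
    that produces per-channel weights (illustrative sketch)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                          # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))         # pooled descriptors -> MLP
        mx = self.mlp(x.amax(dim=(2, 3)))
        w = torch.sigmoid(avg + mx).view(x.size(0), -1, 1, 1)
        return x * w


class SpatialAttention(nn.Module):
    """Spatial attention: pool across channels, then a convolution produces a
    per-pixel weight map emphasizing emotion-relevant facial regions."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                          # x: (B, C, H, W)
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))


class TemporalAttention(nn.Module):
    """Temporal attention: score each frame's feature vector with a small MLP
    so informative (key) frames receive larger weights."""
    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):                          # x: (B, T, D) per-frame features
        w = torch.softmax(self.score(x), dim=1)    # (B, T, 1) frame weights
        return (w * x).sum(dim=1)                  # (B, D) video-level feature


if __name__ == "__main__":
    frames = torch.randn(2, 16, 64, 28, 28)        # toy (B, T, C, H, W) backbone features
    ca, sa, ta = ChannelAttention(64), SpatialAttention(), TemporalAttention(64)
    b, t, c, h, w = frames.shape
    x = sa(ca(frames.view(b * t, c, h, w)))        # channel + spatial attention per frame
    x = x.view(b, t, c, h, w).mean(dim=(3, 4))     # spatial pooling to per-frame vectors
    print(ta(x).shape)                             # torch.Size([2, 64])
```

In this sketch the channel and spatial modules are applied frame by frame inside the backbone, while the temporal module aggregates the resulting per-frame vectors into a single video-level representation; how the paper actually places and shares the modules may differ.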

Citation (APA)

Zhi, J., Song, T., Yu, K., Yuan, F., Wang, H., Hu, G., & Yang, H. (2022). Multi-Attention Module for Dynamic Facial Emotion Recognition. Information (Switzerland), 13(5). https://doi.org/10.3390/info13050207
