Automatically predicting and understanding human emotional reactions has wide applications in human-computer interaction. In this paper, we present our solutions to the MuSe-Reaction sub-challenge in MuSe 2022. The task of this sub-challenge is to predict the intensity of 7 emotional expressions from human reactions to a wide range of emotionally evocative stimuli. Specifically, we design an end-to-end model composed of a Spatio-Temporal Transformer for dynamic facial representation learning and a multi-label graph convolutional network for emotion dependency modeling. We also explore the effects of a temporal model with a variety of features from the acoustic and visual modalities. Our proposed method achieves a mean Pearson's correlation coefficient of 0.3375 on the test set of MuSe-Reaction, which outperforms the baseline system (0.2801) by a large margin.
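The reported metric, a mean Pearson's correlation coefficient over the emotion classes, can be illustrated with a minimal sketch. It assumes that one coefficient is computed per emotion class across all samples and that the per-class values are then averaged with equal weight; the function names `pearson` and `mean_pearson` are hypothetical, and the official challenge evaluation script may differ in details.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined when either sequence is constant.
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def mean_pearson(y_true: Sequence[Sequence[float]],
                 y_pred: Sequence[Sequence[float]]) -> float:
    """Average the per-class Pearson coefficients.

    Rows are samples and columns are emotion classes (7 in MuSe-Reaction).
    Equal weighting of classes is an assumption of this sketch.
    """
    true_cols = list(zip(*y_true))  # transpose: one tuple per class
    pred_cols = list(zip(*y_pred))
    if len(true_cols) != len(pred_cols):
        raise ValueError("y_true and y_pred must have the same number of classes")
    scores = [pearson(t, p) for t, p in zip(true_cols, pred_cols)]
    return sum(scores) / len(scores)
```

Called as `mean_pearson(labels, predictions)` with two lists of per-sample intensity vectors, the function returns a single score in the range [-1, 1], the same scale as the 0.3375 and 0.2801 values quoted above.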
CITATION STYLE
Wang, K., Lian, Z., Sun, L., Liu, B., Tao, J., & Fan, Y. (2022). Emotional Reaction Analysis based on Multi-Label Graph Convolutional Networks and Dynamic Facial Expression Recognition Transformer. In MuSe 2022 - Proceedings of the 3rd International Multimodal Sentiment Analysis Workshop and Challenge (pp. 75–80). Association for Computing Machinery, Inc. https://doi.org/10.1145/3551876.3554810