In this paper, we present our solution to the MuSe-Stress sub-challenge of the MuSe 2022 Multimodal Sentiment Analysis Challenge. The task of MuSe-Stress is to predict time-continuous values (i.e., physiological arousal and valence) from multimodal data comprising audio, visual, text, and physiological signals. In this competition, we found that multimodal fusion performs well for physiological arousal on the validation set but generalizes poorly to the test set. We believe this problem stems from overfitting caused by the model's over-reliance on certain modality-specific features. To address it, we propose Multimodal Temporal Attention (MMTA), which considers the temporal effects of all modalities on each unimodal branch, realizing interaction between the unimodal branches and an adaptive inter-modal balance. On the test set, the concordance correlation coefficients (CCC) for physiological arousal and valence are 0.6818 with MMTA and 0.6841 with early fusion, respectively, both ranking Top 1 and outperforming the baseline system (0.4761 and 0.4931) by a large margin.
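The paper itself defines MMTA; as a rough illustrative sketch only (not the authors' implementation — the function names, feature shapes, and single-head design here are assumptions), the core idea of one unimodal branch attending over the pooled time steps of all modalities can be expressed with scaled dot-product attention:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def temporal_attention(query, keys, values):
    """Scaled dot-product attention of one query vector over a sequence."""
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    weights = softmax(scores)  # weights sum to 1 across all time steps
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

def mmta_step(branch_feat, all_modal_feats):
    """Hypothetical MMTA step: a unimodal branch feature attends over the
    time steps of every modality, so all modalities can influence it."""
    keys = [f for modal in all_modal_feats for f in modal]
    return temporal_attention(branch_feat, keys, keys)
```

Because the attention weights form a convex combination, each branch's update is adaptively balanced across modalities rather than dominated by its own features, which matches the overfitting motivation described in the abstract.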
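The evaluation metric, the concordance correlation coefficient, has a standard closed form; a minimal stdlib-only implementation (the function name `ccc` is ours) is:

```python
from statistics import fmean

def ccc(x, y):
    """Concordance correlation coefficient (population form):
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)
```

Unlike plain Pearson correlation, CCC also penalizes mean and scale shifts between the prediction and the gold signal, which is why it is the standard metric for time-continuous emotion prediction: a perfectly correlated but offset prediction scores below 1.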
Citation: He, Y., Sun, L., Lian, Z., Liu, B., Tao, J., Wang, M., & Cheng, Y. (2022). Multimodal Temporal Attention in Sentiment Analysis. In MuSe 2022 - Proceedings of the 3rd International Multimodal Sentiment Analysis Workshop and Challenge (pp. 61–66). Association for Computing Machinery. https://doi.org/10.1145/3551876.3554811