Abstract
Multimodal sentiment analysis is an increasingly popular research area that extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both of these dynamics end-to-end. The proposed approach is tailored to the volatile nature of spoken language in online videos, as well as the accompanying gestures and voice. In our experiments, the model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.
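As context for the fusion operation named in the abstract: the Tensor Fusion layer combines the three unimodal embeddings through a 3-fold outer product, with a constant 1 appended to each embedding so the fused tensor preserves unimodal, bimodal, and trimodal interaction terms. The NumPy sketch below illustrates this idea; the function name `tensor_fusion` and the embedding dimensions are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def tensor_fusion(h_language, h_visual, h_acoustic):
    """Fuse three unimodal embeddings via a 3-fold outer product.

    Appending a constant 1 to each embedding means the resulting
    tensor contains the original unimodal features, all pairwise
    (bimodal) products, and the trimodal products -- i.e., both the
    intra-modality and inter-modality dynamics.
    """
    zl = np.concatenate([h_language, [1.0]])
    zv = np.concatenate([h_visual, [1.0]])
    za = np.concatenate([h_acoustic, [1.0]])
    # Outer product over all three axes: shape (dl+1, dv+1, da+1)
    fused = np.einsum('i,j,k->ijk', zl, zv, za)
    return fused.reshape(-1)  # flatten for a downstream classifier

# Hypothetical embedding sizes, chosen only for the example
h_l = np.random.randn(128)   # language subnetwork output
h_v = np.random.randn(32)    # visual subnetwork output
h_a = np.random.randn(32)    # acoustic subnetwork output
z = tensor_fusion(h_l, h_v, h_a)
print(z.shape)  # (129 * 33 * 33,) = (140481,)
```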
Citation
Zadeh, A., Chen, M., Cambria, E., Poria, S., & Morency, L.-P. (2017). Tensor fusion network for multimodal sentiment analysis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1103–1114). Association for Computational Linguistics. https://doi.org/10.18653/v1/d17-1115