ConKI: Contrastive Knowledge Injection for Multimodal Sentiment Analysis

18 citations · 17 readers (Mendeley)

Abstract

Multimodal Sentiment Analysis leverages multimodal signals to detect a speaker's sentiment. Previous approaches concentrate on multimodal fusion and representation learning based on the general knowledge captured by pretrained models, neglecting the effect of domain-specific knowledge. In this paper, we propose Contrastive Knowledge Injection (ConKI) for multimodal sentiment analysis, in which specific-knowledge representations for each modality are learned together with general-knowledge representations via knowledge injection based on an adapter architecture. In addition, ConKI uses a hierarchical contrastive learning procedure performed between knowledge types within every single modality, across modalities within each sample, and across samples, to facilitate effective learning of the proposed representations and hence improve multimodal sentiment predictions. Experiments on three popular multimodal sentiment analysis benchmarks show that ConKI outperforms all prior methods on a variety of performance metrics.
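
To make the two ingredients concrete, the sketch below (ours, not the authors' released code) shows a bottleneck adapter that derives a specific-knowledge representation from a general-knowledge feature, and an InfoNCE-style contrastive loss standing in for one level of ConKI's hierarchical objective. PyTorch is assumed, and all names, dimensions, and the exact loss form are illustrative assumptions rather than the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeAdapter(nn.Module):
    """Bottleneck adapter (hypothetical sizes): injects domain-specific
    knowledge on top of a frozen general-knowledge feature."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the general knowledge intact while
        # the bottleneck path adds a specific-knowledge component.
        return x + self.up(F.relu(self.down(x)))

def contrastive_loss(anchor: torch.Tensor,
                     positive: torch.Tensor,
                     negatives: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss used here as a generic stand-in for one level
    of ConKI's hierarchical contrastive procedure (an assumption, not
    the paper's exact objective)."""
    a = F.normalize(anchor, dim=-1)      # (B, D)
    p = F.normalize(positive, dim=-1)    # (B, D)
    n = F.normalize(negatives, dim=-1)   # (K, D)
    pos = (a * p).sum(-1, keepdim=True) / temperature  # (B, 1)
    neg = a @ n.t() / temperature                      # (B, K)
    logits = torch.cat([pos, neg], dim=1)
    # The positive similarity sits at index 0 of each row of logits.
    labels = torch.zeros(a.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Toy usage with made-up shapes: batch of 8, feature dim 128.
general = torch.randn(8, 128)            # e.g., pretrained-encoder output
adapter = KnowledgeAdapter(128)
specific = adapter(general)              # injected specific knowledge
loss = contrastive_loss(specific, general, torch.randn(16, 128))

In the actual method, which representations serve as positives and negatives depends on the level of the hierarchy (within a modality, across modalities of one sample, or across samples); this sketch only demonstrates the mechanics of one such pairwise term.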

Citation (APA)

Yu, Y., Zhao, M., Qi, S. A., Sun, F., Wang, B., Guo, W., … Niu, D. (2023). ConKI: Contrastive Knowledge Injection for Multimodal Sentiment Analysis. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 13610–13624). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.860
