Debiasing Multimodal Sarcasm Detection with Contrastive Learning

Abstract

Despite the commendable achievements of existing work, prevailing multimodal sarcasm detection studies rely more on textual content than on visual information. This inevitably induces spurious correlations between textual words and labels, significantly hindering the models’ generalization capability. To address this problem, we define the task of out-of-distribution (OOD) multimodal sarcasm detection, which evaluates models’ generalizability when the word distribution differs between training and testing. Moreover, we propose a novel debiasing multimodal sarcasm detection framework with contrastive learning, which aims to mitigate the harmful effect of biased textual factors and achieve robust OOD generalization. In particular, we first design counterfactual data augmentation to construct positive samples with dissimilar word biases and negative samples with similar word biases. Subsequently, we devise an adapted debiasing contrastive learning mechanism that empowers the model to learn robust task-relevant features and alleviates the adverse effect of biased words. Extensive experiments show the superiority of the proposed framework.
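
The contrastive objective described above can be illustrated with a minimal sketch: an InfoNCE-style loss that pulls each multimodal sample toward its counterfactual positive (same sarcasm label, dissimilar word bias) and pushes it away from negatives that share its biased words. The function name, tensor shapes, and temperature value below are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Hedged sketch of a debiasing contrastive loss in PyTorch.
# Positives are counterfactually augmented views (label-preserving,
# bias-breaking); negatives share the biased words with the anchor.
import torch
import torch.nn.functional as F


def debiasing_contrastive_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE-style loss (illustrative, not the paper's exact code).

    anchor:    (B, D) multimodal embeddings of the original samples
    positive:  (B, D) embeddings of counterfactual positives
               (same sarcasm label, dissimilar word bias)
    negatives: (B, K, D) embeddings of K negatives per anchor that
               share the biased words but carry a different label
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Cosine similarity between each anchor and its positive: (B,)
    pos_sim = (anchor * positive).sum(dim=-1) / temperature
    # Similarity between each anchor and its K negatives: (B, K)
    neg_sim = torch.einsum("bd,bkd->bk", anchor, negatives) / temperature

    # Cross-entropy where the positive sits at index 0 of the logits
    logits = torch.cat([pos_sim.unsqueeze(1), neg_sim], dim=1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```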

Citation (APA)

Jia, M., Xie, C., & Jing, L. (2024). Debiasing Multimodal Sarcasm Detection with Contrastive Learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, pp. 18354–18362). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v38i16.29795
