Distilling knowledge in causal inference for unbiased visual question answering

Abstract

Current Visual Question Answering (VQA) models mainly exploit statistical correlations between questions and answers, and therefore fail to capture the relationship between visual information and answers. Their performance drops dramatically when the test data are distributed differently from the training data. To this end, this paper proposes a novel unbiased VQA model based on Causal Inference with Knowledge Distillation (CIKD) to reduce the influence of bias. Specifically, a causal graph is first constructed to explore counterfactual causality and to infer the causal target from the causal effect, which reduces the bias from questions and obtains answers without training. Knowledge distillation is then used to transfer the knowledge of the inferred causal target to a conventional VQA model, which enables the proposed method to handle both biased and standard data. To address harmful bias introduced by knowledge distillation, ensemble learning is introduced based on the hypothesized cause of the bias. Experiments show significant improvements over state-of-the-art methods on the VQA-CP v2 dataset, which validates the contributions of this work.
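
The abstract does not give the model's equations, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes the causal target can be approximated by subtracting a question-only prediction from the fused (image plus question) prediction, a common counterfactual debiasing heuristic, and that distillation minimizes the KL divergence from that target to the student model's output. The function names, the subtraction rule, and the toy logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def causal_target(fused_logits, question_only_logits):
    """Assumed debiasing rule (not from the paper): remove the
    question-only contribution from the fused prediction, then
    normalize the remaining effect into a distribution."""
    effect = [f - q for f, q in zip(fused_logits, question_only_logits)]
    return softmax(effect)


def distillation_loss(student_logits, teacher_probs):
    """KL(teacher || student): zero when the student's distribution
    matches the teacher target, positive otherwise."""
    student_probs = softmax(student_logits)
    return sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )


# Toy example with three candidate answers (indices 0, 1, 2).
fused = [2.0, 1.5, 0.1]          # fused branch prefers answer 0
question_only = [1.8, 0.2, 0.1]  # the language prior alone also prefers answer 0
teacher = causal_target(fused, question_only)

# An untrained student with uniform logits is penalized by the loss.
loss = distillation_loss([0.0, 0.0, 0.0], teacher)
```

In this toy case the fused branch ranks answer 0 first only because the question-only prior pushes it there; after the prior is subtracted, answer 1 has the largest remaining effect, and that debiased distribution is what the student is trained to match.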

Cite

Pan, Y., Li, Z., Zhang, L., & Tang, J. (2021). Distilling knowledge in causal inference for unbiased visual question answering. In Proceedings of the 2nd ACM International Conference on Multimedia in Asia, MMAsia 2020. Association for Computing Machinery, Inc. https://doi.org/10.1145/3444685.3446256
