Semantic Relation-aware Difference Representation Learning for Change Captioning

Abstract

Change captioning aims to describe the difference between a pair of images in a natural language sentence. In this task, distractors such as illumination or viewpoint changes make it challenging to learn the difference representation. In this paper, we propose a semantic relation-aware difference representation learning network that explicitly learns the difference representation in the presence of distractors. Specifically, we introduce a self-semantic relation embedding block to explore the underlying changed objects, and design a cross-semantic relation measuring block to localize the real change and learn a discriminative difference representation. In addition, relying on the part of speech (POS) of words, we devise an attention-based visual switch that dynamically uses visual information for caption generation. Extensive experiments show that our method achieves state-of-the-art performance on the CLEVR-Change and Spot-the-Diff datasets.
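
The abstract does not spell out how the visual switch is computed. As a rough illustration of the general idea only, the sketch below blends a visual feature vector with the decoder state through a scalar sigmoid gate. The function name `visual_switch`, its parameters, and the gating form are all hypothetical assumptions, not the authors' formulation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def visual_switch(hidden, visual, gate_weights, gate_bias=0.0):
    """Blend visual features with the decoder state using a scalar gate.

    Illustrative assumption: the gate is a sigmoid of a linear function
    of the decoder hidden state. The paper may define it differently.
    """
    if not (len(hidden) == len(visual) == len(gate_weights)):
        raise ValueError("hidden, visual and gate_weights must have equal length")
    # Scalar gate in (0, 1) computed from the decoder hidden state.
    gate = sigmoid(sum(w * h for w, h in zip(gate_weights, hidden)) + gate_bias)
    # Gate near 1 favours the visual features; near 0 favours the language state.
    fused = [gate * v + (1.0 - gate) * h for v, h in zip(visual, hidden)]
    return gate, fused


# Zero weights and zero bias give a gate of 0.5, i.e. an even blend.
gate, fused = visual_switch([1.0, 3.0], [3.0, 1.0], [0.0, 0.0])
print(gate, fused)
```

Under this assumption, a trained gate would be expected to open for visually grounded words such as nouns and to close for function words, which is one plausible reading of the POS-based behaviour the abstract describes.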

Cite

Tu, Y., Yao, T., Li, L., Lou, J., Gao, S., Yu, Z., & Yan, C. (2021). Semantic Relation-aware Difference Representation Learning for Change Captioning. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 63–73). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.6
