Existing research on visual captioning usually employs a CNN-RNN architecture that combines a CNN for image encoding with an RNN for caption generation, where the decoding space is a vocabulary built from the entire training set. Such approaches typically suffer from generating n-grams that occur frequently in the training set but are irrelevant to the given image. To tackle this problem, we propose to construct an image-grounded vocabulary that leverages image semantics for more effective caption generation. More concretely, we propose a two-step approach that builds the vocabulary by incorporating both visual information and relationships among words. We then explore two strategies for utilizing the constructed vocabulary during caption generation: one constrains the generator to select words only from the image-grounded vocabulary, and the other integrates the vocabulary information into the RNN cell. Experimental results on two public datasets show the effectiveness of our framework compared to state-of-the-art models. Our code is available on GitHub.
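The first strategy can be illustrated with a minimal sketch in PyTorch: at each decoding step, the decoder's logits over the full training vocabulary are masked so that only words admitted by the image-grounded vocabulary can be selected. The function and variable names below (e.g. `mask_logits_to_vocab`, `image_vocab_ids`) are hypothetical and the input format is assumed; this is not the authors' implementation, only a sketch of the constrained-decoding idea.

```python
import torch

def mask_logits_to_vocab(logits, image_vocab_ids):
    """Restrict decoding to an image-grounded vocabulary by masking logits.

    logits: (batch, vocab_size) raw decoder scores over the full vocabulary.
    image_vocab_ids: list of LongTensors, one per image, holding the word ids
        admitted by that image's grounded vocabulary (assumed input format).
    """
    mask = torch.full_like(logits, float('-inf'))
    for i, ids in enumerate(image_vocab_ids):
        mask[i, ids] = 0.0           # allowed words keep their original score
    return logits + mask             # disallowed words get -inf, i.e. zero probability after softmax

# Usage at one decoding step (decoder_step is a placeholder for the RNN decoder):
# logits = decoder_step(prev_word, hidden, image_features)   # (batch, V)
# masked = mask_logits_to_vocab(logits, image_vocab_ids)
# next_word = masked.argmax(dim=-1)
```

The second strategy described in the abstract, integrating vocabulary information into the RNN cell itself, would instead modify the cell's update rather than the output layer and is not shown here.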
Fan, Z., Wei, Z., Wang, S., & Huang, X. (2019). Bridging by word: Image-grounded vocabulary construction for visual captioning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) (pp. 6514–6524). Association for Computational Linguistics. https://doi.org/10.18653/v1/p19-1652