Cascade Semantic Fusion for Image Captioning

Abstract

Recent advances in deep visual attention methods have greatly accelerated research on image captioning. However, how best to leverage hand-crafted or deep features in the encoder of an image captioning model has not been fully explored, because it is difficult to find a single all-purpose feature type that captures the full range of visual semantics. In this paper, we introduce a cascade semantic fusion (CSF) architecture that mines representative features to encode image content through an attention mechanism, without bells and whistles. Specifically, the CSF draws on three types of visual attention semantics (object-level, image-level, and spatial attention features) in a novel three-stage cascade. In the first stage, object-level attention features are extracted with a pretrained detector to capture the detailed contents of the objects. In the middle stage, a fusion module merges the object-level attention features with spatial features, thereby inducing image-level attention features that enrich the context information around the objects. In the last stage, spatial attention features are learned to reveal the salient region representation as a complement to the two previously learned attention features. In short, we integrate the attention mechanism with three types of features to organize context knowledge about images from different aspects. Empirical analysis shows that the CSF helps an image captioning model select the object regions of interest. Image captioning experiments on the MSCOCO dataset show the efficacy of our semantic fusion architecture in depicting image content.
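The three-stage cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the attention form, the fusion module, or how the three outputs are combined. The sketch assumes dot-product soft attention, element-wise averaging as the fusion step, concatenation of the three context vectors, and a single query vector standing in for a decoder state. All function names (`attend`, `cascade_fusion`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(features, query):
    """Soft attention (assumed form): weights are softmax(feature . query),
    and the context is the weighted sum of the feature vectors."""
    weights = softmax([dot(f, query) for f in features])
    dim = len(features[0])
    context = [sum(w * f[i] for w, f in zip(weights, features))
               for i in range(dim)]
    return context, weights


def cascade_fusion(object_feats, spatial_feats, query):
    """Illustrative three-stage cascade over lists of equal-length vectors."""
    # Stage 1: object-level attention over detector region features.
    obj_ctx, obj_w = attend(object_feats, query)

    # Stage 2: fuse the object context with each spatial cell
    # (averaging is an assumption), then attend over the fused grid
    # to obtain an image-level context.
    fused = [[(o + s) / 2.0 for o, s in zip(obj_ctx, cell)]
             for cell in spatial_feats]
    img_ctx, _ = attend(fused, query)

    # Stage 3: spatial attention over the raw spatial grid,
    # complementing the two earlier contexts.
    spa_ctx, spa_w = attend(spatial_feats, query)

    # Concatenate the three contexts (an assumed combination rule).
    return obj_ctx + img_ctx + spa_ctx, obj_w, spa_w


if __name__ == "__main__":
    objects = [[1.0, 0.0], [0.0, 1.0]]   # two detected regions, dim 2
    grid = [[0.5, 0.5], [1.0, 0.0]]      # two spatial cells, dim 2
    state = [2.0, 0.0]                   # stand-in for a decoder state
    encoding, obj_w, spa_w = cascade_fusion(objects, grid, state)
    print(len(encoding), obj_w, spa_w)
```

In this toy setup the query aligns with the first object vector, so the first object receives the larger attention weight; a real model would learn the projections that produce these scores.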

Cite

Wang, S., Lan, L., Zhang, X., Dong, G., & Luo, Z. (2019). Cascade Semantic Fusion for Image Captioning. IEEE Access, 7, 66680–66688. https://doi.org/10.1109/ACCESS.2019.2917979
