Image captioning with visual-semantic LSTM

Nannan Li; Zhenzhong Chen

Conference Proceedings, Open Access

IJCAI International Joint Conference on Artificial Intelligence (2018), Vol. 2018-July, pp. 793–799

DOI: 10.24963/ijcai.2018/110

37 Citations

32 Readers

Abstract

In this paper, a novel image captioning approach is proposed to describe the content of images. Inspired by the visual processing of our cognitive system, we propose a visual-semantic LSTM model that locates the attention objects using their low-level features in the visual cell, and then successively extracts high-level semantic features in the semantic cell. In addition, a state perturbation term is introduced into the word sampling strategy of the REINFORCE-based method to explore suitable vocabulary during training. Experimental results on MS COCO and Flickr30K validate the effectiveness of our approach compared with state-of-the-art methods.

Citation (APA)

Li, N., & Chen, Z. (2018). Image captioning with visual-semantic LSTM. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2018-July, pp. 793–799). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2018/110
