Image captioning with visual-semantic LSTM

Abstract

In this paper, we propose a novel image captioning approach for describing the content of images. Inspired by visual processing in the human cognitive system, we propose a visual-semantic LSTM model that locates the attention objects with their low-level features in the visual cell, and then successively extracts high-level semantic features in the semantic cell. In addition, a state perturbation term is introduced into the word sampling strategy of the REINFORCE-based method to explore appropriate vocabulary during training. Experimental results on MS COCO and Flickr30K validate the effectiveness of our approach in comparison with state-of-the-art methods.

Cite

Li, N., & Chen, Z. (2018). Image captioning with visual-semantic LSTM. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2018-July, pp. 793–799). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2018/110
