Attention-Based Deep Learning Model for Image Captioning: A Comparative Study

Phyu Phyu Khaing; May The Yu

Journal Article, Open Access

International Journal of Image, Graphics and Signal Processing (2019) 11(6) 1-8

DOI: 10.5815/ijigsp.2019.06.01

6 Citations

9 Readers

Abstract

Image captioning is the task of generating a textual description of an image. It is a part of computer vision and image processing within artificial intelligence (AI), and it bridges visual processing and natural language processing. Image captioning has two parts: sentence-based generation and single-word generation. Deep learning has become the main driver of many new applications and has also become much more accessible in terms of its learning curve. Applying a deep learning model to image captioning can improve description accuracy, and attention mechanisms are increasingly used in deep learning models for caption generation. This paper presents a comparative study of attention-based deep learning models for image captioning. It analyzes the basic techniques in terms of performance, advantages, and weaknesses. It also discusses the datasets used for image captioning and the evaluation metrics used to test accuracy.

Cite

Khaing, P. P., & Yu, M. T. (2019). Attention-Based Deep Learning Model for Image Captioning: A Comparative Study. International Journal of Image, Graphics and Signal Processing, 11(6), 1–8. https://doi.org/10.5815/ijigsp.2019.06.01
