Supervised Deep Learning Techniques for Image Description: A Systematic Review

Marco López-Sánchez; Betania Hernández-Ocaña; Oscar Chávez-Bosquez; José Hernández-Torruco

Article (Open Access)

Entropy

DOI: 10.3390/e25040553

Abstract

Automatic image description, also known as image captioning, aims to describe the elements in an image and the relationships among them. The task spans two research fields, computer vision and natural language processing, and has therefore received considerable attention in computer science. In this review, we follow the Kitchenham review methodology to survey the most relevant deep learning approaches to image description. We focus on works that use convolutional neural networks (CNNs) to extract image features and recurrent neural networks (RNNs) to generate sentences automatically. We selected 53 research articles that use the encoder-decoder approach, restricted to supervised learning. The main contributions of this systematic review are: (i) a description of the most relevant image description papers implementing an encoder-decoder approach from 2014 to 2022 and (ii) an identification of the main architectures, datasets, and metrics that have been applied to image description.

López-Sánchez, M., Hernández-Ocaña, B., Chávez-Bosquez, O., & Hernández-Torruco, J. (2023, April 1). Supervised Deep Learning Techniques for Image Description: A Systematic Review. Entropy. MDPI. https://doi.org/10.3390/e25040553
