A Hybridized Deep Learning Method for Bengali Image Captioning

Abstract

Generating a caption from an input image is a persistent and challenging research topic in computer vision. Numerous experiments have been conducted on image captioning in English, but caption generation in Bengali remains sparse and in need of refinement; only a few papers so far have addressed it. We therefore propose a standard strategy for Bengali image caption generation on two different sizes of the Flickr8k dataset and on the BanglaLekha dataset, which is the only publicly available Bengali dataset for image captioning. The Bengali captions produced by our model were then compared with Bengali captions generated by other researchers using different architectures. We employed a hybrid approach on the two Bengali datasets, using InceptionResnetV2 or Xception as the Convolutional Neural Network (CNN) together with a Bidirectional Long Short-Term Memory (BiLSTM) or Bidirectional Gated Recurrent Unit (BiGRU) network. Different combinations of word embeddings were also adopted. Finally, performance was evaluated using the Bilingual Evaluation Understudy (BLEU) score, and the results indicate that the proposed model performed better on the Bengali dataset consisting of 4000 images and on the BanglaLekha dataset.
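
The abstract names BLEU as the evaluation metric without describing how it is computed. The following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty), assuming whitespace-tokenized captions and uniform n-gram weights. It is not the authors' evaluation script; the function names and the example tokens are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most
    as many times as it occurs in any single reference caption."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights over 1..max_n grams.
    Returns 0.0 if any n-gram order has no match (no smoothing applied)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    c = len(candidate)
    # Reference length closest to the candidate length (ties pick the shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    return brevity_penalty * geo_mean


# Hypothetical tokenized captions (placeholders, not taken from either dataset).
reference = "ekta kukur ghaser upor dourachche".split()
candidate = "ekta kukur ghaser upor dourachche".split()
score = bleu(candidate, [reference])  # identical captions give 1.0
```

Because no smoothing is applied, short captions that miss every 4-gram score 0.0 here; published evaluations often report BLEU-1 through BLEU-4 separately, which corresponds to calling `bleu` with `max_n` set from 1 to 4.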

Cite

Humaira, M., Paul, S., Jim, M. A. R. K., Ami, A. S., & Shah, F. M. (2021). A Hybridized Deep Learning Method for Bengali Image Captioning. International Journal of Advanced Computer Science and Applications, 12(2), 698–707. https://doi.org/10.14569/IJACSA.2021.0120287
