Capturing image features and multi-object regions of an image and transforming them into a natural-language sentence is a research problem that must be addressed with natural language processing. Technically, an attention mechanism forces every word representation to attend to a corresponding image region; however, it can mishandle non-visual words such as 'the' in the description text, which misleads the text interpretation. Captioning an image involves not only detecting features from the image but also decoding the interactions between objects into a meaningful image sentence. The proposed work predicts the image sentence in greater detail for every region/frame of an image. To this end, image features are extracted using a CNN, and the image text is generated with an LSTM augmented with an adaptive attention mechanism, which is added to the LSTM layer to predict a better image sentence. The deep network has been analyzed in two output configurations, with and without adaptive attention. Experiments were implemented on the Flickr8k dataset. The analysis illustrates that adaptive attention performs significantly better than the image sentence model without adaptive attention and generates more meaningful captions than any of the individual models used. On the test images, the proposed network achieves an accuracy and BLEU score of 81.53 and 61.94% with adaptive attention in the LSTM, versus 73.53 and 57.94% without it.
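The core idea of adaptive attention, as the abstract describes it, is to let the decoder down-weight image regions when predicting non-visual words such as 'the'. A minimal NumPy sketch of one such attention step is shown below, assuming a visual-sentinel formulation (the matrices `W_v`, `W_h` and vector `w` are hypothetical learned parameters, not from the paper itself): the sentinel is scored alongside the region features, and its softmax weight `beta` tells the model how much to rely on non-visual (language-model) context instead of the image.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def adaptive_attention(V, h, s, W_v, W_h, w):
    """One adaptive-attention step (illustrative sketch).

    V   : (k, d) image region features from the CNN
    h   : (d,)   current LSTM hidden state
    s   : (d,)   visual sentinel (non-visual fallback) vector
    W_v : (d, a), W_h : (d, a), w : (a,)  assumed learned parameters

    Returns the adaptive context vector c and beta, the weight
    placed on the sentinel rather than on any image region.
    """
    # Score the k regions plus the sentinel as a (k+1)-th slot.
    feats = np.vstack([V, s])                    # (k+1, d)
    scores = np.tanh(feats @ W_v + h @ W_h) @ w  # (k+1,)
    alpha = softmax(scores)                      # attention weights
    beta = alpha[-1]                             # sentinel weight
    # Mix visual context with the sentinel by beta.
    c = alpha[:-1] @ V + beta * s                # (d,)
    return c, beta

# Illustrative usage with random features.
rng = np.random.default_rng(0)
k, d, a = 4, 8, 6
c, beta = adaptive_attention(
    rng.standard_normal((k, d)), rng.standard_normal(d),
    rng.standard_normal(d), rng.standard_normal((d, a)),
    rng.standard_normal((d, a)), rng.standard_normal(a))
```

For a word like 'the', training would push `beta` toward 1, so the context vector leans on the sentinel rather than forcing attention onto an arbitrary image region.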
CITATION STYLE
Vidhya, K. A., Krishnakumar, S., & Cynddia, B. (2023). Adaptive Multi-attention for Image Sentence Generator Using C-LSTM. In Lecture Notes in Networks and Systems (Vol. 448, pp. 579–592). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-19-1610-6_51