Improved Blending Attention Mechanism in Visual Question Answering

Abstract

Visual question answering (VQA) has attracted increasing attention in computer vision and natural language processing. Researchers have focused on how to better integrate image features and text features to achieve stronger results on VQA tasks. Processing all features indiscriminately can cause information redundancy and a heavy computational burden, and an attention mechanism is an effective way to address this problem. However, a single attention mechanism may attend to the features incompletely. This paper improves on existing attention methods and proposes a hybrid attention mechanism that combines spatial attention with channel attention. Because applying attention can discard some of the original features, a small portion of the image features is added back as compensation. For the text features, a self-attention mechanism is introduced to strengthen the internal structural features of the sentence and improve the overall model. The results show that the hybrid attention mechanism and feature compensation improve the accuracy of the multimodal low-rank bilinear pooling network by 6.1%.
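
The sketch below illustrates, in PyTorch, the ingredients named in the abstract: channel and spatial attention over image features, a small residual "compensation" of the original features, and self-attention over question (word) features. The module names, dimensions, and the compensation weight alpha are illustrative assumptions, not the authors' exact implementation.

# Minimal sketch, assuming CBAM-style channel/spatial attention and a
# single-head self-attention over word features. All hyperparameters are
# placeholders chosen for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))        # squeeze spatial dims -> (B, C)
        return x * torch.sigmoid(w)[:, :, None, None]


class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                      # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)      # (B, 1, H, W)
        mx, _ = x.max(dim=1, keepdim=True)     # (B, 1, H, W)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w


class BlendedImageAttention(nn.Module):
    """Channel + spatial attention, plus a small fraction of the original
    features added back as compensation (alpha is an assumed hyperparameter)."""
    def __init__(self, channels, alpha=0.1):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()
        self.alpha = alpha

    def forward(self, x):
        attended = self.spatial(self.channel(x))
        return attended + self.alpha * x       # feature compensation


class QuestionSelfAttention(nn.Module):
    """Single-head self-attention over word features, strengthening the
    internal structure of the question representation."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, words):                  # words: (B, T, D)
        scores = self.q(words) @ self.k(words).transpose(1, 2) / words.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ self.v(words)


if __name__ == "__main__":
    img = torch.randn(2, 512, 14, 14)          # toy image feature map
    txt = torch.randn(2, 20, 512)              # toy word features
    print(BlendedImageAttention(512)(img).shape)   # torch.Size([2, 512, 14, 14])
    print(QuestionSelfAttention(512)(txt).shape)   # torch.Size([2, 20, 512])

In the paper, the attended image and question features would then be fused by a multimodal low-rank bilinear pooling layer before answer prediction; that fusion step is omitted here.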

Citation (APA)

Lu, S., Ding, Y., Yin, Z., Liu, M., Liu, X., Zheng, W., & Yin, L. (2023). Improved Blending Attention Mechanism in Visual Question Answering. Computer Systems Science and Engineering, 47(1), 1149–1161. https://doi.org/10.32604/csse.2023.038598
