Toward understanding the effectiveness of attention mechanism


This article is free to access.

Abstract

The attention mechanism (AM) is a widely used method for improving the performance of convolutional neural networks (CNNs) on computer vision tasks. Despite its pervasiveness, we have a poor understanding of where its effectiveness stems from. It is popularly believed to stem from the visual attention explanation: attention weights indicate the importance of features, and AM advocates focusing on the important parts of an input image rather than ingesting the entire input. However, we find only a weak consistency between the attention weights of features and their importance. Instead, we verify that feature map multiplication, which introduces high-order non-linearity into CNNs, is crucial for the effectiveness of AM. Furthermore, we show that feature map multiplication has an essential impact on the learned surfaces of CNNs: through its high-order non-linearity, it plays a regularization role, making the learned curves smoother and more stable in-between real samples (test/training samples in datasets). Thus, compared with vanilla CNNs, CNNs equipped with AM are more robust to noise and yield smaller model sensitivity scores, which explains their better performance.
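The abstract's central object is the feature map multiplication step: attention weights in (0, 1) rescale the feature map, so the output depends on the input through a product of input-derived terms. A minimal NumPy sketch of one common instantiation, SE-style channel attention, may help make this concrete; the layer sizes, weight matrices, and random inputs below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """SE-style channel attention sketch: squeeze (global average pool),
    excite (two-layer MLP with sigmoid gate), then rescale each channel.
    The final multiplication is the step the abstract identifies as the
    source of high-order non-linearity: the output depends on x both
    directly and through the attention weights a(x)."""
    s = x.mean(axis=(1, 2))                     # squeeze: shape (C,)
    a = sigmoid(w2 @ np.maximum(w1 @ s, 0.0))   # excite: weights in (0, 1)
    return a[:, None, None] * x                 # feature map multiplication

# Illustrative shapes and random weights (hypothetical, for demonstration).
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // 2, C))   # bottleneck layer
w2 = rng.standard_normal((C, C // 2))   # expansion layer
y = channel_attention(x, w1, w2)
```

Because the gate values lie in (0, 1), the rescaled output never exceeds the input in magnitude, which is consistent with the dampening/regularization role the abstract describes.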

Citation (APA)

Ye, X., He, Z., Heng, W., & Li, Y. (2023). Toward understanding the effectiveness of attention mechanism. AIP Advances, 13(3). https://doi.org/10.1063/5.0141666
