The attention layer has become a prevalent component for improving the effectiveness of neural network models on NLP tasks. Why attention is effective, and how interpretable it is, have attracted widespread discussion. Current studies mostly investigate the effect of the attention mechanism based on the attention distribution it generates within a single neural network structure. However, they do not consider the changes that attention induces in the semantic capability of different components of the model, which can vary across network structures. In this paper, we propose a comprehensive analytical framework that exploits a convex hull representation of sequence semantics in an n-dimensional Semantic Euclidean Space and defines a series of indicators to capture the impact of attention on sequence semantics. Through a series of experiments on various NLP tasks and three representative recurrent units, we analyze why and how attention benefits the semantic capacity of different types of recurrent neural networks, based on the indicators defined in the proposed framework.
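The abstract's core geometric idea, representing a sequence's semantics as the convex hull of its token vectors in a Euclidean space, can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual indicator definitions: it assumes hypothetical low-dimensional token embeddings (qhull-based hulls are only practical in low dimensions) and uses the hull volume as one plausible proxy for how much semantic "space" a sequence covers.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical token embeddings for one sequence:
# 10 tokens embedded in a 3-dimensional semantic space.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 3))

# The convex hull of the token points is one way to represent
# the sequence's semantic region in the embedding space.
hull = ConvexHull(embeddings)

# Hull volume as an illustrative (assumed) indicator of semantic coverage;
# comparing it before and after applying attention weights would show
# how attention reshapes the sequence's semantic region.
print(f"hull volume: {hull.volume:.3f}")
print(f"hull vertices: {len(hull.vertices)} of {embeddings.shape[0]} tokens")
```

Indicators like this can be computed for the hidden states of a recurrent unit with and without attention, making the geometric change attributable to the attention mechanism.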
CITATION STYLE
Zhang, C., Li, Q., Hua, L., & Song, D. (2021). How does Attention Affect the Model? In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 256–268). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.22