Interpreting Positional Information in Perspective of Word Order

Abstract

The attention mechanism is a powerful and effective method widely used in natural language processing. However, it has been observed that this mechanism is insensitive to positional information. Although several studies have attempted to improve positional encoding and investigate the influence of word order perturbation, it remains unclear how positional encoding affects NLP models from the perspective of word order. In this paper, we aim to shed light on this problem by analyzing the working mechanism of the attention module and investigating the root cause of its inability to encode positional information. Our hypothesis is that this insensitivity can be attributed to the weighted sum operation used in the attention module. To verify this hypothesis, we propose a novel weight concatenation operation and evaluate its efficacy on neural machine translation tasks. Our experimental results not only show that the proposed operation can effectively encode positional information but also confirm our hypothesis.
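
The sketch below illustrates the hypothesis with NumPy: the standard weighted sum over value vectors is invariant to permuting the tokens (when the weights are permuted with them), whereas keeping each weighted value in a position-indexed slot before mixing is not. All names (values, weights, W_mix) and the concatenation-plus-projection step are illustrative assumptions for exposition; the paper's exact weight concatenation operation may differ.

import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8
values = rng.normal(size=(seq_len, d_model))      # one value vector per token
weights = rng.random(seq_len)
weights /= weights.sum()                          # attention weights for a single query

# Weighted sum: the standard attention aggregation.
out = weights @ values                            # shape (d_model,)

# Permute the token order (and the weights with it): the output is unchanged,
# which is why this operation alone cannot encode word order.
perm = rng.permutation(seq_len)
out_permuted = weights[perm] @ values[perm]
print(np.allclose(out, out_permuted))             # True

# Hypothetical weight concatenation: keep each weighted value in its own
# position-indexed slot instead of summing, then mix with a learned projection.
concat = (weights[:, None] * values).reshape(-1)  # shape (seq_len * d_model,)
W_mix = rng.normal(size=(seq_len * d_model, d_model)) / np.sqrt(seq_len * d_model)
out_concat = concat @ W_mix

# Under the same permutation the weighted values land in different slots,
# so the output changes: the aggregation is now order-sensitive.
concat_perm = (weights[perm, None] * values[perm]).reshape(-1)
print(np.allclose(out_concat, concat_perm @ W_mix))  # False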

Citation (APA)

Zhang, X., Liu, R., Liu, J., & Liang, X. (2023). Interpreting Positional Information in Perspective of Word Order. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 9600–9613). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.534
