Recurrent positional embedding for neural machine translation

Citations: 19 · Mendeley readers: 102

Abstract

In the Transformer network architecture, positional embeddings encode order dependencies into the input representation. However, these embeddings capture only static order dependencies based on discrete position indices; that is, they are independent of word content. To address this issue, this work proposes a recurrent positional embedding approach based on word vectors. In this approach, recurrent positional embeddings are learned by a recurrent neural network, encoding word-content-based order dependencies into the input representation. They are then integrated into the existing multi-head self-attention model either as independent heads or as part of each head. The experimental results show that the proposed approach improves translation performance over a state-of-the-art Transformer baseline on the WMT'14 English-to-German and NIST Chinese-to-English translation tasks.
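To make the idea concrete, below is a minimal PyTorch sketch of the general scheme the abstract describes: a recurrent network (here a GRU, an assumption; the paper may use a different cell) produces content-dependent positional embeddings from the word embeddings, and a separate group of attention heads attends over those embeddings alongside the ordinary content heads (the "independent heads" variant). All class and parameter names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn


class RecurrentPositionalEmbedding(nn.Module):
    """Content-dependent positional embeddings: an RNN runs over the
    word embeddings, so the embedding at position t reflects both the
    position and the words observed up to t."""

    def __init__(self, d_model: int):
        super().__init__()
        # A single-layer GRU stands in for the recurrent encoder
        # (an assumption; the paper's exact cell may differ).
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, word_emb: torch.Tensor) -> torch.Tensor:
        # word_emb: (batch, seq_len, d_model)
        rpe, _ = self.rnn(word_emb)
        return rpe  # (batch, seq_len, d_model)


class SelfAttentionWithRPEHeads(nn.Module):
    """Sketch of the 'independent heads' variant: one group of heads
    attends over the token states, another over the recurrent
    positional embeddings; the outputs are concatenated and projected
    back to d_model."""

    def __init__(self, d_model: int = 512,
                 content_heads: int = 4, rpe_heads: int = 4):
        super().__init__()
        self.rpe = RecurrentPositionalEmbedding(d_model)
        self.content_attn = nn.MultiheadAttention(
            d_model, content_heads, batch_first=True)
        self.rpe_attn = nn.MultiheadAttention(
            d_model, rpe_heads, batch_first=True)
        self.out = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token representations
        pos = self.rpe(x)
        content_out, _ = self.content_attn(x, x, x)
        # Positional heads: queries and keys come from the recurrent
        # positional embeddings, values from the token states.
        pos_out, _ = self.rpe_attn(pos, pos, x)
        return self.out(torch.cat([content_out, pos_out], dim=-1))


# Smoke test on random inputs.
layer = SelfAttentionWithRPEHeads()
x = torch.randn(2, 10, 512)
print(layer(x).shape)  # torch.Size([2, 10, 512])
```

The "part of each head" alternative mentioned in the abstract would instead mix the recurrent positional embeddings into the query/key projections of every head rather than reserving dedicated heads for them; the sketch above shows only the simpler variant.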

Citation (APA)

Chen, K., Wang, R., Utiyama, M., & Sumita, E. (2019). Recurrent positional embedding for neural machine translation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 1361–1367). Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1139
