Long sentences have been one of the major challenges in neural machine translation (NMT). Although some approaches, such as the attention mechanism, have partially remedied the problem, we found that the current standard NMT model, the Transformer, has more difficulty translating long sentences than the former standard, the Recurrent Neural Network (RNN)-based model. One of the key differences between these NMT models is how they handle position information, which is essential for processing sequential data. In this study, we focus on the type of position information used by NMT models and hypothesize that relative position is better suited than absolute position. To examine this hypothesis, we propose RNN-Transformer, which replaces the positional encoding layer of the Transformer with an RNN, and compare the RNN-based model with four variants of the Transformer. Experiments on the ASPEC English-to-Japanese and WMT2014 English-to-German translation tasks demonstrate that relative position helps in translating sentences longer than those in the training data. Further experiments on length-controlled training data reveal that absolute position actually causes overfitting to sentence length.
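To make the architectural idea concrete, below is a minimal sketch, assuming PyTorch, of what "replacing the positional encoding layer with an RNN" could look like: token embeddings are passed through a unidirectional GRU, whose recurrence injects position information only relatively, and the result feeds a standard Transformer encoder stack. The module name `RNNPositionalEncoder` and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RNNPositionalEncoder(nn.Module):
    """Hypothetical module: encode position via recurrence instead of
    adding absolute (sinusoidal) positional encodings."""
    def __init__(self, d_model: int):
        super().__init__()
        # The GRU's hidden state carries order-based (relative) position information.
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model); no absolute positions are added.
        outputs, _ = self.rnn(token_embeddings)
        return outputs

# Feed the RNN-encoded embeddings into a standard Transformer encoder stack.
d_model, nhead = 512, 8
pos_encoder = RNNPositionalEncoder(d_model)
encoder_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)

x = torch.randn(2, 50, d_model)       # dummy batch of token embeddings
hidden = encoder(pos_encoder(x))      # (2, 50, d_model)
```

Because the GRU sees only the order of tokens and not their absolute indices, such an encoder would not tie its representations to specific positions seen during training, which is the property the paper associates with better generalization to longer sentences.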
CITATION STYLE
Neishi, M., & Yoshinaga, N. (2019). On the relation between position information and sentence length in neural machine translation. In CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference (pp. 328–338). Association for Computational Linguistics. https://doi.org/10.18653/v1/k19-1031