In this paper, we propose a new Transformer neural machine translation (NMT) model that incorporates dependency relations into self-attention on both the source and target sides, which we call dependency-based self-attention. Dependency-based self-attention is trained to attend to the modifiee of each token under constraints derived from the dependency relations, inspired by linguistically-informed self-attention (LISA). While LISA was originally designed for the Transformer encoder in semantic role labeling, this paper extends LISA to Transformer NMT by masking future information on words in the decoder-side dependency-based self-attention. In addition, our dependency-based self-attention operates on subword units created by byte pair encoding. Experiments demonstrate that our model achieves a 1.0-point BLEU gain over the baseline model on the WAT'18 Asian Scientific Paper Excerpt Corpus Japanese-to-English translation task.
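To make the mechanism concrete, the sketch below shows one way a single self-attention head can be supervised to attend to each token's dependency head, with a causal mask on the decoder side so that no future-word information is used. This is a minimal illustration in the spirit of LISA-style dependency supervision, not the authors' implementation: the class name, layer sizes, loss formulation, and the handling of subword units are all assumptions introduced here for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyAttentionHead(nn.Module):
    """Illustrative self-attention head whose attention weights are
    additionally supervised to point at each token's dependency head
    (modifiee). Hyperparameters and loss details are assumptions,
    not the paper's exact configuration."""

    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_head)
        self.k = nn.Linear(d_model, d_head)
        self.v = nn.Linear(d_model, d_head)
        self.scale = d_head ** -0.5

    def forward(self, x, head_index=None, causal=False):
        # x: (batch, seq_len, d_model)
        # head_index: (batch, seq_len) gold dependency-head position per token
        scores = torch.matmul(self.q(x), self.k(x).transpose(-1, -2)) * self.scale

        if causal:
            # Decoder side: mask attention to future positions so the
            # dependency-based head never sees future words.
            seq_len = x.size(1)
            future = torch.triu(
                torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
                diagonal=1,
            )
            scores = scores.masked_fill(future, float("-inf"))

        attn = F.softmax(scores, dim=-1)
        out = torch.matmul(attn, self.v(x))

        # Auxiliary training loss: cross-entropy between the attention
        # distribution and the gold dependency head of each token.
        dep_loss = None
        if head_index is not None:
            dep_loss = F.nll_loss(
                torch.log(attn + 1e-9).flatten(0, 1),  # (batch*seq_len, seq_len)
                head_index.flatten(),                   # (batch*seq_len,)
            )
        return out, dep_loss
```

In such a setup, this supervised head would replace one ordinary head in a chosen encoder or decoder layer, and the auxiliary dependency loss would be added to the usual translation loss during training; how the paper weights the losses and assigns dependency heads to BPE subwords is described in the full text.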
CITATION STYLE
Deguchi, H., Tamura, A., & Ninomiya, T. (2019). Dependency-based self-attention for transformer NMT. In International Conference on Recent Advances in Natural Language Processing, RANLP (Vol. 2019-September, pp. 239–246). Incoma Ltd. https://doi.org/10.26615/978-954-452-056-4_028