Progressive Multi-Granularity Training for Non-Autoregressive Translation

35 Citations · 64 Readers

Abstract

Non-autoregressive translation (NAT) significantly accelerates the inference process by predicting the entire target sequence at once. However, recent studies show that NAT is weak at learning high-mode knowledge such as one-to-many translations. We argue that modes can be divided into various granularities that can be learned from easy to hard. In this study, we empirically show that NAT models are prone to learn fine-grained lower-mode knowledge, such as words and phrases, compared with sentences. Based on this observation, we propose progressive multi-granularity training for NAT. More specifically, to make the most of the training data, we break down the sentence-level examples into three types, i.e., words, phrases, and sentences, and as training progresses, we gradually increase the granularity. Experiments on Romanian-English, English-German, Chinese-English, and Japanese-English demonstrate that our approach improves phrase translation accuracy and model reordering ability, resulting in better translation quality against strong NAT baselines. We also show that more deterministic fine-grained knowledge can further enhance performance.
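The progressive schedule described in the abstract lends itself to a simple curriculum over training pairs. The Python sketch below is only an illustration of that idea under assumed interfaces: the function name, the step thresholds, and the word/phrase/sentence data layout are hypothetical and do not reflect the authors' actual implementation. It starts the sampling pool from fine-grained word pairs and gradually admits phrase- and then sentence-level examples as training proceeds.

```python
# Illustrative sketch only: a minimal curriculum over data granularities,
# assuming pre-extracted word-, phrase-, and sentence-level training pairs.
# Names and thresholds are hypothetical, not the paper's implementation.

from typing import Dict, Iterator, List, Tuple

Pair = Tuple[str, str]  # (source, target)

def curriculum_pool(
    data: Dict[str, List[Pair]],   # keys: "word", "phrase", "sentence"
    total_steps: int,
    phrase_start: float = 0.2,     # hypothetical: phrases join after 20% of steps
    sentence_start: float = 0.5,   # hypothetical: sentences join after 50% of steps
) -> Iterator[Tuple[int, List[Pair]]]:
    """Yield, per training step, the pool of examples the NAT model may sample
    from, progressively moving from fine-grained (words) to coarse-grained
    (sentences)."""
    for step in range(total_steps):
        progress = step / total_steps
        pool: List[Pair] = list(data["word"])
        if progress >= phrase_start:
            pool += data["phrase"]
        if progress >= sentence_start:
            pool += data["sentence"]
        # A trainer would draw a mini-batch from `pool` at this step.
        yield step, pool
```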

Cite

APA

Ding, L., Wang, L., Liu, X., Wong, D. F., Tao, D., & Tu, Z. (2021). Progressive Multi-Granularity Training for Non-Autoregressive Translation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 2797–2803). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.247
