HI-CMLM: Improve CMLM with Hybrid Decoder Input


Abstract

Mask-predict CMLM (Ghazvininejad et al., 2019) has achieved strong performance among non-autoregressive NMT models, but we find that predicting all target words solely from the hidden states of [MASK] tokens is neither effective nor efficient in the initial refinement iterations, resulting in ungrammatical repetitions and slow convergence. In this work, we mitigate this problem by combining copied source tokens with [MASK] embeddings in the decoder. Notably, this is not straightforward copying, which has been shown to be useless, but a novel heuristic hybrid strategy: fence-mask. Experimental results show consistent gains on both the WMT14 En↔De and WMT16 En↔Ro corpora of 0.5 BLEU on average, and 1 BLEU for less-informative short sentences. This reveals that incorporating additional information through proper strategies helps improve CMLM, particularly the translation quality of short texts and early-stage convergence speed.
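The abstract does not specify the fence-mask pattern, but its name and the described goal (mixing copied source signal into the [MASK]-only decoder input) suggest an alternating layout. A minimal sketch, assuming even positions keep [MASK] and odd positions copy a length-ratio-aligned source token (the alignment rule and the function name are illustrative assumptions, not the paper's definition):

```python
MASK = "[MASK]"

def fence_mask_input(source_tokens, target_length):
    """Build a hybrid decoder input of length target_length.

    Assumed 'fence' pattern: even positions stay [MASK]; odd positions
    copy the source token at a naively length-scaled aligned index, so
    early refinement iterations see some source-side signal.
    """
    hybrid = []
    for i in range(target_length):
        if i % 2 == 0:
            hybrid.append(MASK)
        else:
            # Naive monotonic alignment: scale target position into source range.
            j = min(int(i * len(source_tokens) / target_length),
                    len(source_tokens) - 1)
            hybrid.append(source_tokens[j])
    return hybrid
```

For example, `fence_mask_input(["a", "b", "c", "d"], 4)` yields `["[MASK]", "b", "[MASK]", "d"]`; in the actual model, these placeholder and copied tokens would be looked up in the embedding table before being fed to the decoder.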

Citation (APA)

Wang, M., Guo, J., Wang, Y., Chen, Y., Su, C., Wei, D., … Yang, H. (2021). HI-CMLM: Improve CMLM with Hybrid Decoder Input. In INLG 2021 - 14th International Conference on Natural Language Generation, Proceedings (pp. 167–171). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.inlg-1.16
