Viterbi Decoding of Directed Acyclic Transformer for Non-Autoregressive Machine Translation

Abstract

Non-autoregressive models achieve significant decoding speedup in neural machine translation but lack the ability to capture sequential dependency. The Directed Acyclic Transformer (DA-Transformer) was recently proposed to model sequential dependency with a directed acyclic graph. However, it has to apply a sequential decision process at inference time, which harms global translation accuracy. In this paper, we present a Viterbi decoding framework for DA-Transformer that is guaranteed to find the joint optimal solution for the translation and the decoding path under any length constraint. Experimental results demonstrate that our approach consistently improves the performance of DA-Transformer while maintaining a similar decoding speedup.
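
The optimality claim rests on a standard dynamic-programming argument: once the path through the graph is fixed, the best token at each visited vertex can be chosen independently, so a Viterbi recursion over (path length, vertex) pairs recovers the jointly optimal translation and path. Below is a minimal NumPy sketch of Viterbi decoding over a DAG under a fixed length constraint. It is not the authors' implementation; the function name viterbi_dag_decode and the inputs token_logprob / trans_logprob are illustrative assumptions, and in a real DA-Transformer these scores would come from the model's per-vertex token logits and transition logits.

    import numpy as np

    def viterbi_dag_decode(token_logprob, trans_logprob, target_len):
        """Find the best length-constrained path through a left-to-right DAG.

        token_logprob: (V,) best token log-probability at each of V graph
                       vertices (i.e., max over the vocabulary per vertex).
        trans_logprob: (V, V) log transition scores; trans_logprob[i, j]
                       is meaningful only for j > i (edges go left to right).
        target_len:    number of vertices the decoded path must visit.

        Returns the log-score of the best path and its vertex indices.
        """
        V = token_logprob.shape[0]
        NEG = -np.inf

        # dp[l, j]: best score of a path visiting l+1 vertices, ending at j.
        dp = np.full((target_len, V), NEG)
        back = np.zeros((target_len, V), dtype=np.int64)

        dp[0, 0] = token_logprob[0]  # paths start at the first vertex

        for l in range(1, target_len):
            for j in range(l, V):  # vertex j needs at least l predecessors
                scores = dp[l - 1, :j] + trans_logprob[:j, j]
                i = int(np.argmax(scores))
                dp[l, j] = scores[i] + token_logprob[j]
                back[l, j] = i

        # Paths must end at the last vertex; backtrace the argmax pointers.
        j = V - 1
        best = dp[target_len - 1, j]
        path = [j]
        for l in range(target_len - 1, 0, -1):
            j = back[l, j]
            path.append(j)
        path.reverse()
        return best, path

    # Toy usage with random scores (purely illustrative):
    rng = np.random.default_rng(0)
    tok = rng.standard_normal(8)                      # per-vertex token scores
    trans = np.triu(rng.standard_normal((8, 8)), k=1)  # upper-triangular DAG edges
    score, path = viterbi_dag_decode(tok, trans, target_len=4)

The recursion costs O(L * V^2) for path length L and graph size V, and each fixed-length problem is solved exactly. To handle "any length constraint" as the abstract describes, one would run this recursion over the candidate lengths and pick the best-scoring result; how the paper ranks lengths (e.g., with a normalized score) is detailed in the paper itself, not in this sketch.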

Citation (APA)

Shao, C., Ma, Z., & Feng, Y. (2022). Viterbi Decoding of Directed Acyclic Transformer for Non-Autoregressive Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 4419–4426). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.296
