We propose a novel scheme that uses the Levenshtein Transformer for word-level quality estimation (QE). The Levenshtein Transformer is a natural fit for this task: trained to decode iteratively, it can learn to post-edit without explicit supervision. To further reduce the mismatch between the translation task and the word-level QE task, we propose a two-stage transfer learning procedure on both augmented data and human post-editing data. We also propose heuristics to construct reference labels that are compatible with subword-level finetuning and inference. Results on the WMT 2020 QE shared task dataset show that our proposed method has superior data efficiency in the data-constrained setting and competitive performance in the unconstrained setting.
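As an illustration of the subword-level label heuristic mentioned above, here is a minimal Python sketch under assumed conventions: a word's OK/BAD tag is copied onto each of its subword pieces for finetuning, and a word is recovered as BAD at inference if any of its pieces is predicted BAD. The function names and the any-BAD collapse rule are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch of projecting word-level QE tags onto subword units and
# collapsing subword predictions back to words. The propagation and
# "any subword BAD => word BAD" rules are assumptions for illustration.

from typing import List

def expand_tags_to_subwords(word_tags: List[str],
                            subwords_per_word: List[List[str]]) -> List[str]:
    """Copy each word's OK/BAD tag onto every subword piece of that word."""
    assert len(word_tags) == len(subwords_per_word)
    subword_tags: List[str] = []
    for tag, pieces in zip(word_tags, subwords_per_word):
        subword_tags.extend([tag] * len(pieces))
    return subword_tags

def collapse_tags_to_words(subword_tags: List[str],
                           subwords_per_word: List[List[str]]) -> List[str]:
    """Label a word BAD if any of its subword pieces is predicted BAD."""
    word_tags: List[str] = []
    i = 0
    for pieces in subwords_per_word:
        chunk = subword_tags[i:i + len(pieces)]
        word_tags.append("BAD" if "BAD" in chunk else "OK")
        i += len(pieces)
    return word_tags

if __name__ == "__main__":
    pieces = [["quali", "ty"], ["estima", "tion"]]  # BPE-style segmentation
    tags = ["OK", "BAD"]
    sub = expand_tags_to_subwords(tags, pieces)      # ['OK','OK','BAD','BAD']
    print(sub, collapse_tags_to_words(sub, pieces))  # recovers ['OK','BAD']
```

The round trip in the usage example shows why such a heuristic keeps subword-level training compatible with word-level evaluation: labels survive the expand/collapse cycle unchanged.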
Ding, S., Junczys-Dowmunt, M., Post, M., & Koehn, P. (2021). Levenshtein Training for Word-level Quality Estimation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021) (pp. 6724–6733). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.539