Target Oriented Data Generation for Quality Estimation of Machine Translation


Abstract

Quality estimation (QE) is a non-trivial problem in machine translation (MT), and neural approaches appear to be a promising solution. Annotating QE training corpora is costly but necessary for supervised QE systems. To provide informative, large-scale training data for MT quality estimation models, this paper proposes an approach for generating pseudo QE training data. By leveraging the labeled corpus provided for the task, the method generates pseudo training samples whose distribution of translation errors is intended to resemble that of the labeled corpus. The paper also describes a sentence-specific data expansion strategy that incrementally boosts model performance. Experiments on different open datasets and models confirm the effectiveness of the method and indicate that it can significantly improve QE performance.

Cite

Wu, H., Yang, M., Wang, J., Zhu, J., & Zhao, T. (2019). Target Oriented Data Generation for Quality Estimation of Machine Translation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11838 LNAI, pp. 393–405). Springer. https://doi.org/10.1007/978-3-030-32233-5_31
