The University of Edinburgh's submissions to the WMT19 news translation task

Abstract

The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six language directions: English↔Gujarati, English↔Chinese, German→English, and English→Czech. For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data. For English↔Gujarati, we also explored semi-supervised MT with cross-lingual language model pre-training, and translation pivoting through Hindi. For translation to and from Chinese, we investigated character-based tokenisation vs. sub-word segmentation of Chinese text. For German→English, we studied the impact of vast amounts of back-translated training data on translation quality, gaining a few additional insights over Edunov et al. (2018). For English→Czech, we compared different pre-processing and tokenisation regimes.

Citation (APA)

Bawden, R., Bogoychev, N., Germann, U., Grundkiewicz, R., Kirefu, F., Barone, A. V. M., & Birch, A. (2019). The University of Edinburgh’s submissions to the WMT19 news translation task. In WMT 2019 - 4th Conference on Machine Translation, Proceedings of the Conference (Vol. 2, pp. 103–115). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-5304
