This paper presents the systems submitted by the University of Groningen to the English-Kazakh language pair (both translation directions) for the WMT 2019 news translation task. We explore potential benefits from using (i) morphological segmentation (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English-Kazakh data, and (iii) synthetic data, both for the source and for the target language. Our best submissions ranked second for Kazakh→English and third for English→Kazakh in terms of the BLEU automatic evaluation metric.
Toral, A., Edman, L., Yeshmagambetova, G., & Spenader, J. (2019). Neural machine translation for English-Kazakh with morphological segmentation and synthetic data. In WMT 2019 - 4th Conference on Machine Translation, Proceedings of the Conference (Vol. 2, pp. 386–392). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-5343