Neural machine translation for English-Kazakh with morphological segmentation and synthetic data

Abstract

This paper presents the systems submitted by the University of Groningen to the English-Kazakh language pair (both translation directions) for the WMT 2019 news translation task. We explore potential benefits from using (i) morphological segmentation (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English-Kazakh data, and (iii) synthetic data, both for the source and for the target language. Our best submissions ranked second for Kazakh→English and third for English→Kazakh in terms of the BLEU automatic evaluation metric.

Cite

Toral, A., Edman, L., Yeshmagambetova, G., & Spenader, J. (2019). Neural machine translation for English-Kazakh with morphological segmentation and synthetic data. In WMT 2019 - 4th Conference on Machine Translation, Proceedings of the Conference (Vol. 2, pp. 386–392). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-5343