Abstract
Research in domain adaptation for statistical machine translation (SMT) has resulted in various approaches that adapt system components to specific translation tasks. The concept of a domain, however, is not precisely defined, and most approaches rely on provenance information or manual subcorpus labels, while genre differences have not been addressed explicitly. Motivated by the large translation quality gap that is commonly observed between different genres in a test corpus, we explore the use of document-level genre-revealing text features for the task of translation model adaptation. Results show that automatic indicators of genre can replace manual subcorpus labels, yielding significant improvements of up to 0.9 BLEU across two test sets. In addition, we find that our genre-adapted translation models encourage document-level translation consistency.
Cite
van der Wees, M., Bisazza, A., & Monz, C. (2015). Translation Model Adaptation Using Genre-Revealing Text Features. In DiscoMT 2015 - Discourse in Machine Translation, Proceedings of the Workshop (pp. 132–141). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-2518