What's in a domain? Analyzing genre and topic differences in statistical machine translation

Abstract

Domain adaptation is an active field of research in statistical machine translation (SMT), but so far most work has ignored the distinction between the topic and genre of documents. In this paper, we quantify and disentangle the impact of genre and topic differences on translation quality by introducing a new data set that has controlled topic and genre distributions. In addition, we perform a detailed analysis showing that differences across topics explain translation performance differences across genres only to a limited degree, and that genre-specific errors are attributable more to model coverage than to suboptimal scoring of translation candidates.

Cite

Van Der Wees, M., Bisazza, A., Weerkamp, W., & Monz, C. (2015). What’s in a domain? Analyzing genre and topic differences in statistical machine translation. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 2, pp. 560–566). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-2092
