Effective use of linguistic and contextual information for statistical machine translation

Abstract

Current methods of using lexical features in machine translation have difficulty in scaling up to realistic MT tasks due to a prohibitively large number of parameters involved. In this paper, we propose methods of using new linguistic and contextual features that do not suffer from this problem and apply them in a state-of-the-art hierarchical MT system. The features used in this work are non-terminal labels, non-terminal length distribution, source string context and source dependency LM scores. The effectiveness of our techniques is demonstrated by significant improvements over a strong baseline. On Arabic-to-English translation, improvements in lower-cased BLEU are 2.0 on NIST MT06 and 1.7 on MT08 newswire data on decoding output. On Chinese-to-English translation, the improvements are 1.0 on MT06 and 0.8 on MT08 newswire data. © 2009 ACL and AFNLP.

Cite

Shen, L., Xu, J., Zhang, B., Matsoukas, S., & Weischedel, R. (2009). Effective use of linguistic and contextual information for statistical machine translation. In EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009 (pp. 72–80). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1699510.1699520
