Abstract
This paper describes the techniques we explored to improve the translation of news text in the German-English and Hungarian-English tracks of the WMT09 shared translation task. Beginning with a conventional hierarchical phrase-based system, we found benefits from using word segmentation lattices as input, explicit generation of beginning and end of sentence markers, minimum Bayes risk decoding, and incorporation of a feature scoring the alignment of function words in the hypothesized translation. We also explored the use of monolingual paraphrases to improve coverage, as well as co-training to improve the quality of the segmentation lattices used, but these did not lead to improvements.
Cite
Dyer, C., Setiawan, H., Marton, Y., & Resnik, P. (2009). The University of Maryland Statistical Machine Translation System for the Fourth Workshop on Machine Translation. In EACL 2009 - 4th Workshop on Statistical Machine Translation, Proceedings of the Workshop (pp. 145–149). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626431.1626461