We define, implement, and evaluate a novel model for statistical machine translation based on shallow syntactic analysis (part-of-speech tagging and phrase chunking) in both the source and target languages. The model captures long-distance constituent motion and other syntactic phenomena without requiring a full parse in either language. We also examine aspects of lexical transfer, proposing and exploring a concept of translation coercion across parts of speech, as well as a transfer model based on lemma-to-lemma translation probabilities, which holds promise for improving machine translation of low-density languages. Experiments on both Arabic-to-English and French-to-English translation demonstrate the efficacy of the proposed techniques. Performance is evaluated automatically using the BLEU metric.
Schafer, C., & Yarowsky, D. (2003). Statistical Machine Translation Using Coercive Two-Level Syntactic Transduction. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, EMNLP 2003 (pp. 9–16). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1119355.1119357