Abstract
Though phrase-based SMT has achieved high translation quality, it still lacks the generalization ability to capture word order differences between languages. In this paper we describe a general method for tree-to-string phrase-based SMT. We study how syntactic transformation can be incorporated into phrase-based SMT and how effective it is. We design syntactic transformation models using an unlexicalized form of synchronous context-free grammars. These models can be learned from source-parsed bitext. Our system can naturally make use of both constituent and non-constituent phrasal translations in the decoding phase. We consider various levels of syntactic analysis, ranging from chunking to full parsing. Our experimental results on English-Japanese and English-Vietnamese translation show a significant improvement over two baseline phrase-based SMT systems.
Cite
Nguyen, T. P., Shimazu, A., Ho, T. B., le Nguyen, M., & van Nguyen, V. (2008). A tree-to-string phrase-based model for statistical machine translation. In CoNLL 2008 - Proceedings of the Twelfth Conference on Computational Natural Language Learning (pp. 143–150). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1596324.1596349