Unsupervised sub-tree alignment for tree-to-tree translation

Abstract

This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically motivated and does not rely on word alignments. As a by-product, our model outputs a sub-tree alignment matrix encoding a large number of diverse alignments between syntactic structures, from which machine translation systems can efficiently extract translation rules that are often filtered out due to errors in the 1-best alignment. Experimental results show that the proposed approach outperforms three state-of-the-art baseline approaches in both alignment accuracy and grammar quality. When applied to machine translation, our approach yields a +1.0 BLEU improvement and a 0.9 TER reduction on the NIST machine translation evaluation corpora. With tree binarization and fuzzy decoding, it even outperforms a state-of-the-art hierarchical phrase-based system. © 2013 AI Access Foundation. All rights reserved.

Cite

Xiao, T., & Zhu, J. (2013). Unsupervised sub-tree alignment for tree-to-tree translation. Journal of Artificial Intelligence Research, 48, 733–782. https://doi.org/10.1613/jair.4033
