Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora

Alon Lavie; Alok Parlikar; Vamshi Ambati

Conference Proceedings

Proceedings of SSST 2008 - 2nd Workshop on Syntax and Structure in Statistical Translation (2008) 87-95

DOI: 10.3115/1626269.1626280

36 Citations

86 Readers

Abstract

We describe a multi-step process for automatically learning reliable sub-sentential syntactic phrases that are translation equivalents of each other and syntactic translation rules between two languages. The input to the process is a corpus of parallel sentences, word-aligned and annotated with phrase-structure parse trees. We first apply a newly developed algorithm for aligning parse-tree nodes between the two parallel trees. Next, we extract all aligned sub-sentential syntactic constituents from the parallel sentences, and create a syntax-based phrase-table. Finally, we treat the node alignments as tree decomposition points and extract from the corpus all possible synchronous parallel tree fragments. These are then converted into synchronous context-free rules. We describe the approach and analyze its application to Chinese-English parallel data.
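The first two steps of the pipeline (aligning parse-tree nodes, then reading off a syntax-based phrase table) can be illustrated with a minimal sketch. It is not the node-alignment algorithm developed in the paper, which the abstract does not specify. It substitutes the standard word-alignment consistency criterion from phrase extraction: two constituents are paired only if at least one alignment link connects them and no link crosses the boundary of the pair. The function names, the span representation, and the toy tokens are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]   # inclusive (start, end) word indices covered by a parse-tree node
Link = Tuple[int, int]   # (source word index, target word index) from the word alignment


def is_consistent(src: Span, tgt: Span, links: Set[Link]) -> bool:
    """True if some link lies inside the span pair and no link crosses its boundary."""
    inside = 0
    for s, t in links:
        s_in = src[0] <= s <= src[1]
        t_in = tgt[0] <= t <= tgt[1]
        if s_in != t_in:      # link leaves the pair on one side only
            return False
        if s_in:
            inside += 1
    return inside > 0


def align_nodes(src_spans: List[Span], tgt_spans: List[Span],
                links: Set[Link]) -> List[Tuple[Span, Span]]:
    """Pair every source constituent with every consistent target constituent."""
    return [(s, t) for s in src_spans for t in tgt_spans
            if is_consistent(s, t, links)]


def phrase_table(src_words: List[str], tgt_words: List[str],
                 node_pairs: List[Tuple[Span, Span]]) -> List[Tuple[str, str]]:
    """Turn aligned node pairs into (source phrase, target phrase) entries."""
    return [(" ".join(src_words[s0:s1 + 1]), " ".join(tgt_words[t0:t1 + 1]))
            for (s0, s1), (t0, t1) in node_pairs]


# Toy input (hypothetical): placeholder tokens, constituent spans, and word links.
src_words = ["s0", "s1", "s2"]
tgt_words = ["t0", "t1", "t2"]
src_spans = [(0, 0), (1, 2), (0, 2)]
tgt_spans = [(0, 0), (1, 2), (0, 2)]
links = {(0, 0), (1, 2), (2, 1)}

node_pairs = align_nodes(src_spans, tgt_spans, links)
table = phrase_table(src_words, tgt_words, node_pairs)
```

In this toy input the two words that swap order are covered by a single constituent on each side, so that pair of nodes is aligned even though the individual words cross. The aligned node pairs are also the points at which the third step would decompose the trees into synchronous fragments.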

Citation (APA)

Lavie, A., Parlikar, A., & Ambati, V. (2008). Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora. In Proceedings of SSST 2008 - 2nd Workshop on Syntax and Structure in Statistical Translation (pp. 87–95). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626269.1626280
