Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora

Abstract

We describe a multi-step process for automatically learning reliable sub-sentential syntactic phrases that are translation equivalents of each other and syntactic translation rules between two languages. The input to the process is a corpus of parallel sentences, word-aligned and annotated with phrase-structure parse trees. We first apply a newly developed algorithm for aligning parse-tree nodes between the two parallel trees. Next, we extract all aligned sub-sentential syntactic constituents from the parallel sentences, and create a syntax-based phrase-table. Finally, we treat the node alignments as tree decomposition points and extract from the corpus all possible synchronous parallel tree fragments. These are then converted into synchronous context-free rules. We describe the approach and analyze its application to Chinese-English parallel data.
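
The abstract does not spell out the node-alignment algorithm, so the following is only a minimal sketch of the general idea, assuming a simple span-consistency criterion: a source node and a target node are paired when every word-alignment link touching either node's span falls inside both spans. The tree encoding (nested tuples), the function names, and the toy sentence pair are all hypothetical and are not taken from the paper.

```python
# Toy sketch: align parse-tree nodes using word alignments, then read off
# the aligned constituents as syntax-based phrase pairs.
# Assumption: span consistency is the alignment test; the paper's own
# algorithm may differ.

def spans(tree, start=0, out=None):
    """Collect (label, start, end) for each internal node, in post-order."""
    if out is None:
        out = []
    if isinstance(tree, str):          # a leaf is a single word
        return start + 1, out
    label, *children = tree
    pos = start
    for child in children:
        pos, _ = spans(child, pos, out)
    out.append((label, start, pos))
    return pos, out

def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in leaves(child)]

def consistent(links, s, t):
    """True if at least one link lies in both spans and no link leaves them."""
    inside = [(i, j) for i, j in links
              if s[0] <= i < s[1] and t[0] <= j < t[1]]
    touching = [(i, j) for i, j in links
                if s[0] <= i < s[1] or t[0] <= j < t[1]]
    return bool(inside) and len(inside) == len(touching)

def align_nodes(src, tgt, links):
    """Return (src_label, src_phrase, tgt_label, tgt_phrase) for aligned nodes."""
    _, s_nodes = spans(src)
    _, t_nodes = spans(tgt)
    sw, tw = leaves(src), leaves(tgt)
    pairs = []
    for sl, si, sj in s_nodes:
        for tl, ti, tj in t_nodes:
            if consistent(links, (si, sj), (ti, tj)):
                pairs.append((sl, " ".join(sw[si:sj]),
                              tl, " ".join(tw[ti:tj])))
    return pairs

# Hypothetical sentence pair with placeholder words and reordered constituents.
src_tree = ("S", ("NP", "A", "B"), ("VP", "C"))
tgt_tree = ("S", ("VP", "x"), ("NP", "y", "z"))
word_links = {(0, 1), (1, 2), (2, 0)}   # (source index, target index)

pairs = align_nodes(src_tree, tgt_tree, word_links)
for p in pairs:
    print(p)
```

Each returned pair corresponds to one entry of a syntax-based phrase table. The same aligned nodes could also serve as the decomposition points at which the two trees are cut into synchronous fragments, although that step is not shown here.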

Lavie, A., Parlikar, A., & Ambati, V. (2008). Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora. In Proceedings of SSST 2008 - 2nd Workshop on Syntax and Structure in Statistical Translation (pp. 87–95). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626269.1626280
