Empirical lower bounds on alignment error rates in syntax-based machine translation

Abstract

Synchronous context-free grammars of rank two (2-SCFGs) (Satta and Peserico, 2005) are used in syntax-based machine translation systems such as Wu (1997), Zhang et al. (2006) and Chiang (2007). Their empirical adequacy, in terms of the alignments they induce, has been discussed in Wu (1997) and Wellington et al. (2006), but with a one-sided focus on so-called “inside-out alignments”. This paper identifies other alignment configurations that cannot be induced by 2-SCFGs and examines their frequencies across a wide collection of hand-aligned parallel corpora. Empirical lower bounds are derived for 2-SCFGs and related formalisms on two measures of alignment error rate: the one introduced in Och and Ney (2000) and one where only complete translation units are considered.
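To make the two central notions of the abstract concrete, here is a minimal Python sketch. It is not the authors' code and does not reproduce their lower-bound derivation. The function `aer` follows the alignment error rate as it is commonly stated after Och and Ney (2000), with sure links S, possible links P (S taken to be a subset of P) and predicted links A: AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). The function `is_binarizable` assumes a one-to-one alignment written as a permutation and uses a greedy shift-reduce merge of adjacent spans; it rejects the inside-out patterns 2413 and 3142. Both function names and the permutation encoding are illustrative assumptions, and the other configurations the paper identifies are not covered.

```python
def aer(predicted, sure, possible):
    """Alignment error rate over sets of (source_index, target_index) links.

    Assumption: sure links count as possible links, so they are added to P.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s
    if not a and not s:
        return 0.0  # nothing predicted and nothing required
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


def is_binarizable(perm):
    """Check whether a one-to-one alignment, given as a permutation of
    distinct integers, can be built from binary straight/inverted merges.

    Greedy shift-reduce: push each position as a span and merge the top
    two spans whenever they cover adjacent ranges of target positions.
    """
    stack = []
    for x in perm:
        stack.append((x, x))
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) <= 1


if __name__ == "__main__":
    gold_sure = {(0, 0), (1, 1)}
    gold_possible = {(1, 2)}
    print(aer({(0, 0), (1, 2)}, gold_sure, gold_possible))
    print(is_binarizable([2, 1, 4, 3]))  # nested inversions
    print(is_binarizable([2, 4, 1, 3]))  # inside-out pattern
```

In this simplified setting, a sentence pair whose gold alignment fails the second check cannot be reproduced exactly by a rank-two grammar, which is the kind of observation an empirical lower bound on alignment error builds on.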

Cite

Søgaard, A., & Kuhn, J. (2009). Empirical lower bounds on alignment error rates in syntax-based machine translation. In Proceedings of SSST 2009: 3rd Workshop on Syntax and Structure in Statistical Translation at the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2009 (pp. 19–27). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626344.1626347