Abstract
The empirical adequacy of synchronous context-free grammars of rank two (2-SCFGs) (Satta and Peserico, 2005), which are used in syntax-based machine translation systems such as Wu (1997), Zhang et al. (2006) and Chiang (2007), has been discussed in Wu (1997) and Wellington et al. (2006) in terms of the alignments they induce, but with a one-sided focus on so-called “inside-out alignments”. This paper identifies other alignment configurations that cannot be induced by 2-SCFGs and examines their frequencies across a wide collection of hand-aligned parallel corpora. Empirical lower bounds on two measures of alignment error rate, namely the one introduced in Och and Ney (2000) and one in which only complete translation units are considered, are derived for 2-SCFGs and related formalisms.
Cite
Søgaard, A., & Kuhn, J. (2009). Empirical lower bounds on alignment error rates in syntax-based machine translation. In Proceedings of SSST 2009: 3rd Workshop on Syntax and Structure in Statistical Translation at the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2009 (pp. 19–27). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626344.1626347