Empirical lower bounds on alignment error rates in syntax-based machine translation

Anders Søgaard; Jonas Kuhn

Conference Proceedings

Proceedings of SSST 2009: 3rd Workshop on Syntax and Structure in Statistical Translation at the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2009 (2009) 19-27

DOI: 10.3115/1626344.1626347

12 Citations

78 Readers

Abstract

Synchronous context-free grammars of rank two (2-SCFGs) (Satta and Peserico, 2005) are used in syntax-based machine translation systems such as Wu (1997), Zhang et al. (2006) and Chiang (2007). Their empirical adequacy, in terms of the alignments they induce, has been discussed in Wu (1997) and Wellington et al. (2006), but with a one-sided focus on so-called “inside-out alignments”. This paper identifies other alignment configurations that cannot be induced by 2-SCFGs and examines their frequencies across a wide collection of hand-aligned parallel corpora. Empirical lower bounds on two measures of alignment error rate, namely the one introduced in Och and Ney (2000) and one where only complete translation units are considered, are derived for 2-SCFGs and related formalisms.
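For readers unfamiliar with the first of the two measures, the following is a minimal sketch of the alignment error rate as it is commonly stated following Och and Ney (2000): AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the sure gold links, and P the possible gold links (with S treated as a subset of P). The function name, the representation of links as (source index, target index) pairs, and the toy data are assumptions made for illustration; none of them is taken from the paper.

```python
def alignment_error_rate(predicted, sure, possible):
    """Alignment error rate in the commonly cited Och and Ney (2000) form.

    All three arguments are iterables of (source_index, target_index) pairs.
    This is an illustrative sketch, not the paper's own implementation.
    """
    a = set(predicted)
    s = set(sure)
    # Sure links are treated as possible links too (S is a subset of P).
    p = set(possible) | s
    if not a and not s:
        # Assumption: with nothing predicted and nothing required, report no error.
        return 0.0
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical toy data: two sure links, one additional possible link.
sure_links = {(0, 0), (1, 1)}
possible_links = {(2, 1)}
predicted_links = {(0, 0), (2, 1), (2, 2)}

aer = alignment_error_rate(predicted_links, sure_links, possible_links)
print(f"AER = {aer:.2f}")
```

In this toy case one predicted link is sure, one is merely possible, and one is wrong, so the score lies strictly between 0 (perfect) and 1. A lower bound of the kind the abstract describes would be the smallest such score that any alignment inducible by the formalism can reach on a given gold standard.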

Cite

Søgaard, A., & Kuhn, J. (2009). Empirical lower bounds on alignment error rates in syntax-based machine translation. In Proceedings of SSST 2009: 3rd Workshop on Syntax and Structure in Statistical Translation at the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2009 (pp. 19–27). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626344.1626347
