Pushdown automata in statistical machine translation

Cyril Allauzen; Bill Byrne; Adrià de Gispert; Gonzalo Iglesias; Michael Riley

Journal Article (Open Access)

Computational Linguistics (2014) 40(3) 687-723

DOI: 10.1162/COLI_a_00197

Abstract

This article describes the use of pushdown automata (PDAs) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite-state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.

Cite

Allauzen, C., Byrne, B., de Gispert, A., Iglesias, G., & Riley, M. (2014). Pushdown automata in statistical machine translation. Computational Linguistics, 40(3), 687–723. https://doi.org/10.1162/COLI_a_00197
