Fill-up versus Interpolation Methods for Phrase-based SMT Adaptation

Citations of this article: 56
Mendeley readers: 64

Abstract

This paper compares techniques for combining diverse parallel corpora when training domain-specific phrase-based SMT systems. We address a common scenario where little in-domain data is available for the task, but where large background models exist for the same language pair. In particular, we focus on phrase table fill-up: a method that effectively exploits background knowledge to improve model coverage, while preserving the more reliable information coming from the in-domain corpus. We present experiments on an emerging transcribed speech translation task, the TED talks. While performing similarly to the popular log-linear and linear interpolation techniques in terms of BLEU and NIST scores, filled-up translation models are more compact and easier to tune by minimum error training.
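
To illustrate the contrast described above, the following is a minimal sketch, not the authors' implementation: phrase tables are modelled as nested dictionaries mapping source phrases to target-phrase probabilities, and the function names, the interpolation weight lam, and the toy English/Italian entries are illustrative assumptions only. Fill-up keeps every in-domain entry untouched and borrows a background entry only when the source phrase is missing, whereas linear interpolation mixes the probabilities of the two tables.

```python
# Sketch of phrase-table fill-up versus linear interpolation.
# Tables: {source_phrase: {target_phrase: probability}}. Toy data only.

def fill_up(in_domain, background):
    """Fill-up: preserve all in-domain entries; add a background entry
    only when its source phrase is absent from the in-domain table."""
    merged = {src: dict(targets) for src, targets in in_domain.items()}
    for src, targets in background.items():
        if src not in merged:  # coverage gap: borrow from background
            merged[src] = dict(targets)
    return merged

def linear_interpolation(in_domain, background, lam=0.7):
    """Linear interpolation: p(t|s) = lam * p_in(t|s) + (1 - lam) * p_bg(t|s),
    computed over the union of both tables (missing entries count as 0)."""
    merged = {}
    for src in set(in_domain) | set(background):
        p_in = in_domain.get(src, {})
        p_bg = background.get(src, {})
        merged[src] = {
            tgt: lam * p_in.get(tgt, 0.0) + (1 - lam) * p_bg.get(tgt, 0.0)
            for tgt in set(p_in) | set(p_bg)
        }
    return merged

if __name__ == "__main__":
    in_domain = {"talk": {"conferenza": 0.6, "discorso": 0.4}}
    background = {"talk": {"colloquio": 0.8, "discorso": 0.2},
                  "slide": {"diapositiva": 1.0}}

    # Fill-up: in-domain translations of "talk" are kept as-is; "slide" is added.
    print(fill_up(in_domain, background))
    # Interpolation: probabilities of the two tables are mixed for every phrase.
    print(linear_interpolation(in_domain, background))
```

Real phrase tables carry several feature scores per phrase pair; the single-probability view above is kept only to make the coverage-versus-mixing difference visible.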

Citation (APA)

Bisazza, A., Ruiz, N., & Federico, M. (2011). Fill-up versus Interpolation Methods for Phrase-based SMT Adaptation. In 2011 International Workshop on Spoken Language Translation, IWSLT 2011 (pp. 136–143). International Speech Communication Association (ISCA).
