Fill-up versus Interpolation Methods for Phrase-based SMT Adaptation

Citations of this article: 56
Mendeley readers: 64

Abstract

This paper compares techniques for combining diverse parallel corpora when training domain-specific phrase-based SMT systems. We address a common scenario where little in-domain data is available for the task, but where large background models exist for the same language pair. In particular, we focus on phrase table fill-up: a method that effectively exploits background knowledge to improve model coverage, while preserving the more reliable information coming from the in-domain corpus. We present experiments on an emerging transcribed speech translation task, the TED talks. While performing similarly to the popular log-linear and linear interpolation techniques in terms of BLEU and NIST scores, filled-up translation models are more compact and easier to tune by minimum error training.
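
To illustrate the contrast described above, the following is a minimal sketch, not the authors' implementation: phrase tables are modelled as nested dictionaries mapping source phrases to target-phrase probabilities, and the function names, the interpolation weight lam, and the toy English/Italian entries are illustrative assumptions only. Fill-up keeps every in-domain entry untouched and borrows a background entry only when the source phrase is missing, whereas linear interpolation mixes the probabilities of the two tables.

```python
# Sketch of phrase-table fill-up versus linear interpolation.
# Tables: {source_phrase: {target_phrase: probability}}. Toy data only.

def fill_up(in_domain, background):
    """Fill-up: preserve all in-domain entries; add a background entry
    only when its source phrase is absent from the in-domain table."""
    merged = {src: dict(targets) for src, targets in in_domain.items()}
    for src, targets in background.items():
        if src not in merged:  # coverage gap: borrow from background
            merged[src] = dict(targets)
    return merged

def linear_interpolation(in_domain, background, lam=0.7):
    """Linear interpolation: p(t|s) = lam * p_in(t|s) + (1 - lam) * p_bg(t|s),
    computed over the union of both tables (missing entries count as 0)."""
    merged = {}
    for src in set(in_domain) | set(background):
        p_in = in_domain.get(src, {})
        p_bg = background.get(src, {})
        merged[src] = {
            tgt: lam * p_in.get(tgt, 0.0) + (1 - lam) * p_bg.get(tgt, 0.0)
            for tgt in set(p_in) | set(p_bg)
        }
    return merged

if __name__ == "__main__":
    in_domain = {"talk": {"conferenza": 0.6, "discorso": 0.4}}
    background = {"talk": {"colloquio": 0.8, "discorso": 0.2},
                  "slide": {"diapositiva": 1.0}}

    # Fill-up: in-domain translations of "talk" are kept as-is; "slide" is added.
    print(fill_up(in_domain, background))
    # Interpolation: probabilities of the two tables are mixed for every phrase.
    print(linear_interpolation(in_domain, background))
```

Real phrase tables carry several feature scores per phrase pair; the single-probability view above is kept only to make the coverage-versus-mixing difference visible.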

Citation (APA)

Bisazza, A., Ruiz, N., & Federico, M. (2011). Fill-up versus Interpolation Methods for Phrase-based SMT Adaptation. In 2011 International Workshop on Spoken Language Translation, IWSLT 2011 (pp. 136–143). International Speech Communication Association (ISCA).
