Using Artificially Generated Data to Evaluate Statistical Machine Translation


Abstract

Although Statistical Machine Translation (SMT) is now the dominant paradigm within Machine Translation, we argue that it is far from clear that it can outperform Rule-Based Machine Translation (RBMT) on small- to medium-vocabulary applications where high precision is more important than recall. A particularly important practical example is medical speech translation. We report the results of experiments where we configured the various grammars and rule-sets in an Open Source medium-vocabulary multilingual medical speech translation system to generate large aligned bilingual corpora for English → French and English → Japanese, which were then used to train SMT models based on the common combination of Giza++, Moses and SRILM. The resulting SMT systems were unable to fully reproduce the performance of the RBMT, with performance topping out at under 70% of SMT translations of previously unseen sentences agreeing with the RBMT translations, even for English → French. When the outputs of the two systems differed, human judges reported the SMT result as frequently being worse than the RBMT result, and hardly ever better; moreover, the added robustness of the SMT only yielded a small improvement in recall, with a large penalty in precision.
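The headline figure (under 70% of SMT outputs agreeing with RBMT translations) corresponds to a simple exact-match sentence agreement rate over aligned system outputs. A minimal sketch of how such a rate could be computed — the function name, normalisation choices, and toy sentences below are illustrative assumptions, not taken from the paper:

```python
def agreement_rate(smt_outputs, rbmt_outputs):
    """Fraction of aligned sentence pairs where the SMT and RBMT outputs
    match exactly, after lowercasing and whitespace normalisation.
    (Hypothetical helper for illustration; not from the paper.)"""
    assert len(smt_outputs) == len(rbmt_outputs)
    norm = lambda s: " ".join(s.lower().split())
    matches = sum(norm(a) == norm(b)
                  for a, b in zip(smt_outputs, rbmt_outputs))
    return matches / len(smt_outputs)

# Toy illustration with invented sentences (not data from the paper):
smt  = ["ou avez-vous mal ?", "prenez deux comprimes", "respirez profondement"]
rbmt = ["ou avez-vous mal ?", "prenez deux comprimes", "inspirez profondement"]
print(agreement_rate(smt, rbmt))  # 2 of 3 pairs agree
```

In practice such a comparison would be run over the full held-out set of previously unseen sentences, with the RBMT output treated as the reference.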

CITATION (APA)
Rayner, M., Estrella, P., Bouillon, P., Hockey, B. A., & Nakao, Y. (2009). Using Artificially Generated Data to Evaluate Statistical Machine Translation. In ACL-IJCNLP 2009 - GEAF 2009: 2009 Workshop on Grammar Engineering Across Frameworks, Proceedings of the Workshop (pp. 54–62). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1690359.1690366
