Stabilizing Minimum Error Rate Training

Abstract

The most commonly used method for training feature weights in statistical machine translation (SMT) systems is Och's minimum error rate training (MERT) procedure. A well-known problem with Och's procedure is that it tends to be sensitive to small changes in the system, particularly when the number of features is large. In this paper, we quantify the stability of Och's procedure by supplying different random seeds to a core component of the procedure (Powell's algorithm). We show that for systems with many features, there is extensive variation in outcomes, both on the development data and on the test data. We analyze the causes of this variation and propose modifications to the MERT procedure that improve stability while helping performance on test data.
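To make the stability measurement concrete, below is a minimal sketch of the seed-variation experiment the abstract describes. The `run_mert` function is a hypothetical stand-in for one full MERT run (seeding Powell's algorithm, optimizing feature weights on the development set, and scoring the result); it simulates noisy outcomes here purely so the aggregation logic is runnable. Only the spread statistics reflect the procedure described above.

```python
import random
import statistics

def run_mert(seed: int) -> tuple[float, float]:
    """Hypothetical stand-in for one full MERT run.

    A real implementation would seed Powell's algorithm with `seed`,
    optimize feature weights on the dev set, and return (dev BLEU,
    test BLEU) for the resulting weight vector. Here we simulate
    noisy outcomes so the code below is self-contained.
    """
    rng = random.Random(seed)
    dev_bleu = 30.0 + rng.gauss(0.0, 0.4)   # simulated dev-set score
    test_bleu = 28.5 + rng.gauss(0.0, 0.5)  # simulated test-set score
    return dev_bleu, test_bleu

# Repeat the procedure under several random seeds and quantify the
# variation in outcomes on both data sets.
seeds = range(10)
dev_scores, test_scores = zip(*(run_mert(s) for s in seeds))

for name, scores in (("dev", dev_scores), ("test", test_scores)):
    print(f"{name}: mean={statistics.mean(scores):.2f}  "
          f"stdev={statistics.stdev(scores):.2f}  "
          f"range={max(scores) - min(scores):.2f}")
```

A large standard deviation or range across seeds, especially on the test set, is the instability the paper sets out to reduce.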

Citation (APA)

Foster, G., & Kuhn, R. (2009). Stabilizing minimum error rate training. In EACL 2009 - 4th Workshop on Statistical Machine Translation, Proceedings of the Workshop (pp. 242–249). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626431.1626478
