Towards Optimizing MT for Post-Editing Effort: Can BLEU Still Be Useful?

Mikel L. Forcada; Felipe Sánchez-Martínez; Miquel Esplà-Gomis; Lucia Specia

Journal Article (Open Access)

The Prague Bulletin of Mathematical Linguistics (2017) 108(1) 183-195

DOI: 10.1515/pralin-2017-0019

Abstract

We propose a simple automatic evaluation measure (AEM), built as a linear combination of features, to approximate post-editing (PE) effort. Effort is measured both as PE time and as the number of PE operations performed. The ultimate goal is to define an AEM that can be used to optimize machine translation (MT) systems to minimize PE effort without requiring repeated PE during optimization, which would be infeasible. As PE effort is expected to be an extensive magnitude (i.e., one that grows linearly with sentence length and can simply be added up to represent the effort for a set of sentences), we use a linear combination of extensive and pseudo-extensive features. One such pseudo-extensive feature, (1 − BLEU) multiplied by the length of the reference, proves to be almost as good a predictor of PE effort as the best combination of extensive features. Surprisingly, effort predictors computed using independently obtained reference translations perform reasonably close to those using actual post-edited references. At this early stage of the research, and given the inherent complexity of carrying out experiments with professional post-editors, we evaluated the proposed AEMs automatically rather than manually measuring the effort needed to post-edit the output of an MT system tuned on them. The results seem to support the current practice of tuning with BLEU, while also pointing to some of its limitations. In addition to this intrinsic evaluation, we carried out an extrinsic evaluation in which the proposed AEMs were used to build synthetic training corpora for MT quality estimation, with results comparable to those obtained when training with measured PE effort.
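
The pseudo-extensive BLEU feature and the linear-combination predictor described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the absence of an intercept term, the least-squares fit through the origin, and the toy numbers are all assumptions made for illustration, and the BLEU scores are taken as given inputs in [0, 1] rather than computed.

```python
from typing import Sequence


def bleu_effort_feature(bleu: float, ref_len: int) -> float:
    """Pseudo-extensive feature from the abstract: (1 - BLEU) * reference length."""
    if not 0.0 <= bleu <= 1.0:
        raise ValueError("bleu must be in [0, 1]")
    return (1.0 - bleu) * ref_len


def predict_effort(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear combination of feature values (no intercept: an assumption)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


def fit_single_weight(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares weight for y ~ w * x through the origin (illustrative choice)."""
    denom = sum(x * x for x in xs)
    if denom == 0.0:
        raise ValueError("all feature values are zero; weight is undefined")
    return sum(x * y for x, y in zip(xs, ys)) / denom


# Toy, invented data: (sentence-level BLEU, reference length, measured PE time in s).
sentences = [(0.50, 20, 30.0), (0.25, 8, 18.0), (0.80, 10, 6.0)]

xs = [bleu_effort_feature(bleu, ref_len) for bleu, ref_len, _ in sentences]
ys = [pe_time for _, _, pe_time in sentences]

w = fit_single_weight(xs, ys)

# Because the predictor is linear with no intercept, per-sentence predictions
# add up to the prediction computed from the summed feature (extensivity).
per_sentence = [predict_effort([x], [w]) for x in xs]
corpus_effort = sum(per_sentence)
```

With no intercept, summing per-sentence predictions gives the same value as applying the weight to the summed feature, which is the additivity the abstract attributes to an extensive magnitude.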

Cite

Forcada, M. L., Sánchez-Martínez, F., Esplà-Gomis, M., & Specia, L. (2017). Towards Optimizing MT for Post-Editing Effort: Can BLEU Still Be Useful? The Prague Bulletin of Mathematical Linguistics, 108(1), 183–195. https://doi.org/10.1515/pralin-2017-0019
