CHRF deconstructed: β parameters and n-gram weights

Abstract

Character n-gram F-score (CHRF) has been shown to correlate very well with human rankings of different machine translation outputs, especially for morphologically rich target languages. However, only two versions have been explored so far, namely CHRF1 (standard F-score, β = 1) and CHRF3 (β = 3), both with uniform n-gram weights. In this work, we investigated CHRF in more detail, namely β parameters in the range from 1/6 to 6, and found that CHRF2 is the most promising version. We then investigated different n-gram weights for CHRF2 and found that uniform weights are the best option. In addition, CHRF scores were systematically compared with WORDF scores, and a preliminary experiment carried out on a small amount of data with direct human scores indicates that the main advantage of CHRF is that it does not penalise acceptable variations in high-quality translations too harshly.
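
To make the roles of β and the n-gram weights concrete, the following is a minimal sketch of a character n-gram F-score, not the author's implementation. It assumes the standard F-score form (1 + β²)·P·R / (β²·P + R), with precision P and recall R computed as weighted averages over n-gram orders. The function names, the maximum order of 6, and the choice to drop spaces before extracting n-grams are assumptions made for illustration only.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n (spaces dropped: an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_sketch(hypothesis, reference, beta=2.0, max_n=6, weights=None):
    """Illustrative character n-gram F-score.

    beta > 1 gives recall more influence than precision, beta < 1 the reverse.
    weights holds one weight per n-gram order; uniform if not given.
    """
    if weights is None:
        weights = [1.0 / max_n] * max_n

    precision = 0.0
    recall = 0.0
    total_weight = 0.0
    for n, w in zip(range(1, max_n + 1), weights):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip orders longer than either string.
            continue
        # Clipped overlap: each n-gram counts at most as often as in both.
        overlap = sum((hyp & ref).values())
        precision += w * overlap / sum(hyp.values())
        recall += w * overlap / sum(ref.values())
        total_weight += w

    if total_weight == 0:
        return 0.0
    precision /= total_weight
    recall /= total_weight

    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator
```

In this sketch, a hypothesis that is shorter than the reference has high precision but low recall, so raising β lowers its score. Passing a non-uniform `weights` list changes how much each n-gram order contributes, which is the second dimension the abstract describes.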

Cite

Popović, M. (2016). CHRF deconstructed: β parameters and n-gram weights. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 499–504). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-2341
