Validation of an automatic metric for the Accuracy of Pronoun Translation (APT)

Abstract

In this paper, we define and assess a reference-based metric for evaluating the accuracy of pronoun translation (APT). The metric automatically aligns a candidate translation with a reference translation using GIZA++ augmented with specific heuristics, then counts the pronouns that are identical or different, with provision for legitimate variations and omitted pronouns. All counts are then combined into a single score. The metric is applied to the output of seven systems (including the baseline) that participated in the DiscoMT 2015 shared task on pronoun translation from English to French. Depending on its parameters, APT reaches a Pearson correlation of around 0.993-0.999 with human judges, while other automatic metrics, such as BLEU, METEOR, or the pronoun-specific metrics used at DiscoMT 2015, reach only 0.972-0.986.
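The counting-and-combining step described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the exact outcome categories nor their weights, so the labels, the weights in `WEIGHTS`, the equivalence set, and the function names below are all hypothetical. The sketch also assumes the pronoun pairs have already been aligned (the paper uses GIZA++ plus heuristics for that step, which is not reproduced here). A small `pearson` helper shows how system-level scores might be compared with human judgments.

```python
from collections import Counter
from math import sqrt

# Hypothetical outcome labels and weights (assumptions, not the paper's values).
WEIGHTS = {
    "identical": 1.0,     # same pronoun in candidate and reference
    "equivalent": 1.0,    # different pronouns accepted as a legitimate variation
    "different": 0.0,     # different pronouns, not accepted
    "omitted_both": 1.0,  # neither side has a pronoun
    "omitted_one": 0.0,   # only one side has a pronoun
}

def classify(cand, ref, equivalents):
    """Label one aligned (candidate, reference) pair; None means no pronoun."""
    if cand is None and ref is None:
        return "omitted_both"
    if cand is None or ref is None:
        return "omitted_one"
    cand, ref = cand.lower(), ref.lower()
    if cand == ref:
        return "identical"
    if frozenset((cand, ref)) in equivalents:
        return "equivalent"
    return "different"

def apt_like_score(pairs, equivalents=frozenset(), weights=WEIGHTS):
    """Combine per-category counts into one score as a weighted average."""
    counts = Counter(classify(c, r, equivalents) for c, r in pairs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(weights[label] * n for label, n in counts.items()) / total

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data only: four already-aligned pronoun pairs.
pairs = [("il", "il"), ("elle", "il"), (None, "ils"), ("ce", "il")]
equivalents = {frozenset(("ce", "il"))}  # hypothetical accepted variation
score = apt_like_score(pairs, equivalents)
```

Changing the entries of `WEIGHTS` is one way to picture the "parameters of APT" on which the reported correlation range depends, although the actual parameterization is defined in the paper itself.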

Cite

Werlen, L. M., & Popescu-Belis, A. (2017). Validation of an automatic metric for the Accuracy of Pronoun Translation (APT). In DiscoMT 2017 - Discourse in Machine Translation, Proceedings of the Workshop (pp. 17–25). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-4802
