Evaluating pronominal anaphora in machine translation: An evaluation measure and a test suite

Abstract

The neural revolution in machine translation has made it easier to model larger contexts beyond the sentence level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures such as BLEU, since only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different types of pronoun errors drawn from real system translations into English. We further propose an evaluation measure that differentiates good and bad pronoun translations. Finally, we conduct a user study and report correlation with human judgments.
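
As a rough illustration of why BLEU is insensitive to pronoun errors, the sketch below (not code from the paper; the example sentences and the use of NLTK's sentence-level BLEU are illustrative assumptions) shows that a hypothesis with a wrong anaphoric pronoun can receive exactly the same score as one with a harmless one-word lexical substitution:

    # Illustrative sketch (not from the paper): BLEU treats a meaning-breaking
    # pronoun error like any other single-token substitution.
    from nltk.translate.bleu_score import sentence_bleu

    reference = "The actress arrived late because she missed her train .".split()
    # Two hypothetical system outputs, each differing from the reference by one token:
    pronoun_error = "The actress arrived late because it missed her train .".split()
    synonym_swap  = "The actress arrived tardily because she missed her train .".split()

    print(sentence_bleu([reference], pronoun_error))  # same score ...
    print(sentence_bleu([reference], synonym_swap))   # ... as this harmless variation

Both hypotheses differ from the reference by a single interior token, so their n-gram overlap with the reference is identical; BLEU cannot separate the meaning-breaking pronoun error from the benign synonym swap, which is why a targeted pronoun evaluation measure is needed.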

Cite

APA

Jwalapuram, P., Joty, S., Temnikova, I., & Nakov, P. (2019). Evaluating pronominal anaphora in machine translation: An evaluation measure and a test suite. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 2964–2975). Association for Computational Linguistics. https://doi.org/10.18653/v1/D19-1294
