Evaluation of cross-lingual encoders is usually performed either via zero-shot cross-lingual transfer in supervised downstream tasks or via unsupervised cross-lingual textual similarity. In this paper, we concern ourselves with reference-free machine translation (MT) evaluation, where we directly compare source texts to (sometimes low-quality) system translations, which represents a natural adversarial setup for multilingual encoders. Reference-free evaluation holds the promise of web-scale comparison of MT systems. We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER. We find that they perform poorly as semantic encoders for reference-free MT evaluation and identify their two key limitations, namely, (a) a semantic mismatch between representations of mutual translations and, more prominently, (b) the inability to punish "translationese", i.e., low-quality literal translations. We propose two partial remedies: (1) post-hoc re-alignment of the vector spaces and (2) coupling of semantic-similarity-based metrics with target-side language modeling. In segment-level MT evaluation, our best metric surpasses reference-based BLEU by 5.7 correlation points. We make our MT evaluation code available.
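As a concrete illustration of the second remedy, the following is a minimal sketch of a reference-free score that couples cross-lingual semantic similarity with a target-side language model, so that fluent but semantically divergent or disfluent "translationese" outputs are penalized. The model choices (LaBSE as a stand-in for M-BERT/LASER, GPT-2 as the target-side LM), the helper names, and the mixing weight lam are illustrative assumptions, not the paper's exact metrics.

# Sketch of a reference-free MT score: cross-lingual cosine similarity
# between source and translation, combined with target-side LM fluency.
# Model choices and the weight `lam` are illustrative, not the paper's setup.
import torch
from sentence_transformers import SentenceTransformer, util
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

encoder = SentenceTransformer("sentence-transformers/LaBSE")  # multilingual sentence encoder
lm_tok = GPT2TokenizerFast.from_pretrained("gpt2")            # target-side (English) LM
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def lm_log_prob(text: str) -> float:
    """Average per-token log-probability of `text` under the target-side LM."""
    ids = lm_tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean negative log-likelihood
    return -loss.item()

def reference_free_score(source: str, translation: str, lam: float = 0.5) -> float:
    """Mix cross-lingual semantic similarity with target-side LM fluency."""
    src_emb, hyp_emb = encoder.encode([source, translation], convert_to_tensor=True)
    similarity = util.cos_sim(src_emb, hyp_emb).item()
    return lam * similarity + (1.0 - lam) * lm_log_prob(translation)

print(reference_free_score("Der Hund bellt laut.", "The dog is barking loudly."))

In the paper's terms, the similarity term plays the role of the cross-lingual semantic metric and the LM term supplies the target-side fluency signal; how the two are weighted and calibrated is left open here.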
Citation:
Zhao, W., Glavaš, G., Peyrard, M., Gao, Y., West, R., & Eger, S. (2020). On the limitations of cross-lingual encoders as exposed by reference-free machine translation evaluation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 1656–1671). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.151