The MUCOW test suite at WMT 2019: Automatically harvested multilingual contrastive word sense disambiguation test sets for machine translation

Alessandro Raganato; Yves Scherrer; Jörg Tiedemann

Conference Proceedings, Open Access

WMT 2019 - 4th Conference on Machine Translation, Proceedings of the Conference (2019), Vol. 2, pp. 470-480

DOI: 10.18653/v1/w19-5354

38 Citations

73 Readers

Abstract

Supervised Neural Machine Translation (NMT) systems currently achieve impressive translation quality for many language pairs. A key requirement for a correct translation is word sense disambiguation (WSD), i.e., translating an ambiguous word with its correct sense. Existing benchmarks for evaluating the WSD capabilities of translation systems rely heavily on manual work and cover only a few language pairs and a few word types. We present MUCOW, a multilingual contrastive test suite that covers 16 language pairs with more than 200 000 contrastive sentence pairs, automatically built from word-aligned parallel corpora and the wide-coverage multilingual sense inventory of BabelNet. We evaluate the quality of the ambiguity lexicons and of the resulting test suite on all submissions from 9 language pairs presented in the WMT19 news shared translation task, as well as on 5 additional language pairs using pretrained NMT models. The MUCOW test suite is available at http://github.com/Helsinki-NLP/MuCoW.

Cite

Raganato, A., Scherrer, Y., & Tiedemann, J. (2019). The MUCOW test suite at WMT 2019: Automatically harvested multilingual contrastive word sense disambiguation test sets for machine translation. In WMT 2019 - 4th Conference on Machine Translation, Proceedings of the Conference (Vol. 2, pp. 470–480). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-5354
