Detecting word sense disambiguation biases in machine translation for model-agnostic adversarial attacks

Denis Emelin; Ivan Titov; Rico Sennrich

Conference Proceedings, Open Access

EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (2020) 7635-7653

DOI: 10.18653/v1/2020.emnlp-main.616

Abstract

Word sense disambiguation is a well-known source of translation errors in NMT. We posit that some of the incorrect disambiguation choices are due to models' over-reliance on dataset artifacts found in training data, specifically superficial word co-occurrences, rather than a deeper understanding of the source text. We introduce a method for the prediction of disambiguation errors based on statistical data properties, demonstrating its effectiveness across several domains and model types. Moreover, we develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors to further probe the robustness of translation models. Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
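The co-occurrence bias described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the scoring rule (raw co-occurrence counts summed per sense), the function names, the sense labels, and the tiny training set are all assumptions made up for illustration. The sketch only shows how superficial co-occurrence statistics can point to a sense that conflicts with the one a sentence actually intends, which is the kind of case an attack would try to construct.

```python
from collections import Counter, defaultdict

def build_cooccurrence(tagged_sentences):
    """Count, per sense, how many training sentences contain each context word."""
    counts = defaultdict(Counter)
    for tokens, sense in tagged_sentences:
        for tok in set(tokens):
            counts[sense][tok] += 1
    return counts

def sense_scores(tokens, counts):
    """Sum the per-sense co-occurrence counts of the words in a sentence."""
    unique = set(tokens)
    return {sense: sum(c[t] for t in unique) for sense, c in counts.items()}

def biased_sense(tokens, counts):
    """Return the sense that the co-occurrence statistics favour."""
    scores = sense_scores(tokens, counts)
    return max(scores, key=scores.get)

# Hypothetical training data: context words paired with the sense of "bank".
train = [
    (["river", "water", "shore"], "bank/riverside"),
    (["money", "loan", "account"], "bank/institution"),
    (["money", "deposit", "cash"], "bank/institution"),
]
counts = build_cooccurrence(train)

# The intended sense here is the riverside one, but "money" co-occurs
# only with the institution sense in the toy training data.
sentence = ["he", "buried", "the", "money", "by", "the", "bank"]
intended = "bank/riverside"
predicted_bias = biased_sense(sentence, counts)
at_risk = predicted_bias != intended
```

In this toy setting, a sentence is flagged as at risk when the sense favoured by the statistics differs from the intended one. A real system would need sense-annotated parallel data and a normalised association measure rather than raw counts.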

Citation (APA)

Emelin, D., Titov, I., & Sennrich, R. (2020). Detecting word sense disambiguation biases in machine translation for model-agnostic adversarial attacks. In EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 7635–7653). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.emnlp-main.616
