Abstract
Much of the work testing machine translation systems for robustness and sensitivity has been adversarial, or has focused on noisy input such as spelling errors and non-standard input such as dialects. In this work, we take a step back to investigate a sensitivity problem that can seem trivial and is often overlooked: punctuation. We perform basic sentence-final insertion and deletion perturbation tests with full stops, exclamation marks, and question marks across source languages and demonstrate a concerning finding: commercial, production-level machine translation systems are vulnerable to the mere insertion or deletion of a single punctuation mark, resulting in unreliable translations. Moreover, we demonstrate that both string-based and model-based evaluation metrics suffer from this vulnerability, producing significantly different scores when translations differ only by a single punctuation mark, with model-based metrics penalizing each punctuation mark differently. Our work calls into question the reliability of machine translation systems and their evaluation metrics, particularly for real-world use cases, where inconsistent punctuation is often the most common and the least disruptive form of noise.
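The perturbation tests described above are straightforward to reproduce. Below is a minimal sketch in Python, assuming sacrebleu is installed for the string-based metric; the function names and example sentences are illustrative and not taken from the paper.

```python
from sacrebleu.metrics import CHRF  # string-based MT metric (pip install sacrebleu)

# Sentence-final punctuation marks considered in the perturbation tests.
FINAL_PUNCT = (".", "!", "?")

def delete_final_punct(sentence: str) -> str:
    """Delete a single sentence-final punctuation mark, if one is present."""
    stripped = sentence.rstrip()
    return stripped[:-1] if stripped.endswith(FINAL_PUNCT) else stripped

def insert_final_punct(sentence: str, mark: str = ".") -> str:
    """Append a single punctuation mark if the sentence ends without one."""
    stripped = sentence.rstrip()
    return stripped if stripped.endswith(FINAL_PUNCT) else stripped + mark

# Source-side perturbations: a robust MT system should translate all
# variants near-identically.
source = "the cat sat on the mat"
variants = [source] + [insert_final_punct(source, m) for m in FINAL_PUNCT]
print(variants)

# Metric-side sensitivity: scoring two hypotheses that differ only by a
# final full stop against the same reference yields different scores.
chrf = CHRF()
reference = ["The cat sat on the mat."]
print(chrf.sentence_score("The cat sat on the mat.", reference).score)
print(chrf.sentence_score("The cat sat on the mat", reference).score)
```

The same probe extends to model-based metrics: score each punctuation-perturbed hypothesis against a fixed reference and compare the resulting scores.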
Citation
Jwalapuram, P. (2023). Pulling Out All The Full Stops: Punctuation Sensitivity in Neural Machine Translation and Evaluation. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 6116–6130). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.381