Abstract
Driven by the rapid development of deep neural networks, machine translation software has been widely adopted in daily life in recent years, for example to communicate with foreigners or to follow political news from neighbouring countries. However, machine translation software can return incorrect translations because of the complexity of the underlying network. To address this problem, we introduce PaInv, a novel methodology for validating machine translation software. Our key insight is that sentences with different meanings should not have the same translation; when they do, we call it a pathological invariance. Specifically, PaInv generates syntactically similar but semantically different sentences by replacing one word in a sentence, and filters out unsuitable sentences based on both syntactic and semantic information. We applied PaInv to Google Translate using 200 English sentences as input under three language settings: English→Hindi, English→Chinese, and English→German. PaInv accurately found 331 pathological invariants in total, revealing more than 100 translation errors.
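The check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `replacements` dictionary, the `is_suitable` filter hook, and the toy translator are all hypothetical stand-ins. In particular, the abstract does not say how replacement words are chosen or how the syntactic and semantic filter works, so both are left as caller-supplied inputs here.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def single_word_variants(sentence: str, replacements: Dict[str, Iterable[str]]) -> List[str]:
    """Return sentences that differ from `sentence` in exactly one word."""
    words = sentence.split()
    variants = []
    for i, word in enumerate(words):
        for candidate in replacements.get(word, ()):
            if candidate != word:
                variants.append(" ".join(words[:i] + [candidate] + words[i + 1:]))
    return variants


def find_pathological_invariants(
    sentence: str,
    replacements: Dict[str, Iterable[str]],
    translate: Callable[[str], str],
    is_suitable: Callable[[str, str], bool] = lambda original, variant: True,
) -> List[Tuple[str, str, str]]:
    """Report (original, variant, translation) triples whose translations are identical."""
    reference = translate(sentence)
    reports = []
    for variant in single_word_variants(sentence, replacements):
        # Hypothetical hook standing in for the syntactic and semantic filter.
        if not is_suitable(sentence, variant):
            continue
        # Different meaning, same translation: a suspected translation error.
        if translate(variant) == reference:
            reports.append((sentence, variant, reference))
    return reports


def toy_translate(text: str) -> str:
    """Toy stand-in translator that wrongly maps both 'cat' and 'dog' to 'Tier'."""
    table = {"cat": "Tier", "dog": "Tier", "bird": "Vogel"}
    return " ".join(table.get(w, w) for w in text.split())


reports = find_pathological_invariants(
    "the cat sleeps",
    {"cat": ["dog", "bird"]},
    toy_translate,
)
for original, variant, translation in reports:
    print(f"{original!r} and {variant!r} both translate to {translation!r}")
```

In this toy run only the "dog" variant is reported, because the stub translator collapses "cat" and "dog" into the same output while "bird" gets a distinct translation. A real setup would replace `toy_translate` with a call to the translation system under test and supply a meaningful `is_suitable` filter.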
Citation
Gupta, S. (2020). Machine Translation Testing via Pathological Invariance. In Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering: Companion, ICSE-Companion 2020 (pp. 107–109). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3377812.3382162