Machine Translation Testing via Pathological Invariance

Shashij Gupta

Conference Proceedings (Open Access)

Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering: Companion, ICSE-Companion 2020 (2020) 107-109

DOI: 10.1145/3377812.3382162

Abstract

Due to the rapid development of deep neural networks in recent years, machine translation software has been widely adopted in people's daily lives, for tasks such as communicating with foreigners or understanding political news from neighbouring countries. However, machine translation software can return incorrect translations because of the complexity of the underlying network. To address this problem, we introduce a novel methodology called PaInv for validating machine translation software. Our key insight is that sentences of different meanings should not have the same translation; when they do, we call it a pathological invariance. Specifically, PaInv generates syntactically similar but semantically different sentences by replacing one word in the sentence, and filters out unsuitable sentences based on both syntactic and semantic information. We have applied PaInv to Google Translate using 200 English sentences as input with three language settings: English→Hindi, English→Chinese, and English→German. PaInv can accurately find 331 pathological invariants in total, revealing more than 100 translation errors.
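The check described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `keep` filter hook, the replacement table, and the toy translator below are all hypothetical stand-ins. A real setup would call an actual translation service and use a proper syntactic and semantic filter in place of the placeholders.

```python
from typing import Callable, Dict, Iterable, List


def generate_variants(sentence: str, replacements: Dict[str, Iterable[str]]) -> List[str]:
    """Return sentences that differ from `sentence` in exactly one word."""
    words = sentence.split()
    variants = []
    for i, word in enumerate(words):
        for new_word in replacements.get(word, ()):
            if new_word != word:
                variants.append(" ".join(words[:i] + [new_word] + words[i + 1:]))
    return variants


def find_pathological_invariants(
    sentence: str,
    replacements: Dict[str, Iterable[str]],
    translate: Callable[[str], str],
    keep: Callable[[str, str], bool] = lambda original, variant: True,
) -> List[str]:
    """Report variants whose translation equals that of the original sentence.

    `keep` is a placeholder for the filtering step (assumption: it should
    accept only variants that are well-formed and differ in meaning).
    """
    source_translation = translate(sentence)
    return [
        variant
        for variant in generate_variants(sentence, replacements)
        if keep(sentence, variant) and translate(variant) == source_translation
    ]


def toy_translate(text: str) -> str:
    """Toy translator (assumption): deliberately collapses 'cat' and 'kitten'."""
    table = {"cat": "T_FELINE", "kitten": "T_FELINE", "dog": "T_CANINE"}
    return " ".join(table.get(word, word) for word in text.split())


# Hypothetical candidate replacements for one word of the input sentence.
REPLACEMENTS = {"cat": ["dog", "kitten"]}

suspicious = find_pathological_invariants("the cat sleeps", REPLACEMENTS, toy_translate)
print(suspicious)
```

With the toy translator, the variant "the kitten sleeps" receives the same output as the original sentence and is flagged, while "the dog sleeps" is not. Each flagged variant is only a candidate error and still needs inspection, since the filter here accepts everything.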

Cite

Gupta, S. (2020). Machine Translation Testing via Pathological Invariance. In Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering: Companion, ICSE-Companion 2020 (pp. 107–109). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3377812.3382162
