Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer

38 citations · 36 Mendeley readers

Abstract

In Neural Machine Translation (NMT), each token prediction is conditioned on both the source sentence and the target prefix (the tokens translated so far at a given decoding step). However, previous work on interpretability in NMT has focused mainly on the attributions of source sentence tokens, so we lack a full understanding of how every input token (source sentence and target prefix) influences the model's predictions. In this work, we propose an interpretability method that tracks input token attributions for both contexts. Our method, which can be extended to any encoder-decoder Transformer-based model, allows us to better understand the inner workings of current NMT models. We apply the proposed method to both bilingual and multilingual Transformers and present insights into their behaviour.
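The paper's own attribution method is not reproduced in this abstract, but the core idea it describes, scoring every input token from *both* contexts (source sentence and target prefix) for a single next-token prediction, can be illustrated with a generic gradient-times-input attribution on a toy, randomly initialized encoder-decoder Transformer. This is a minimal sketch under assumed names and a made-up toy setup, not the authors' implementation:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy vocabularies and a tiny encoder-decoder Transformer (illustrative only).
src_vocab, tgt_vocab, d = 10, 10, 16
src_emb = nn.Embedding(src_vocab, d)
tgt_emb = nn.Embedding(tgt_vocab, d)
model = nn.Transformer(d_model=d, nhead=2, num_encoder_layers=1,
                       num_decoder_layers=1, dim_feedforward=32,
                       batch_first=True)
out_proj = nn.Linear(d, tgt_vocab)
model.eval()

src = torch.tensor([[1, 4, 7]])    # source sentence tokens
prefix = torch.tensor([[2, 5]])    # target prefix (previously translated tokens)

# Embed both input contexts and retain gradients on the embeddings.
se = src_emb(src)
se.retain_grad()
te = tgt_emb(prefix)
te.retain_grad()

# One decoding step: predict the next target token from source + prefix.
causal_mask = nn.Transformer.generate_square_subsequent_mask(prefix.size(1))
hidden = model(se, te, tgt_mask=causal_mask)
logits = out_proj(hidden[:, -1])   # next-token logits at the last prefix position

# Backpropagate from the top predicted logit to every input embedding.
logits[0, logits.argmax()].backward()

# Gradient x input, summed per token: one attribution score per source token
# and per target-prefix token, so both contexts are covered.
src_attr = (se.grad * se).abs().sum(-1).squeeze(0)
tgt_attr = (te.grad * te).abs().sum(-1).squeeze(0)
print("source attributions:", src_attr.tolist())
print("target-prefix attributions:", tgt_attr.tolist())
```

Gradient-times-input is only one of many attribution techniques; the point of the sketch is that the backward pass naturally yields a score for every token in both the encoder input and the decoder prefix, which is the granularity of analysis the abstract argues has been missing.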

Citation (APA)

Ferrando, J., Gállego, G. I., Alastruey, B., Escolano, C., & Costa-Jussà, M. R. (2022). Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 8756–8769). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.599

Readers' Seniority

- PhD / Postgrad / Masters / Doc: 9 (64%)
- Researcher: 4 (29%)
- Lecturer / Post doc: 1 (7%)

Readers' Discipline

- Computer Science: 12 (71%)
- Linguistics: 3 (18%)
- Neuroscience: 1 (6%)
- Engineering: 1 (6%)
