ATTEXPLAINER: Explain Transformer via Attention by Reinforcement Learning

4 citations · 12 Mendeley readers

Abstract

Transformer and its variants, built on attention mechanisms, have recently achieved remarkable performance on many NLP tasks. Most existing work on Transformer explanation reveals and uses the attention matrix qualitatively, guided by human intuition. However, the high dimensionality of the attention matrix makes it hard for these methods to analyze it quantitatively. In this paper, we therefore propose ATTEXPLAINER, a novel reinforcement learning (RL) based framework that explains Transformers via the attention matrix. The RL agent learns to perform step-by-step masking operations by observing changes in the attention matrices. We adapt the method to two scenarios: perturbation-based model explanation and text adversarial attack. Experiments on three widely used text classification benchmarks validate the effectiveness of the proposed method against state-of-the-art baselines. Additional studies show that our method is highly transferable and consistent with human intuition. The code for this paper is available at https://github.com/niuzaisheng/AttExplainer.
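To illustrate the kind of step-by-step masking the abstract describes, here is a minimal toy sketch, not the authors' implementation. It replaces the learned RL policy with a greedy stand-in: at each step, mask the token whose removal changes a (toy) attention matrix the most, using that change as a reward proxy. The `attention` surrogate, the score matrix `W`, and the greedy `step` function are all hypothetical simplifications for illustration.

```python
import math
import random

def softmax(row):
    m = max(row)
    e = [math.exp(x - m) for x in row]
    s = sum(e)
    return [x / s for x in e]

def attention(mask, W):
    # Toy stand-in for one Transformer attention head:
    # masked key positions get a very negative score,
    # i.e. effectively zero attention weight.
    n = len(W)
    return [softmax([W[q][k] if mask[k] else -1e9 for k in range(n)])
            for q in range(n)]

def step(mask, W):
    """One masking 'action': greedily remove the unmasked token whose
    masking changes the attention matrix the most (a reward proxy
    standing in for the learned RL policy)."""
    base = attention(mask, W)
    best, best_delta = None, -1.0
    for i, keep in enumerate(mask):
        if not keep:
            continue
        trial = list(mask)
        trial[i] = 0
        delta = sum(abs(a - b)
                    for ra, rb in zip(attention(trial, W), base)
                    for a, b in zip(ra, rb))
        if delta > best_delta:
            best, best_delta = i, delta
    new_mask = list(mask)
    new_mask[best] = 0
    return new_mask, best

random.seed(0)
n = 6
W = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
mask = [1] * n          # 1 = token visible, 0 = masked
order = []
for _ in range(3):      # three step-by-step masking operations
    mask, picked = step(mask, W)
    order.append(picked)
print("masking order:", order)
```

In the paper's actual framework, the greedy choice above is replaced by an agent trained with RL, and the attention matrices come from the Transformer being explained; the masking order then serves either as a feature-importance explanation or as an adversarial perturbation schedule.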

Cite (APA)

Niu, R., Wei, Z., Wang, Y., & Wang, Q. (2022). ATTEXPLAINER: Explain Transformer via Attention by Reinforcement Learning. In IJCAI International Joint Conference on Artificial Intelligence (pp. 724–731). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2022/102
