Explainable AI (XAI) methods have been proposed to interpret how a deep neural network arrives at its predictions through saliency explanations that highlight the input regions deemed most important for a specific target decision. However, it remains challenging to quantify the correctness of these interpretations: current evaluation approaches either require subjective input from humans or incur high computation costs with automated evaluation. In this paper, we propose using backdoor trigger patterns, hidden malicious functionalities that cause misclassification, to automate the evaluation of saliency explanations. Our key observation is that triggers provide ground truth for inputs with which to evaluate whether the regions identified by an XAI method are truly relevant to the model's output. Since backdoor triggers are the most important features that cause deliberate misclassification, a robust XAI method should reveal their presence at inference time. We introduce three complementary metrics for the systematic evaluation of the explanations that an XAI method generates. We evaluate seven state-of-the-art model-free and model-specific post-hoc methods on 36 models trojaned with specifically crafted triggers that vary in color, shape, texture, location, and size. We find that the six methods based on local explanations and feature relevance fail to completely highlight trigger regions, and that only a model-free approach can uncover the entire trigger region. Our code is available at https://github.com/yslin013/evalxai.
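To make the core idea concrete, the following is a minimal sketch, not the paper's actual metrics, of how a known trigger region can serve as ground truth for scoring a saliency explanation: it measures what fraction of the trigger pixels fall inside the most salient region reported by an XAI method. The function name `trigger_recall`, the top-k threshold, and the toy data are illustrative assumptions.

```python
# Hypothetical sketch: compare a saliency map against the known trigger mask
# of a trojaned input. All names and thresholds here are assumptions for
# illustration, not the metrics defined in the paper.
import numpy as np

def trigger_recall(saliency: np.ndarray, trigger_mask: np.ndarray, top_k: float = 0.1) -> float:
    """Fraction of trigger pixels covered by the top-k% most salient pixels."""
    assert saliency.shape == trigger_mask.shape
    flat = saliency.ravel()
    k = max(1, int(top_k * flat.size))
    threshold = np.partition(flat, -k)[-k]      # saliency value of the k-th largest score
    salient_region = saliency >= threshold      # binary mask of the top-k salient pixels
    trigger = trigger_mask.astype(bool)
    return float(np.logical_and(salient_region, trigger).sum() / max(1, trigger.sum()))

# Toy usage: a 32x32 input with a 4x4 trigger patch in the bottom-right corner.
rng = np.random.default_rng(0)
saliency = rng.random((32, 32))
trigger_mask = np.zeros((32, 32), dtype=bool)
trigger_mask[-4:, -4:] = True
saliency[-4:, -4:] += 1.0                       # pretend the XAI method highlights the trigger
print(f"trigger recall: {trigger_recall(saliency, trigger_mask):.2f}")
```

Under this setup, a robust XAI method applied to a trojaned input should yield a recall close to 1, since the trigger is the feature that actually causes the (mis)classification.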
Lin, Y. S., Lee, W. C., & Celik, Z. B. (2021). What Do You See?: Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1027–1035). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467213