Post-hoc explanation methods are an important class of approaches for understanding the rationale underlying a trained model's decisions. But how useful are they to an end user in accomplishing a given task? In this vision paper, we argue for a benchmark to facilitate evaluation of the utility of post-hoc explanation methods. As a first step toward this goal, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. We further highlight that such a benchmark facilitates assessing not only the effectiveness of explanations but also their efficiency.
Citation:
Idahl, M., Lyu, L., Gadiraju, U., & Anand, A. (2021). Towards Benchmarking the Utility of Explanations for Model Debugging. In TrustNLP 2021 - 1st Workshop on Trustworthy Natural Language Processing, Proceedings of the Workshop (pp. 68–73). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.trustnlp-1.8