Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations


Abstract

Prototypical parts-based networks are becoming increasingly popular due to their faithful self-explanations. However, their similarity maps are calculated in the penultimate network layer. Therefore, the receptive field of the prototype activation region often extends beyond the highlighted area, so the activation can depend on parts of the image outside this region, which can lead to misleading interpretations. We name this undesired behavior spatial explanation misalignment and introduce an interpretability benchmark with a set of dedicated metrics for quantifying this phenomenon. In addition, we propose a method for misalignment compensation and apply it to existing state-of-the-art models. We show the expressiveness of our benchmark and the effectiveness of the proposed compensation methodology through extensive empirical studies.
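
To make the described misalignment concrete, the following minimal PyTorch sketch (an illustration, not the authors' implementation) computes a ProtoPNet-style prototype similarity map on the penultimate feature map of a toy backbone and contrasts the naive pixel footprint of the most activated cell with its theoretical receptive field. The backbone architecture, tensor shapes, and distance-based similarity are all assumptions made for the example.

import torch
import torch.nn as nn

# Hypothetical toy backbone: three strided convolutions standing in for a
# real feature extractor (e.g. a ResNet). All shapes are illustrative.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

image = torch.randn(1, 3, 224, 224)       # dummy input image
features = backbone(image)                # (1, 128, 28, 28) penultimate feature map

# One prototypical part: a learned vector compared against every spatial
# location of the feature map via L2 distance (ProtoPNet-style; a smaller
# distance means a higher similarity).
prototype = torch.randn(1, 1, 128)
distances = torch.cdist(
    features.flatten(2).transpose(1, 2),  # (1, 28*28, 128)
    prototype,                            # (1, 1, 128)
).reshape(1, 1, *features.shape[-2:])     # (1, 1, 28, 28) distance map

# Most activated cell of the map (minimum distance).
idx = distances.argmin()
h, w = divmod(idx.item(), features.shape[-1])
print(f"most activated cell: ({h}, {w}) in a "
      f"{features.shape[-2]}x{features.shape[-1]} map")

# Naive upsampling of the 28x28 map to 224x224 suggests this cell explains
# an 8x8-pixel patch, but the theoretical receptive field of one cell after
# three kernel-3, stride-2 convolutions is 15x15 pixels. The activation
# therefore also depends on image content outside the upsampled activation
# region: this is the spatial misalignment the benchmark quantifies.

The gap grows with deeper backbones, where the receptive field of a single penultimate cell can cover most of the input image, which is why the paper's metrics and compensation method target this discrepancy directly.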

Cite

APA: Sacha, M., Jura, B., Rymarczyk, D., Struski, Ł., Tabor, J., & Zielinski, B. (2024). Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, pp. 21563–21573). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v38i19.30154
