Contrasting Explanations for Understanding and Regularizing Model Adaptations

André Artelt; Fabian Hinder; Valerie Vaquet; Robert Feldhans; Barbara Hammer

Journal Article (Open Access)

Neural Processing Letters (2023) 55(5) 5273-5297

DOI: 10.1007/s11063-022-10826-5

4 Citations

23 Readers

Abstract

Many of today’s decision-making systems deployed in the real world are not static: they change and adapt over time, a phenomenon known as model adaptation. Because of their wide-reaching influence and potentially serious consequences, the need for transparency and interpretability of AI-based decision-making systems is widely accepted and has been worked on extensively. A very prominent class of explanations are contrasting explanations, which try to mimic human explanations. However, explanation methods usually assume a static system that has to be explained. Explaining non-static systems is still an open research question, which poses the challenge of how to explain model differences, adaptations, and changes. In this contribution, we propose and empirically evaluate a general framework for explaining model adaptations and differences by contrasting explanations. We also propose a method for automatically finding regions in data space that are affected by a given model adaptation, i.e. regions where the internal reasoning of the other (e.g. adapted) model changed, and that should therefore be explained. Finally, we propose a regularization for model adaptations to ensure that the internal reasoning of the adapted model does not change in an unwanted way.

Citation (APA)

Artelt, A., Hinder, F., Vaquet, V., Feldhans, R., & Hammer, B. (2023). Contrasting Explanations for Understanding and Regularizing Model Adaptations. Neural Processing Letters, 55(5), 5273–5297. https://doi.org/10.1007/s11063-022-10826-5
