Explanations for Attributing Deep Neural Network Predictions

Abstract

Given the recent success of deep neural networks and their deployment in high-impact, high-risk applications such as autonomous driving and healthcare decision-making, there is a great need for faithful and interpretable explanations of “why” an algorithm makes a certain prediction. In this chapter, we introduce (1) meta-predictors as explanations, a principled framework for learning explanations for any black-box algorithm, and (2) meaningful perturbations, an instantiation of our paradigm applied to the problem of attribution, which is concerned with identifying which features of an input (e.g., regions of an input image) are responsible for a model’s output (e.g., a CNN classifier’s object class prediction). We first introduced these contributions in [8]. We also briefly survey existing visual attribution methods and highlight how they fail to be both faithful and interpretable.
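To make the meaningful-perturbations idea concrete, below is a minimal PyTorch sketch of a deletion-style objective: learn a low-resolution mask that blends the input with a blurred copy so as to minimize the classifier’s score for the target class while deleting as little of the image as possible. The function name, optimizer, hyperparameters, and the average-pooling stand-in for Gaussian blur are illustrative assumptions, not the exact configuration from the chapter.

```python
import torch
import torch.nn.functional as F

def meaningful_perturbation(model, x, target, steps=300, lr=0.1,
                            lambda_l1=1e-2, mask_size=28, sigma=10):
    """Sketch of mask-based attribution. x: (1, 3, H, W) image tensor,
    target: class index to explain. Returns a low-res mask in [0, 1]."""
    model.eval()
    _, _, H, W = x.shape

    # Blurred reference image used as the perturbation; approximated here
    # with a single large average-pooling pass (the paper uses Gaussian blur).
    blurred = F.avg_pool2d(x, kernel_size=2 * sigma + 1,
                           stride=1, padding=sigma)

    # Optimize a low-resolution mask and upsample it each step; the low
    # resolution acts as a simple smoothness prior on the explanation.
    m_low = torch.zeros(1, 1, mask_size, mask_size, requires_grad=True)
    optimizer = torch.optim.Adam([m_low], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        m = torch.sigmoid(m_low)  # keep mask values in [0, 1]
        m_up = F.interpolate(m, size=(H, W), mode='bilinear',
                             align_corners=False)
        # m_up = 1 keeps the original pixel; m_up = 0 swaps in the blur.
        x_pert = m_up * x + (1 - m_up) * blurred
        score = F.softmax(model(x_pert), dim=1)[0, target]
        # Deletion objective: drive the target-class score down while
        # penalizing the amount of the image that is deleted (L1 term).
        loss = score + lambda_l1 * (1 - m_up).abs().mean()
        loss.backward()
        optimizer.step()

    return torch.sigmoid(m_low).detach()
```

The returned mask can be upsampled and overlaid on the image as a saliency map: regions where the mask is pushed toward 0 are those whose deletion most reduces the target score, i.e., the evidence the classifier relies on.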

Citation

Fong, R., & Vedaldi, A. (2019). Explanations for Attributing Deep Neural Network Predictions. In Lecture Notes in Computer Science (Vol. 11700, pp. 149–167). Springer. https://doi.org/10.1007/978-3-030-28954-6_8
