Objective criteria for explanations of machine learning models

Abstract

Objective criteria to evaluate the performance of machine learning (ML) model explanations are a critical ingredient in bringing greater rigor to the field of explainable artificial intelligence. In this article, we survey three of our proposed criteria, each targeting a different class of explanations. In the first, targeted at real-valued feature importance explanations, we define a class of “infidelity” measures that capture how well the explanations match the ML models. We show that instances of such infidelity-minimizing explanations correspond to many popular, recently proposed explanations and, moreover, can be shown to satisfy well-known game-theoretic axiomatic properties. In the second, targeted at feature set explanations, we define a robustness analysis-based criterion and show that deriving explainable feature sets based on this criterion yields qualitatively more impressive explanations. Lastly, for sample explanations, we provide a decomposition-based criterion that yields highly scalable and compelling classes of sample-based explanations.
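The infidelity idea mentioned above admits a straightforward Monte Carlo estimate: perturb the input, compare the explanation's predicted change in model output against the model's actual change, and average the squared gap. The sketch below is an illustrative Python approximation under assumed choices (Gaussian perturbations; the names `infidelity` and `model_fn` are ours), not the article's exact formulation.

```python
import numpy as np

def infidelity(model_fn, explanation, x, n_samples=1000, noise_scale=0.5, rng=None):
    """Estimate an infidelity-style score for a feature-importance explanation.

    Approximates E_I[(I . phi - (f(x) - f(x - I)))^2], i.e., how far the
    explanation's linear estimate of the output change is from the model's
    actual change under random perturbations I. Lower is better.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    phi = np.asarray(explanation, dtype=float)
    errors = []
    for _ in range(n_samples):
        # Random perturbation I (Gaussian here; other perturbation
        # distributions give other instances of the infidelity family).
        I = rng.normal(scale=noise_scale, size=x.shape)
        predicted_change = I @ phi                      # explanation's estimate
        actual_change = model_fn(x) - model_fn(x - I)   # true change in model output
        errors.append((predicted_change - actual_change) ** 2)
    return float(np.mean(errors))


if __name__ == "__main__":
    # Toy check: for a linear model, its weight vector is a perfectly
    # faithful explanation, so its infidelity should be near zero.
    w = np.array([2.0, -1.0, 0.5])
    model_fn = lambda x: float(w @ x)
    x = np.array([1.0, 0.0, 3.0])

    print(infidelity(model_fn, explanation=w, x=x))                 # ~0 (faithful)
    print(infidelity(model_fn, explanation=np.zeros_like(w), x=x))  # larger (unfaithful)
```

In this toy setting the gap between the two printed scores illustrates the criterion: explanations that track the model's behavior under perturbation score low, while uninformative ones score high.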

Citation (APA)

Yeh, C. K., & Ravikumar, P. (2021, December 1). Objective criteria for explanations of machine learning models. Applied AI Letters. John Wiley and Sons Inc. https://doi.org/10.1002/ail2.57
