Abstract
Complex, non-parametric models, which are typically used in machine learning, have proven to be successful in many prediction tasks. But these models usually operate as black boxes: while they are good at predicting, they are often not interpretable. Many inherently interpretable models have been suggested, but they come at the cost of reduced predictive power. Another option is to apply interpretability methods to a black box model after model training. Given the velocity of research on new machine learning models, it is preferable to have model-agnostic tools which can be applied to a random forest as well as to a neural network. Model-agnostic interpretability tools should improve the adoption of machine learning.

iml is an R package (R Core Team 2016) that offers a general toolbox for making machine learning models interpretable. It implements many model-agnostic methods which work for any type of machine learning model. The package covers the following methods:

• Partial dependence plots (Friedman 2001): Visualizing the learned relationship between features and predictions.
• Individual conditional expectation (Goldstein et al. 2015): Visualizing the learned relationship between features and predictions for individual instances of the data.
• Feature importance (Fisher, Rudin, and Dominici 2018): Scoring features by their contribution to predictive performance.
• Global surrogate tree: Approximating the black box model with an interpretable decision tree.
• Local surrogate models (Ribeiro, Singh, and Guestrin 2016): Explaining single predictions by approximating the black box model locally with an interpretable model.
• Shapley value (Strumbelj et al. 2014): Explaining single predictions by fairly distributing the predicted value among the features.
• Interaction effects (Friedman, Popescu, and others 2008): Measuring how strongly features interact with each other in the black box model.

iml was designed to provide a class-based and user-friendly way to make black box machine learning models interpretable. Internally, the implemented methods inherit from the same parent class and share a common framework for the computation. Many of the methods are already implemented in other packages (e.g., Greenwell 2017; Goldstein et al. 2015; Pedersen and Benesty 2017), but the iml package implements all of the methods in one place, uses the same syntax, and offers consistent functionality and outputs. iml can be used with models from the R machine learning libraries mlr and caret, but the package is flexible enough to work with models from other packages as well.

Similar projects are the R package DALEX (Biecek 2018) and the Python package Skater (Choudhary, Kramer, and team 2018). Unlike iml, these two projects do not implement the methods themselves but depend on other packages. DALEX focuses more on model comparison, and Skater additionally includes interpretable models but offers fewer model-agnostic interpretability methods than iml.

The unified interface provided by the iml package simplifies the analysis and interpretation of black box machine learning models.
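The class-based interface can be illustrated with a minimal sketch. The example below assumes a random forest from the randomForest package fit on the Boston housing data from MASS; these choices, and the specific classes and arguments shown (Predictor, FeatureImp, FeatureEffect, Shapley), are illustrative assumptions drawn from the package documentation rather than prescribed by the text above.

library("iml")
library("randomForest")
data("Boston", package = "MASS")

# Fit any black box model; iml is model-agnostic.
rf <- randomForest(medv ~ ., data = Boston, ntree = 50)

# Wrap the model and data in a Predictor object, the common
# interface that the interpretability classes in iml operate on.
X <- Boston[, setdiff(names(Boston), "medv")]
predictor <- Predictor$new(rf, data = X, y = Boston$medv)

# Permutation feature importance.
imp <- FeatureImp$new(predictor, loss = "mae")
plot(imp)

# Partial dependence and individual conditional expectation curves.
eff <- FeatureEffect$new(predictor, feature = "lstat", method = "pdp+ice")
plot(eff)

# Shapley values for a single prediction.
shap <- Shapley$new(predictor, x.interest = X[1, ])
plot(shap)

Each result object follows the same pattern: construct it from the Predictor, then plot it. This shared pattern is what makes it straightforward to swap in, for example, a caret or mlr model in place of the random forest.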