Understanding complex predictive models with ghost variables

This article is free to access.

Abstract

Within the literature on Interpretable Machine Learning, we propose a new procedure that assigns a measure of relevance to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check its out-of-sample performance. We propose to measure the individual relevance of each variable by comparing the predictions of the model in the test set with those obtained when the variable of interest is replaced (in the test set) by its ghost variable, defined as the prediction of this variable from the remaining explanatory variables. In linear models it is shown that, on the one hand, the proposed measure gives results similar to leave-one-covariate-out (loco) at a lower computational cost and outperforms random permutations, and, on the other hand, it is strongly related to the usual F-statistic measuring the significance of a variable. In nonlinear predictive models (such as neural networks or random forests) the proposed measure captures the relevance of the variables efficiently, as shown by a simulation study comparing ghost variables with alternative methods (loco and random permutations, as well as knockoff variables and estimated conditional distributions). Finally, we study the joint relevance of the variables by defining the relevance matrix as the covariance matrix of the vectors of effects on predictions when using every ghost variable. Our proposal is illustrated with simulated examples and the analysis of a large real data set.
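The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `fit_linear` and `ghost_effects` are hypothetical, the ghost variable is estimated by a linear regression fitted on the test set, relevance is summarized as the mean squared change in predictions, and the relevance matrix is computed as the centered sample covariance of the effect vectors. The normalization and the ghost-variable estimator used in the article itself may differ.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a predict function."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ beta

def ghost_effects(predict, X_test):
    """Change in test-set predictions when each column is replaced by its ghost.

    Assumption: the ghost of column j is its linear prediction from the other
    columns, fitted on the test set itself.
    """
    n, p = X_test.shape
    base = predict(X_test)
    effects = np.empty((n, p))
    for j in range(p):
        others = np.delete(X_test, j, axis=1)
        ghost_j = fit_linear(others, X_test[:, j])(others)
        X_ghost = X_test.copy()
        X_ghost[:, j] = ghost_j
        effects[:, j] = base - predict(X_ghost)
    return effects

# Simulated data (illustrative only): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
n = 400
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 1.0 * x3 + 0.5 * rng.normal(size=n)

# Fit on the training half, evaluate relevance on the test half.
X_train, y_train = X[:200], y[:200]
X_test = X[200:]
predict = fit_linear(X_train, y_train)

effects = ghost_effects(predict, X_test)
relevance = np.mean(effects ** 2, axis=0)   # one value per variable
V = np.cov(effects, rowvar=False)           # joint relevance (assumed centered)

print("relevance per variable:", relevance)
print("relevance matrix:\n", V)
```

In this toy setting `x1` has a large coefficient but is almost fully predictable from `x2`, so replacing it by its ghost barely changes the predictions. Its relevance therefore comes out well below that of the independent variable `x3`. Any fitted model that exposes a prediction function could be passed as `predict` in place of the linear fit.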

Cite

Delicado, P., & Peña, D. (2023). Understanding complex predictive models with ghost variables. Test, 32(1), 107–145. https://doi.org/10.1007/s11749-022-00826-x
