We propose a counterfactual approach to train 'causality-aware' predictive models that are able to leverage causal information in static anticausal machine learning tasks (i.e., prediction tasks where the outcome influences the inputs). In applications plagued by confounding, the approach can be used to generate predictions that are free from the influence of observed confounders. In applications involving observed mediators, the approach can be used to generate predictions that capture only the direct or the indirect causal influences. Mechanistically, we train supervised learners on (counterfactually) simulated inputs that retain only the associations generated by the causal relations of interest. We focus on linear models, where analytical results connecting covariances, causal effects, and prediction mean square errors are readily available. Importantly, we show that our approach does not require knowledge of the full causal graph; it suffices to know which variables represent potential confounders and/or mediators. We investigate the stability of the method with respect to dataset shifts generated by selection biases, and we relax the linearity assumption by extending the approach to additive models better able to account for nonlinearities in the data. We validate our approach in a series of synthetic data experiments and illustrate its application to a real dataset.
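To make the idea concrete, the following is a minimal illustrative sketch, not the paper's actual procedure: it assumes a linear anticausal setting with a single observed confounder C that influences both the outcome Y and the input X (with Y also causing X). The confounder's linear contribution to X is removed by residualizing X on C, so the resulting input retains only the Y → X association, and a simple linear predictor is then fit on this "deconfounded" input.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical linear anticausal data-generating process (all coefficients
# are illustrative assumptions, not values from the paper):
C = rng.normal(size=n)                       # observed confounder
Y = 0.8 * C + rng.normal(size=n)             # outcome (the cause, in an anticausal task)
X = 1.5 * Y + 1.0 * C + rng.normal(size=n)   # input influenced by both Y and C

# Remove the confounder's linear contribution to X by OLS residualization,
# leaving only the association generated by the Y -> X causal relation.
gamma, alpha = np.polyfit(C, X, deg=1)
X_cf = X - (gamma * C + alpha)

# Fit simple (univariate) linear predictors of Y from the raw and the
# confounder-free inputs; their slopes differ because the raw input also
# carries the spurious Y <- C -> X association.
beta_naive = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
beta_cf = np.cov(X_cf, Y, ddof=1)[0, 1] / np.var(X_cf, ddof=1)

print(f"naive slope: {beta_naive:.3f}, confounder-free slope: {beta_cf:.3f}")
```

By construction, the residualized input is (in-sample) uncorrelated with the observed confounder, so predictions built from it cannot be driven by C; the same residualization idea extends to removing a mediator's contribution when only direct effects are of interest.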
CITATION STYLE
Chaibub Neto, E. (2024). Causality-Aware Predictions in Static Anticausal Machine Learning Tasks. IEEE Transactions on Neural Networks and Learning Systems, 35(4), 5039–5053. https://doi.org/10.1109/TNNLS.2022.3202151