Cross-Validation Strategy Impacts the Performance and Interpretation of Machine Learning Models

  • Sweet L
  • Müller C
  • Anand M
  • et al.

Abstract

Machine learning algorithms are able to capture complex, nonlinear, interacting relationships and are increasingly used to predict agricultural yield variability at regional and national scales. Using explainable artificial intelligence (XAI) methods applied to such algorithms may enable better scientific understanding of drivers of yield variability. However, XAI methods may provide misleading results when applied to spatiotemporal correlated datasets. In this study, machine learning models are trained to predict simulated crop yield from climate indices, and the impact of cross-validation strategy on the interpretation and performance of the resulting models is assessed. Using data from a process-based crop model allows us to then comment on the plausibility of the “explanations” provided by XAI methods. Our results show that the choice of evaluation strategy has an impact on (i) interpretations of the model and (ii) model skill on held-out years and regions, after the evaluation strategy is used for hyperparameter tuning and feature selection. We find that use of a cross-validation strategy based on clustering in feature space achieves the most plausible interpretations as well as the best model performance on held-out years and regions. Our results provide the first steps toward identifying domain-specific “best practices” for the use of XAI tools on spatiotemporal agricultural or climatic data.
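The abstract names cross-validation based on clustering in feature space but does not spell out a procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes plain k-means as the clustering step and holds out one cluster per fold; the function names `kmeans_labels` and `cluster_folds`, the toy `features` values, and the choice of `k=3` are all hypothetical. It uses only the Python standard library; in practice one would more likely pair scikit-learn's `KMeans` with `GroupKFold` or `LeaveOneGroupOut`.

```python
import random


def kmeans_labels(points, k, n_iter=20, seed=0):
    """Assign each point to one of k clusters with a plain Lloyd's k-means.

    Assumption: k-means is used here for illustration only; the abstract
    does not say which clustering method the study used.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_folds(labels):
    """Yield (train_idx, test_idx) pairs, holding out one whole cluster per fold."""
    for held_out in sorted(set(labels)):
        test_idx = [i for i, lab in enumerate(labels) if lab == held_out]
        train_idx = [i for i, lab in enumerate(labels) if lab != held_out]
        yield train_idx, test_idx


# Hypothetical toy features: (temperature index, precipitation index) per sample.
features = [
    (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
    (5.0, 5.1), (5.2, 4.9), (5.1, 5.0),
    (9.8, 0.1), (10.0, 0.3), (10.1, 0.2),
]
labels = kmeans_labels(features, k=3)
folds = list(cluster_folds(labels))
```

Because every test fold consists of one complete cluster, samples that are close in feature space never straddle the train/test boundary, which is the property that distinguishes this kind of split from a random one on correlated data.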

Citation

Sweet, L., Müller, C., Anand, M., & Zscheischler, J. (2023). Cross-Validation Strategy Impacts the Performance and Interpretation of Machine Learning Models. Artificial Intelligence for the Earth Systems, 2(4). https://doi.org/10.1175/aies-d-23-0026.1
