Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest



Abstract

Crop yield estimates over large areas are conventionally made using weather observations, but a comprehensive understanding of the effects of various environmental indicators, observation frequency, and the choice of prediction algorithm remains elusive. Here we present a thorough assessment of county-level maize yield prediction in the U.S. Midwest using six statistical/machine learning algorithms (Lasso, Support Vector Regressor, Random Forest, XGBoost, Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN)) and an extensive set of environmental variables derived from satellite observations, weather data, land surface model results, soil maps, and crop progress reports. Results show that seasonal crop yield forecasting benefits from both more advanced algorithms and a large composite of information associated with crop canopy, environmental stress, phenology, and soil properties (i.e., hundreds of features). The XGBoost algorithm outperforms the other algorithms in both accuracy and stability, while deep neural networks such as LSTM and CNN are not advantageous. The compositing interval (8-day, 16-day, or monthly) of the time-series variables does not have a significant effect on the prediction. Combining the best algorithm and inputs improves the prediction accuracy by 5% when compared to a baseline statistical model (Lasso) using only basic climatic and satellite observations. Reasonable county-level yield forecasting is achievable from early June, almost four months prior to harvest. At the national level, early-season (June and July) prediction from the best model outperforms that of the United States Department of Agriculture (USDA) World Agricultural Supply and Demand Estimates (WASDE). This study provides insights into practical crop yield forecasting and the understanding of yield response to climatic and environmental conditions.
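
The abstract does not say which accuracy metric underlies the reported 5% improvement over the Lasso baseline. As a minimal sketch, assuming root-mean-square error (RMSE) as the metric, a relative improvement between a baseline and a candidate model could be computed as below. The function names and all yield values are hypothetical and chosen only for illustration; none of them come from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_error, candidate_error):
    """Fractional error reduction of a candidate model relative to a baseline."""
    return (baseline_error - candidate_error) / baseline_error


# Hypothetical county yields (t/ha): illustrative numbers, not data from the study.
observed = [10.0, 11.0, 9.0, 12.0]
baseline_pred = [11.0, 10.0, 10.0, 11.0]   # stand-in for a baseline model
candidate_pred = [10.5, 10.5, 9.5, 11.5]   # stand-in for a stronger model

baseline_rmse = rmse(observed, baseline_pred)
candidate_rmse = rmse(observed, candidate_pred)
gain = relative_improvement(baseline_rmse, candidate_rmse)

print(f"baseline RMSE:  {baseline_rmse:.2f}")
print(f"candidate RMSE: {candidate_rmse:.2f}")
print(f"relative improvement: {gain:.0%}")
```

If accuracy were instead measured with a score where higher is better (for example R²), the sign convention in `relative_improvement` would need to be reversed.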

Cite

Kang, Y., Ozdogan, M., Zhu, X., Ye, Z., Hain, C., & Anderson, M. (2020). Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest. Environmental Research Letters, 15(6). https://doi.org/10.1088/1748-9326/ab7df9
