EGU26-19652, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-19652
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
Oral | Thursday, 07 May, 12:10–12:20 (CEST) | Room M2
Using process-based model simulations to develop and validate a data-driven approach for identifying climate drivers of maize yield failure
Lily-belle Sweet1,2, Christoph Müller3, Jonas Jägermeyr3,4,5, Weston Anderson6, and Jakob Zscheischler1,2,3
  • 1Department of Compound Environmental Risks, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany (lily-belle.sweet@ufz.de)
  • 2Department of Hydro Sciences, TUD Dresden University of Technology, Dresden, Germany
  • 3Potsdam Institute for Climate Impact Research (PIK), Member of the Leibniz Association, Potsdam, Germany
  • 4Columbia University, Climate School, New York, NY, USA
  • 5NASA Goddard Institute for Space Studies (GISS), New York, NY, USA
  • 6University of Maryland, College Park, MD, USA

Climate impacts such as crop yield failure arise from complex combinations of weather conditions acting across multiple time scales, making it challenging to identify the most relevant climate drivers from high-resolution weather data. Because data are limited and growing-season climate conditions interact with plant growth in complex ways, machine learning models that predict crop yield with high skill are often ‘right for the wrong reasons’. Process-based crop model simulations, which embody known functional relationships, could provide a useful testbed for developing and evaluating more trustworthy and robust methods. We present a novel two-stage, data-driven framework designed to extract a parsimonious set of climate drivers from multivariate daily meteorological inputs. The framework uses machine learning to systematically generate, evaluate and discard candidate features, and then produces a set of drivers that is robust across locations, years and predictive feature combinations. We first validate the method on simulated US maize yield failure data from two global gridded crop models, using rigorous out-of-sample testing: training only on early 20th-century data and holding out over 70 subsequent years for evaluation. The drivers identified using our approach align with known crop model mechanisms and rely solely on model input variables. Parsimonious logistic regression models built from these drivers achieve strong predictive skill under non-stationary climate conditions.
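The out-of-sample design described above (fit on early years, evaluate on later ones) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the study: the year range, the 1930 cutoff, the single toy driver, the synthetic failure labels, and the helper names `temporal_split`, `fit_logistic` and `accuracy`. A plain gradient-descent logistic regression stands in for whatever fitting routine the authors used.

```python
import math

def temporal_split(rows, last_train_year):
    """Split rows by year: early years for training, later years held out."""
    train = [r for r in rows if r["year"] <= last_train_year]
    held_out = [r for r in rows if r["year"] > last_train_year]
    return train, held_out

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit a one-feature logistic regression with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def accuracy(w, b, xs, ys):
    """Share of cases where the predicted class (p > 0.5) matches the label."""
    hits = sum(1 for x, y in zip(xs, ys) if (w * x + b > 0) == (y == 1))
    return hits / len(xs)

# Hypothetical, deterministic toy data: one standardized driver per year and a
# failure label that depends on it. Real inputs would be simulated or observed.
rows = []
for year in range(1901, 2006):
    driver = ((year * 7) % 11 - 5) / 5.0  # values between -1 and 1
    rows.append({"year": year, "driver": driver, "failure": int(driver > 0.4)})

train, held_out = temporal_split(rows, last_train_year=1930)  # cutoff is assumed
w, b = fit_logistic([r["driver"] for r in train], [r["failure"] for r in train])
skill = accuracy(w, b, [r["driver"] for r in held_out],
                 [r["failure"] for r in held_out])
print(len(train), len(held_out), round(skill, 2))
```

Splitting by year rather than at random keeps later climate conditions entirely out of training, which is what makes the evaluation a test under non-stationarity.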

After validating the methodology on simulated data, we apply the same approach to observed county-level yields and daily multivariate weather data in rainfed and irrigated US maize systems. We identify compact sets of five climate drivers that effectively reproduce interannual variability and major historical failure events, including the 1993 Midwest floods and the 2012 drought. In rainfed systems, yield failure risk is strongly associated with extended periods of high soil moisture after establishment, seasonal precipitation, and vapor pressure deficit (VPD): more than 40 high-VPD days between flowering and maturity markedly increase the odds of yield failure. In irrigated systems, critical drivers include soil moisture conditions around planting, hot or dry days after establishment, and dewpoint temperatures near harvest. Our results demonstrate that the method transfers from simulations to observations, and suggest that it is applicable to other crops, locations and further climate-related impacts. By avoiding reliance on post-hoc interpretation of black-box models, this framework enables the use of inherently interpretable statistical models while still leveraging the predictive power of high-dimensional meteorological datasets.
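A driver such as "number of high-VPD days between flowering and maturity" amounts to a threshold count over a phenological window of the daily series. The sketch below shows one way such a feature could be computed. The function name `count_days_above`, the 2.0 kPa cutoff, the phenology day indices and the toy VPD series are all hypothetical; only the 40-day figure comes from the text above.

```python
def count_days_above(daily_values, start_day, end_day, threshold):
    """Count days in the half-open window [start_day, end_day) above threshold."""
    return sum(1 for v in daily_values[start_day:end_day] if v > threshold)

# Hypothetical season of daily VPD values in kPa, indexed by day after planting.
daily_vpd = [1.0] * 150
for day in range(70, 115):
    daily_vpd[day] = 2.5

FLOWERING_DAY = 60     # assumed phenology dates, not from the abstract
MATURITY_DAY = 120
HIGH_VPD_KPA = 2.0     # assumed cutoff for a "high-VPD" day
DAY_COUNT_LIMIT = 40   # the abstract's "more than 40 high-VPD days"

high_vpd_days = count_days_above(daily_vpd, FLOWERING_DAY, MATURITY_DAY, HIGH_VPD_KPA)
elevated_risk = high_vpd_days > DAY_COUNT_LIMIT
print(high_vpd_days, elevated_risk)  # prints "45 True"
```

A count of this kind stays readable as a logistic regression input: its coefficient translates directly into a change in the odds of yield failure per additional high-VPD day.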

How to cite: Sweet, L., Müller, C., Jägermeyr, J., Anderson, W., and Zscheischler, J.: Using process-based model simulations to develop and validate a data-driven approach for identifying climate drivers of maize yield failure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19652, https://doi.org/10.5194/egusphere-egu26-19652, 2026.