Maize and wheat yield forecasting in the Pannonian Basin using extreme gradient boosting and its performance in years of severe drought
- 1TU Wien, Mathematics and Geoinformation, Geodesy and Geoinformation, Wien, Austria (emanuel.bueechi@geo.tuwien.ac.at)
- 2Global Change Research Institute CAS, CzechGlobe, Bělidla 986/4a, 603 00, Brno, Czech Republic
- 3Institute of Geodesy and Photogrammetry, ETH Zurich, Robert-Gnehm-Weg 15, 8093, Zurich, Switzerland
- 4SPACE-SI, Slovenian Centre of Excellence for Space Sciences and Technologies, Aškerčeva 12, 1000, Ljubljana, Slovenia
The increasing frequency and intensity of severe droughts over recent decades have significantly impacted crop production in the Pannonian Basin in southeastern Europe. The resulting crop yield losses can be substantial and require logistical compensation at the international level. Seasonal crop yield forecasts have proven to be a valuable tool for planning such compensation, because they support decision-makers in taking timely action. However, such forecasts often underestimate the impact of severe droughts on crop yields. To address this issue, we developed a maize and wheat yield forecasting system based on extreme gradient boosting, a machine-learning method, for 42 regions in the Pannonian Basin. The predictors describe vegetation state, weather, and soil moisture conditions and are derived from Earth observation, reanalysis, in-situ data, and seasonal weather forecasts. This wide range of predictors was selected to represent the state of the crops, the conditions they are facing, and the conditions they are expected to face. We expected that providing the model with sufficient information about a drought and its impacts would be crucial, especially in severe drought years. The model was then validated, with a focus on drought years.
Our results show that, in all years, crop yield anomaly estimates made in the two months preceding harvest outperform those made earlier in the year, with relative root mean square errors below 17%. The models are clearly strongest at forecasting interannual variability but struggle to forecast differences between regions within individual years. This is related to spatial autocorrelation and to crop yields varying less in space than in time. In years of severe drought, forecasts with a two-month lead time likewise show a clear improvement over forecasts with longer lead times. Crop yield losses remain underestimated, but the wheat model performs better in drought years than in average years, with errors below 12%. The errors of the maize forecasts are larger in drought years than in non-drought years: 30% two months ahead and 20% one month ahead. The feature importance analysis shows that, in general, wheat yield anomalies are controlled by temperature and maize yield anomalies by water availability during the last two months before harvest. In severe drought years, soil moisture is the most important predictor for the maize model, and the seasonal temperature forecast becomes key for wheat forecasts two months before harvest. Going forward, a finer spatial resolution of the predictors will be tested to better distinguish the yields of the different regions. In addition, longer time series of crop yield data, including more data from severe drought years, will help to test the findings of this study.
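The error figures above are relative root mean square errors. As a minimal sketch of how such a figure could be computed and split into drought and non-drought years, the Python example below normalises the RMSE by the mean observed yield. That normalisation is an assumption, since the abstract does not state which one was used. The function names, the years, and the yield values are hypothetical and serve only as illustration; they are not data from this study.

```python
import math


def relative_rmse(observed, forecast):
    """Return the RMSE as a percentage of the mean observed value.

    Assumption: normalisation by the mean observed yield. Other
    conventions (e.g. dividing by the range) would give other numbers.
    """
    if not observed or len(observed) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((f - o) ** 2 for o, f in zip(observed, forecast)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


def rrmse_by_year_type(records, drought_years):
    """Compute the relative RMSE separately for drought and other years.

    records: iterable of (year, observed_yield, forecast_yield) tuples.
    drought_years: set of years treated as severe drought years.
    """
    groups = {"drought": ([], []), "non_drought": ([], [])}
    for year, obs, fc in records:
        key = "drought" if year in drought_years else "non_drought"
        groups[key][0].append(obs)
        groups[key][1].append(fc)
    # Skip a group that received no records.
    return {k: relative_rmse(o, f) for k, (o, f) in groups.items() if o}


# Hypothetical yields in t/ha; years 1 and 3 are labelled as drought years.
example_records = [
    (1, 4.0, 5.0),
    (2, 6.0, 6.0),
    (3, 4.0, 5.0),
    (4, 6.0, 6.0),
]
example_scores = rrmse_by_year_type(example_records, drought_years={1, 3})
```

In this toy data the forecasts overestimate yields only in the years labelled as drought years, so the drought-year error is the larger of the two, which mirrors the kind of underestimated loss described for maize.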
How to cite: Bueechi, E., Fischer, M., Crocetti, L., Trnka, M., Zappa, L., Grlj, A., and Dorigo, W.: Maize and wheat yield forecasting in the Pannonian Basin using extreme gradient boosting and its performance in years of severe drought, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-15519, https://doi.org/10.5194/egusphere-egu23-15519, 2023.