EGU25-4253, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-4253
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Thursday, 01 May, 08:40–08:50 (CEST) | Room 0.11/12
Unraveling the sources of subseasonal predictability with machine learning
Ana-Cristina Mârza1,2, Daniela I.V. Domeisen2,3, Lorenzo Ramella-Pralungo4, and Angela Meyer1,5
  • 1Bern University of Applied Sciences, Bern, Switzerland
  • 2University of Lausanne, Lausanne, Switzerland
  • 3ETH Zurich, Zurich, Switzerland
  • 4DXT Commodities, Lugano, Switzerland
  • 5TU Delft, Delft, Netherlands

Actionable weather information at the subseasonal timescale informs decision-makers in many societally relevant sectors, including energy demand and supply. However, the predictive skill of subseasonal forecasts varies widely, from forecast ‘busts’ with low predictive skill to windows of opportunity that yield exceptionally skillful forecasts. It is therefore useful to know ahead of time whether a given forecast will be skillful enough to form the basis of operational planning: along with the forecast itself, users wish to have an a priori estimate of the forecast uncertainty. We propose to provide this estimate with machine learning (ML). In our study, an ML model trained on historical weather data learns to relate the forecast initial conditions to the probabilistic forecast error at subseasonal lead times. This approach estimates forecast skill at a lower computational cost than ensemble forecasting. Moreover, explainability techniques allow us to rank the sources of subseasonal predictability in hindcast data by their importance, which is a first to our knowledge.

Building on studies that examine the link between the forecast skill of the European Centre for Medium-Range Weather Forecasts (ECMWF) subseasonal ensemble model and the atmospheric conditions at forecast initialization time (weather regime, season, phase of the Madden-Julian Oscillation), we propose a decision-tree-based approach to predicting future forecast skill from past observations. Concretely, a gradient-boosted decision tree model is trained to predict the Continuous Ranked Probability Score (CRPS) of ECMWF hindcasts at lead times of 0–46 days from initial conditions (geopotential height, sea surface temperature, zonal wind speed) extracted from the ECMWF Reanalysis v5 (ERA5) dataset. The ERA5 data undergo dimensionality reduction (e.g., principal component analysis) before being fed to the ML model and are supplemented with pre-computed indices such as the El Niño-Southern Oscillation index. Forecast skill is computed for the 500 hPa geopotential height field in the European region, with ERA5 as ground truth.
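To make the training target concrete, the following is a minimal sketch of the standard empirical CRPS estimator for an ensemble forecast of a single scalar quantity. The function name `ensemble_crps` and the sample values are hypothetical; the abstract does not state how the score is computed or aggregated over the European 500 hPa geopotential height field, so this shows only the per-point definition.

```python
def ensemble_crps(members, observation):
    """Empirical CRPS of an ensemble forecast against one scalar observation.

    Uses the common estimator
        CRPS = mean_i |x_i - y| - 0.5 * mean_{i,j} |x_i - x_j|
    where x_i are the ensemble members and y is the verifying value.
    Lower is better; for a single member it reduces to the absolute error.
    """
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must contain at least one member")
    # Average distance between the members and the observation.
    accuracy = sum(abs(x - observation) for x in members) / m
    # Average distance between all ordered member pairs (i == j contributes 0).
    spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return accuracy - 0.5 * spread


# Hypothetical three-member ensemble verified against an observation of 2.0.
score = ensemble_crps([1.0, 2.0, 3.0], 2.0)
```

An ensemble that is centred on the observation and has little spread scores close to zero, while a confident but displaced ensemble is penalised roughly by its distance from the observation.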

The ML model outperforms a climatological baseline (CRPS averaged by calendar date and lead time) at the task of predicting European forecast skill out to week 7. We find the most important predictor of skill to be the strength of the stratospheric polar vortex, in addition to lead time and calendar date. Training separate models by lead time reveals clear differences in feature importance: for example, lead time contributes the most predictability in the first 2 weeks, while the seasonal cycle is a strong predictor in weeks 3–4. Different teleconnections become important at different lead times, and their predictive potential also fluctuates throughout the year. We will provide an in-depth breakdown of the feature importances by lead time and season in our presentation.
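The climatological baseline described above can be sketched as a lookup table of mean CRPS keyed by calendar date and lead time. This is an illustration under stated assumptions, not the authors' implementation: the record layout `(day_of_year, lead_day, crps)`, the helper names, the use of mean absolute error as the comparison metric, and all numbers are hypothetical and are not results from the study.

```python
from collections import defaultdict


def fit_climatology(records):
    """Average CRPS for each (day_of_year, lead_day) pair.

    `records` is an iterable of (day_of_year, lead_day, crps) tuples,
    an assumed layout chosen for this example.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day_of_year, lead_day, crps in records:
        key = (day_of_year, lead_day)
        sums[key] += crps
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def mean_absolute_error(predicted, actual):
    """Mean absolute difference between predicted and actual CRPS values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up hindcast scores from two training years for one calendar date.
train = [(15, 7, 40.0), (15, 7, 60.0), (15, 21, 90.0), (15, 21, 110.0)]
climatology = fit_climatology(train)

# Made-up held-out scores; the baseline predicts the climatological mean.
held_out = [(15, 7, 45.0), (15, 21, 120.0)]
baseline_pred = [climatology[(day, lead)] for day, lead, _ in held_out]
baseline_mae = mean_absolute_error(baseline_pred, [c for _, _, c in held_out])
```

Any model that predicts CRPS from initial conditions can be scored on the same held-out cases with the same error metric, and it adds value only if its error is below `baseline_mae`.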

In conclusion, machine learning provides a novel way to estimate a priori the forecast skill of numerical weather prediction models. The presented method makes it possible, for the first time to our knowledge, to rank the relative contributions of the sources of forecast skill, as deduced from hindcast data, thereby advancing our understanding of subseasonal predictability.

How to cite: Mârza, A.-C., Domeisen, D. I. V., Ramella-Pralungo, L., and Meyer, A.: Unraveling the sources of subseasonal predictability with machine learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4253, https://doi.org/10.5194/egusphere-egu25-4253, 2025.