- 1Google DeepMind
- 2Inria France
- 3Otto-von-Guericke-University Magdeburg
- 4University of Potsdam
The last five years have seen an AI revolution in weather forecasting, with data-driven models trained on ERA5 (such as Pangu-Weather and GraphCast) surpassing the skill of numerical models at a fraction of the compute cost. Furthermore, stochastic modeling approaches are now state-of-the-art because they can represent the uncertainty in the dynamics of the Earth system (GenCast, FGN). Similarly, there have been recent advances in long-term climate emulation using data-driven methods, although these either use deterministic models (ACE2, Lucie) or are trained on simulated climate data from physical models (ArchesClimate). Here, we evaluate a stochastic modeling approach, ArchesWeatherGen, on historical climate timescales (the last 40 years) and assess its response to ocean forcings in an AMIP run setup (an atmospheric model forced with sea surface temperature and sea ice). These simulations contribute to AIMIP (AI Model Intercomparison Project), an initiative to organize and compare current state-of-the-art AI climate models.
ArchesWeather and ArchesWeatherGen are efficient data-driven models built for medium-range weather forecasting. ArchesWeather is a deterministic transformer-based model and ArchesWeatherGen is a probabilistic generative model based on flow matching, with the same transformer backbone, that corrects the deterministic model prediction and accounts for variability in the time evolution.
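The relationship between the two models described above can be illustrated with a minimal sketch. It assumes a simple Euler integration of a flow-matching velocity field from noise to a residual, which is then added to the deterministic prediction. The names `sample_residual`, `probabilistic_forecast`, `det_fn`, and `velocity_fn` are hypothetical, the conditioning of the velocity field on the atmospheric state is omitted, and the toy functions stand in for the trained networks; this is not the authors' implementation.

```python
import numpy as np

def sample_residual(velocity_fn, shape, n_steps=10, rng=None):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with Euler steps."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(shape)  # start from Gaussian noise
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)
    return x

def probabilistic_forecast(det_fn, velocity_fn, state, rng=None):
    """Deterministic prediction plus a generated residual (assumed structure)."""
    mean = det_fn(state)
    return mean + sample_residual(velocity_fn, mean.shape, rng=rng)

# Toy stand-ins for the trained networks (illustrative only).
def toy_det(state):
    return 0.9 * state

def toy_velocity(x, t, target=0.5):
    # Velocity of the straight-line path towards a constant target residual,
    # so every noise sample is transported to the same value.
    return (target - x) / (1.0 - t)

state = np.ones(4)
forecast = probabilistic_forecast(toy_det, toy_velocity, state,
                                  rng=np.random.default_rng(0))
```

With a trained velocity field, different noise draws would yield different residuals, giving an ensemble of forecasts around the deterministic prediction.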
In adherence to the AIMIP Stage 1 protocol, we adapt the models to serve as atmospheric climate models for AMIP climate simulations over the historical period 1979-2024. ArchesWeather and ArchesWeatherGen are extended to take into account monthly mean forcings for sea surface temperature (SST) and sea ice cover computed from ERA5. These models are trained on daily averaged 1-degree ERA5 data and predict the state of the atmosphere at a forecast lead time of 24 hours given initial conditions.
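A multi-decadal simulation built from a 24-hour model is an autoregressive rollout in which each predicted state becomes the next input while the forcings are prescribed. The sketch below illustrates that loop under stated assumptions: `amip_rollout`, `step_fn`, and the `monthly_forcings` dictionary are hypothetical names, the forcing is held constant within each calendar month (the abstract does not say whether it is interpolated), and `toy_step` is a stand-in for the learned model.

```python
from datetime import date, timedelta
import numpy as np

def amip_rollout(step_fn, state0, monthly_forcings, start, n_days):
    """Autoregressively apply a 24-hour step with prescribed forcings.

    step_fn          : hypothetical callable (state, forcing) -> next state
    monthly_forcings : dict mapping (year, month) -> forcing array
    """
    state, day, trajectory = state0, start, [state0]
    for _ in range(n_days):
        # Assumption: the monthly mean is used unchanged for every day of the month.
        forcing = monthly_forcings[(day.year, day.month)]
        state = step_fn(state, forcing)  # prediction is fed back as the next input
        day += timedelta(days=1)
        trajectory.append(state)
    return np.stack(trajectory)

# Toy stand-in for the learned model: relax the state towards the forcing.
def toy_step(state, forcing):
    return state + 0.1 * (forcing - state)

forcings = {(1979, 1): np.full(4, 1.0), (1979, 2): np.full(4, 2.0)}
traj = amip_rollout(toy_step, np.zeros(4), forcings, date(1979, 1, 1), 40)
```

In this toy run the state approaches the January forcing for 31 steps and then drifts towards the February value, which shows how the prescribed forcing steers the trajectory.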
We examine the ability of both models to stably emulate the current climate by quantitatively and qualitatively comparing them to the ERA5 climatology. Our results show that the models emulate the current climate faithfully and correctly reproduce many teleconnections as well as annular modes of variability. We ablate different model configurations and investigate how the residual predictions of ArchesWeatherGen affect the quality of the climate simulations compared to the deterministic predictions of ArchesWeather. We also analyse the models' capability to reproduce extreme weather statistics. Lastly, we examine the models' response to forcings by evaluating the stability, trend, and physical correlations when running the models in different forcing scenarios, such as no forcings, annually repeating forcings, and increased SST.
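Two of the forcing scenarios mentioned above can be sketched as simple transformations of a monthly SST array. This is an illustration only: the function names, the array layout `(year, month, lat, lon)`, the use of a multi-year mean annual cycle, and the spatially uniform offset `delta_k` are all assumptions, since the abstract does not specify how the scenarios are constructed.

```python
import numpy as np

def annually_repeating(sst):
    """Replace every year by the multi-year mean annual cycle.

    sst: array of shape (n_years, 12, n_lat, n_lon), monthly means (assumed layout).
    """
    climatology = sst.mean(axis=0, keepdims=True)  # shape (1, 12, n_lat, n_lon)
    return np.broadcast_to(climatology, sst.shape).copy()

def increased_sst(sst, delta_k=2.0):
    """Add a spatially uniform warming offset in kelvin (assumed form)."""
    return sst + delta_k

# Synthetic monthly SST for three years on a tiny 2x2 grid.
rng = np.random.default_rng(0)
sst = 290.0 + rng.standard_normal((3, 12, 2, 2))
repeating = annually_repeating(sst)
warmer = increased_sst(sst, delta_k=2.0)
```

Removing interannual variability from the forcing in this way helps separate the model's internal variability from its forced response.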
How to cite: Singh, R., Brunstein, R., Jost, A. A., Hasson, Y., Couairon, G., Lessig, C., and Monteleoni, C.: Evaluating ArchesWeather and ArchesWeatherGen under Multi-Decadal AMIP-style climate simulations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15037, https://doi.org/10.5194/egusphere-egu26-15037, 2026.