Spatial Generalization Tests for Machine Learning-based Weather Models as a Requirement for Climate Predictions

Maren Höver; Milan Klöwer; Christian Schroeder de Witt; Hannah M. Christensen

doi:https://doi.org/10.5194/egusphere-egu26-4512

Session ITS1.8/CL0.2

EGU26-4512, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Maren Höver¹, Milan Klöwer¹, Christian Schroeder de Witt², and Hannah M. Christensen¹

¹Atmospheric, Oceanic and Planetary Physics, University of Oxford, United Kingdom
²Engineering Science, University of Oxford, United Kingdom

Machine learning-based weather prediction is revolutionizing weather forecasting by learning from present-day climate. However, generalization to other climates remains a major challenge. Melting sea ice, land-use change and rising ocean temperatures are changing the boundary conditions, so a model will likely generalize in time only if it also generalizes in space. The physics of the atmosphere is invariant in space, and a model must show the same invariance to represent the real world accurately.

Here, we present three test cases to evaluate whether machine learning-based weather and climate models generalize spatially, and we apply them to multiple AI weather models. The tests reverse the entirety of the input data and boundary conditions in latitude (Test 1), reverse them in longitude (Test 2), and rotate them by 180° in longitude (Test 3), while keeping all aspects of the simulation physically consistent. For a deterministic model that generalizes in space, each test case yields the same predictions as the baseline case, up to rounding error. With these test cases, we investigate whether data-driven models hardcode representations of spatial relationships in the training data into their latent space. We show that neither fully data-driven models nor hybrid general circulation models currently pass these tests; instead, they perform poorly and produce unphysical results. This implies that they have likely learned local spatial relationships that depend statistically on geographical location rather than the underlying principles of atmospheric physics. This calls into question the ability of such models to simulate a changing regional climate. We therefore propose that machine learning-based climate models be evaluated with our spatial tests during model development to reduce overfitting on present-day regional climate.
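
As an illustration only, the three transformations and the pass criterion can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The array layout `(..., n_lat, n_lon)`, all function names, and both toy models are hypothetical. The sketch treats every input as a scalar field; keeping a real simulation physically consistent would probably also require adjusting vector quantities such as wind components and any time-dependent forcing, which the abstract does not detail.

```python
import numpy as np

# Assumed (hypothetical) layout: every field has shape (..., n_lat, n_lon).
LAT_AXIS, LON_AXIS = -2, -1


def flip_lat(x):
    """Test 1: reverse a field in latitude."""
    return np.flip(x, axis=LAT_AXIS)


def flip_lon(x):
    """Test 2: reverse a field in longitude."""
    return np.flip(x, axis=LON_AXIS)


def rotate_lon_180(x):
    """Test 3: rotate a field by 180 degrees in longitude."""
    n_lon = x.shape[LON_AXIS]
    if n_lon % 2 != 0:
        raise ValueError("a 180-degree shift needs an even number of longitudes")
    return np.roll(x, n_lon // 2, axis=LON_AXIS)


# Each transform is its own inverse, so applying it to the output undoes it.
TESTS = {"test1": flip_lat, "test2": flip_lon, "test3": rotate_lon_180}


def spatial_test_error(model, state, boundary, transform):
    """Largest absolute difference between the baseline prediction and the
    prediction from transformed inputs after mapping it back."""
    baseline = model(state, boundary)
    mapped_back = transform(model(transform(state), transform(boundary)))
    return float(np.max(np.abs(mapped_back - baseline)))


def local_model(state, boundary):
    """Toy stand-in: zonal diffusion plus forcing, with no fixed geography."""
    left = np.roll(state, 1, axis=LON_AXIS)
    right = np.roll(state, -1, axis=LON_AXIS)
    return state + 0.1 * (left + right - 2.0 * state) + boundary


# A fixed pattern tied to grid position, standing in for memorized geography.
N_LAT, N_LON = 6, 8
GEO_BIAS = (np.linspace(0.0, 1.0, N_LAT)[:, None]
            + np.linspace(0.0, 1.0, N_LON)[None, :])


def geo_model(state, boundary):
    """Toy stand-in for a model that has hardcoded location-dependent output."""
    return local_model(state, boundary) + GEO_BIAS


rng = np.random.default_rng(0)
state = rng.normal(size=(N_LAT, N_LON))
boundary = rng.normal(size=(N_LAT, N_LON))

# For each test: (error of the local model, error of the geography-bound model).
errors = {
    name: (spatial_test_error(local_model, state, boundary, t),
           spatial_test_error(geo_model, state, boundary, t))
    for name, t in TESTS.items()
}
for name, (err_local, err_geo) in errors.items():
    print(f"{name}: local={err_local:.2e} geo={err_geo:.2e}")
```

In this toy setting, the model built only from local operations reproduces its baseline prediction in all three tests, while the model with a position-dependent bias does not. For a real weather model, `model` would wrap a forecast step, and the tolerance would have to allow for the rounding error mentioned above.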

How to cite: Höver, M., Klöwer, M., Schroeder de Witt, C., and Christensen, H. M.: Spatial Generalization Tests for Machine Learning-based Weather Models as a Requirement for Climate Predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4512, https://doi.org/10.5194/egusphere-egu26-4512, 2026.