A Turing test for physicality in AI weather models

Sebastian Engelke; Nicola Gnecco; Marco Froelich; Manuel Hentschel; Zhongwei Zhang

doi:https://doi.org/10.5194/egusphere-egu26-18411

EGU26-18411, updated on 14 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Sebastian Engelke¹, Nicola Gnecco², Marco Froelich¹, Manuel Hentschel¹, and Zhongwei Zhang³

¹Research Institute for Statistics and Information Science, University of Geneva, Geneva, Switzerland (sebastian.engelke@unige.ch)
²Department of Mathematics, Imperial College London, London, UK
³Institute of Statistics, Karlsruhe Institute of Technology, Karlsruhe, Germany

Recent AI weather models outperform traditional physics-based weather prediction models on many benchmarks. Their evaluation is mostly restricted to point-wise metrics such as the mean squared error and therefore does not assess whether the joint multivariate behavior is well captured. Since AI weather models do not rely on any physical laws, there are strong concerns, and first indications, that the forecasted fields lack physical consistency in terms of spatial coherence and energy constraints. Verifying such constraints directly is, however, far from trivial.

We propose a Turing test for physicality that leverages the spread of an ensemble of pre-trained AI forecasting models. The main idea is that the epistemic uncertainty of these models is much larger under non-physical conditions than under physical conditions that were part of the training data. We combine this intuition with the theory of conformal inference to obtain a statistical test for physicality with finite-sample guarantees. Case studies on the 1963 Lorenz system show the effectiveness of the proposed approach in identifying conditions that lie outside its attractor. We then illustrate the applicability of our methodology to recent AI weather models.
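The combination of ensemble spread and conformal inference can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the score or the calibration scheme, so the code assumes a generic conformal p-value with the ensemble spread as nonconformity score. The names `ensemble_spread`, `conformal_p_value`, and `toy_ensemble_forecast` are hypothetical, and the toy ensemble (members that agree inside the unit ball and disagree outside it) merely stands in for pre-trained forecasting models.

```python
import numpy as np


def ensemble_spread(forecasts):
    """Root-mean variance across members; forecasts has shape (n_members, n_features)."""
    return float(np.sqrt(np.mean(np.var(forecasts, axis=0))))


def conformal_p_value(calibration_scores, test_score):
    """Conformal p-value: (1 + #{calibration score >= test score}) / (n + 1).

    If the test score is exchangeable with the calibration scores,
    P(p <= alpha) <= alpha holds for any finite calibration size n.
    """
    cal = np.asarray(calibration_scores, dtype=float)
    return (1.0 + np.sum(cal >= test_score)) / (cal.size + 1.0)


# --- Toy stand-in for an ensemble (assumption, not an AI weather model) ---
rng = np.random.default_rng(0)
member_weights = rng.normal(size=(10, 1))  # 10 hypothetical ensemble members


def toy_ensemble_forecast(x):
    """Members nearly agree for ||x|| <= 1 ("physical") and diverge beyond it."""
    x = np.asarray(x, dtype=float)
    excess = max(0.0, np.linalg.norm(x) - 1.0)
    return x * (1.0 + 0.01 * member_weights) + member_weights * excess


# Calibration set: 199 "physical" states drawn inside the unit ball.
directions = rng.normal(size=(199, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
radii = rng.uniform(0.0, 1.0, size=(199, 1))
calibration_states = directions * radii
calibration_scores = [
    ensemble_spread(toy_ensemble_forecast(s)) for s in calibration_states
]

# Test states: one inside the "physical" region, one outside it.
p_physical = conformal_p_value(
    calibration_scores, ensemble_spread(toy_ensemble_forecast([0.1, 0.0, 0.0]))
)
p_nonphysical = conformal_p_value(
    calibration_scores, ensemble_spread(toy_ensemble_forecast([2.0, 0.0, 0.0]))
)

alpha = 0.05
print("physical state rejected:", p_physical <= alpha)
print("non-physical state rejected:", p_nonphysical <= alpha)
```

In this sketch the smallest attainable p-value is 1/(n + 1), so the calibration set must contain at least 1/alpha - 1 states for a rejection at level alpha to be possible. The finite-sample guarantee relies only on exchangeability of the scores, not on the ensemble being well calibrated.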

How to cite: Engelke, S., Gnecco, N., Froelich, M., Hentschel, M., and Zhang, Z.: A Turing test for physicality in AI weather models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18411, https://doi.org/10.5194/egusphere-egu26-18411, 2026.