Reassessing the Scaling of AI-Powered Climate Models Against Dynamical Counterparts

Tom Beucler; David Neelin; Hui Su; Christopher Bretherton; Will Chapman; Costa Christopoulos; Aditya Grover; Ignacio Lopez-Gomez; Tapio Schneider; Adam Subel; Oliver Watt-Meyer; Laure Zanna

doi:https://doi.org/10.5194/egusphere-egu26-18440

EGU26-18440, updated on 14 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Tom Beucler¹, David Neelin², Hui Su³, Christopher Bretherton⁴, Will Chapman⁵, Costa Christopoulos⁶, Aditya Grover², Ignacio Lopez-Gomez⁷, Tapio Schneider⁶, Adam Subel⁸, Oliver Watt-Meyer⁴, and Laure Zanna⁸

¹University of Lausanne, Faculty of Geosciences and Environment, Lausanne, Switzerland (tom.beucler@unil.ch)
²University of California, Los Angeles, USA
³The Hong Kong University of Science and Technology, Hong Kong SAR, China
⁴Allen Institute for Artificial Intelligence (AI2), Seattle, USA
⁵University of Colorado, Boulder, USA
⁶California Institute of Technology, Pasadena, USA
⁷Google Research, San Francisco, USA
⁸New York University, New York, USA

Are AI-powered climate models intrinsically more efficient than traditional climate models?

Although further progress is needed before they become operational, hybrid AI-physics climate models and AI emulators of climate models could sharply reduce inference cost relative to traditional CPU-based models, allowing larger ensembles that explore more scenarios and sharpen uncertainty estimates. This apparent efficiency is less clear-cut, however, when the comparison includes GPU-ported dynamical climate models and when efficiency is assessed against the effective complexity of the simulated climate system.

As a first step, and recognizing that a perfect apples-to-apples comparison is rarely possible from reported configurations, we synthesize reported performance for leading AI climate model emulators (e.g., ACE2, CAMulator), hybrid AI-physics models (e.g., CliMA, NeuralGCM), and GPU-accelerated traditional models (e.g., SCREAM, ICON). We examine two complementary scaling views. The first compares throughput (simulated years per day), normalized per accelerator (GPU or TPU) and per prognostic variable, as a function of horizontal grid spacing. The second compares the same normalized throughput against an effective complexity proxy, defined as the number of vertical levels divided by the product of the time step and the squared horizontal grid spacing. This proxy accounts for the simulated vertical structure and, importantly, for the time-step constraints imposed by numerical stability.
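The two normalizations described above can be written down in a few lines. The following is a minimal sketch, not the authors' code: the class name, the field names, the units (kilometres and seconds), and the numbers in the example are illustrative assumptions and do not correspond to any reported model configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for one reported model configuration."""
    sypd: float              # throughput, simulated years per wall-clock day
    n_accelerators: int      # number of GPUs or TPUs used
    n_prognostic_vars: int   # number of prognostic variables
    dx_km: float             # horizontal grid spacing (km; unit is an assumption)
    dt_s: float              # model time step (s; unit is an assumption)
    n_levels: int            # number of vertical levels


def normalized_throughput(cfg: ModelConfig) -> float:
    """Throughput per accelerator and per prognostic variable."""
    return cfg.sypd / (cfg.n_accelerators * cfg.n_prognostic_vars)


def effective_complexity(cfg: ModelConfig) -> float:
    """Proxy from the text: levels / (time step * grid spacing squared)."""
    return cfg.n_levels / (cfg.dt_s * cfg.dx_km ** 2)


# Made-up numbers for illustration only, not taken from any model.
example = ModelConfig(sypd=100.0, n_accelerators=4, n_prognostic_vars=5,
                      dx_km=100.0, dt_s=1000.0, n_levels=50)
print(normalized_throughput(example))  # 100 / (4 * 5) = 5.0
print(effective_complexity(example))   # 50 / (1000 * 100**2) = 5e-06
```

With this definition, the proxy grows when the grid is refined, the time step is shortened, or vertical levels are added, so a model that takes long steps on a coarse grid sits at low effective complexity regardless of its raw throughput.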

We find that AI-powered models can show favorable apparent scaling with horizontal resolution in raw throughput, but that the advantage becomes modest once effective complexity is accounted for: at comparable complexity, AI climate models do not appear intrinsically more efficient than GPU-ported dynamical models. Hybrid approaches occupy a distinct middle ground. Their acceleration and added value come primarily from learned parameterizations that improve the representation of unresolved processes, while the overall model retains a physically based dynamical core, including explicit conservation laws. AI climate model emulators, by contrast, offer their clearest computational advantage through task-targeted prediction, in which a limited set of climate-relevant variables is simulated directly on the grid of interest. This avoids integrating the full high-frequency, multivariate state at the short time step traditionally required for numerical stability, which is especially advantageous when a coarser emulator is trained to emulate a fine-resolution reference model. Where observations are available, downscaling or targeted post-processing strategies can further substitute for explicit fine-scale resolution, enabling inexpensive local or hazard-specific risk assessment at decadal to multi-decadal time horizons.
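One common way to summarize a scaling relationship of this kind is the slope of a straight-line fit in log-log space, where a slope of -1 would mean that normalized throughput falls in inverse proportion to effective complexity. The sketch below is an assumption about how such a comparison could be made, not the method used in this work; the function name is hypothetical and the data are synthetic, generated from an exact inverse law purely to exercise the fit.

```python
import math


def loglog_slope(x, y):
    """Ordinary least-squares slope of log(y) against log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    return sxy / sxx


# Synthetic values only: throughput constructed as 8 / complexity.
complexity = [1.0, 10.0, 100.0, 1000.0]
throughput = [8.0 / c for c in complexity]

slope = loglog_slope(complexity, throughput)
print(round(slope, 6))  # close to -1 for this synthetic inverse law
```

Fitting each model family separately and comparing slopes and offsets at matched complexity would be one way to check whether an apparent advantage in raw throughput survives the normalization.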

How to cite: Beucler, T., Neelin, D., Su, H., Bretherton, C., Chapman, W., Christopoulos, C., Grover, A., Lopez-Gomez, I., Schneider, T., Subel, A., Watt-Meyer, O., and Zanna, L.: Reassessing the Scaling of AI-Powered Climate Models Against Dynamical Counterparts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18440, https://doi.org/10.5194/egusphere-egu26-18440, 2026.

Supplementary materials

supplementary materials version 1 – uploaded on 05 May 2026, no comments