- 1 University of Lausanne, Faculty of Geosciences and Environment, Lausanne, Switzerland (tom.beucler@unil.ch)
- 2 University of California, Los Angeles, USA
- 3 The Hong Kong University of Science and Technology, Hong Kong SAR, China
- 4 Allen Institute for Artificial Intelligence (AI2), Seattle, USA
- 5 University of Colorado, Boulder, USA
- 6 California Institute of Technology, Pasadena, USA
- 7 Google Research, San Francisco, USA
- 8 New York University, New York, USA
Are AI-powered climate models intrinsically more efficient than traditional climate models?
While progress is still needed before they become operational, hybrid AI-physics climate models and AI emulators of climate models have the potential to sharply reduce inference cost relative to traditional CPU-based models, allowing larger ensembles to explore different scenarios and sharpen uncertainty estimation. Yet this apparent efficiency becomes less obvious when the comparison includes GPU-ported dynamical climate models, and when efficiency is assessed against the effective complexity of the simulated climate system.
As a first step, recognizing that a perfect apples-to-apples comparison is rarely possible from reported configurations, we synthesize reported performance for leading AI climate model emulators (e.g., ACE2, CAMulator), hybrid AI-physics models (e.g., CliMA, NeuralGCM), and GPU-accelerated traditional models (e.g., SCREAM, ICON). We examine two complementary scaling views. The first compares throughput (simulated years per day) per accelerator (GPUs or TPUs) and per prognostic variable, as a function of horizontal grid spacing. The second compares the same normalized throughput against an effective complexity proxy, defined as the number of vertical levels divided by the product of the time step and the squared horizontal grid spacing, to account for the simulated vertical structure and, importantly, time-step constraints imposed by numerical stability.
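The two quantities described above can be written down in a few lines. What follows is a minimal sketch, not the authors' code: the class and function names are hypothetical, the units (seconds for the time step, kilometres for the grid spacing) are assumptions, and "per accelerator and per prognostic variable" is read literally as division by both counts, which the abstract does not spell out. The numbers in the usage lines are arbitrary placeholders and do not describe any of the models named in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """One reported model configuration (hypothetical container)."""
    sypd: float             # throughput in simulated years per wall-clock day
    n_accelerators: int     # number of GPUs or TPUs used for the run
    n_prognostic: int       # number of prognostic variables
    n_levels: int           # number of vertical levels
    time_step_s: float      # model time step; seconds is an assumed unit
    grid_spacing_km: float  # horizontal grid spacing; km is an assumed unit


def normalized_throughput(cfg: ModelConfig) -> float:
    """Throughput per accelerator and per prognostic variable.

    Assumption: "per" is taken as plain division by both counts.
    """
    return cfg.sypd / (cfg.n_accelerators * cfg.n_prognostic)


def effective_complexity(cfg: ModelConfig) -> float:
    """Proxy: vertical levels / (time step * horizontal grid spacing squared)."""
    return cfg.n_levels / (cfg.time_step_s * cfg.grid_spacing_km ** 2)


# Placeholder values chosen only to exercise the functions.
toy = ModelConfig(sypd=10.0, n_accelerators=4, n_prognostic=5,
                  n_levels=8, time_step_s=100.0, grid_spacing_km=2.0)
print(normalized_throughput(toy), effective_complexity(toy))
```

With these definitions, the first scaling view plots `normalized_throughput` against `grid_spacing_km`, and the second plots it against `effective_complexity`. Because the proxy carries mixed units, only comparisons between configurations expressed in the same units are meaningful.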
We find that AI-powered models can show favorable apparent scaling with horizontal resolution in raw throughput, but that the advantage becomes modest once effective complexity is accounted for: at comparable complexity, AI climate models do not appear intrinsically more efficient than GPU-ported dynamical models. Hybrid approaches occupy a distinct middle ground: their acceleration and added value come primarily from learned parameterizations that improve the representation of unresolved processes while the overall model retains a physically based dynamical core, including explicit conservation laws. AI climate model emulators, by contrast, offer their clearest computational advantage through task-targeted prediction, where a limited set of climate-relevant variables can be directly simulated on the grid of interest. This avoids integrating the full high-frequency, multivariate state at the short time step traditionally required for numerical stability, which is especially advantageous when emulating a fine-resolution reference model with a coarser emulator. Diverse downscaling or targeted post-processing strategies can further substitute for explicit fine-scale resolution when observations are available, enabling inexpensive local or hazard-specific risk assessment at decadal to multi-decadal time horizons.
How to cite: Beucler, T., Neelin, D., Su, H., Bretherton, C., Chapman, W., Christopoulos, C., Grover, A., Lopez-Gomez, I., Schneider, T., Subel, A., Watt-Meyer, O., and Zanna, L.: Reassessing the Scaling of AI-Powered Climate Models Against Dynamical Counterparts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18440, https://doi.org/10.5194/egusphere-egu26-18440, 2026.