- ¹McGill University, Montreal, QC, Canada
- ²Mila - Quebec AI Institute, Montreal, Canada
- ³University of Cambridge, Cambridge, United Kingdom
- ⁴Karlsruhe Institute of Technology, Karlsruhe, Germany
- ⁵Intel Labs, Tel Aviv, Israel
Projecting possible future climates with models is essential for understanding the causes and implications of anthropogenic climate change. Earth system models currently provide the most complete description of the Earth system, but they are computationally expensive. Simpler models (emulators) are therefore useful for exploring the large space of possible future climate scenarios and for generating large ensembles. One class of emulators comprises simple climate models (SCMs), which model the Earth system with simplified physics. A second class comprises statistical models, which learn relationships directly from correlations in climate model data. In this preliminary work, we seek to combine the physical grounding of SCMs with the benefits of purely statistical emulators, using tools from causal representation learning. The resulting causal climate emulator may allow exploration of the effects of various interventions on the Earth system, including changes in forcings.
The goal of causal representation learning (CRL) is to simultaneously learn low-dimensional latent representations from high-dimensional data and a causal graph between these latent representations. In the context of climate model data, we aim to infer, from fine-grid climate model data, latent variables representing regions with shared climate variability, together with causal teleconnections between these regions that represent climate dynamics. We build on recent work by Boussard et al., which illustrated how a CRL method, Causal Discovery with Single-parent Decoding (CDSD), may be used for this task. CDSD is a continuous optimization method that learns a distribution over latent variables such that every grid-point observation is driven by a single latent variable, while also learning a causal graph between these latents.
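The single-parent structure described above can be illustrated with a small synthetic example. The following is only a minimal sketch under stated assumptions, not the CDSD implementation: it assumes a linear decoder, a linear lag-1 latent model, and Gaussian noise, and all names (`make_single_parent_decoder`, `simulate`, `G`, `A`) are hypothetical. It generates data in which every grid point loads on exactly one latent variable and the latents influence each other through a fixed directed graph.

```python
import numpy as np

def make_single_parent_decoder(n_grid, n_latents, rng):
    """Build a (n_grid, n_latents) decoder with exactly one nonzero per row.

    Each row is a grid point; its single nonzero entry marks the one latent
    variable that drives it (the "single parent").
    """
    parent = rng.integers(0, n_latents, size=n_grid)  # assumed random assignment
    weights = rng.uniform(0.5, 1.5, size=n_grid)      # strictly nonzero loadings
    W = np.zeros((n_grid, n_latents))
    W[np.arange(n_grid), parent] = weights
    return W, parent

def simulate(n_steps, W, A, noise_std, rng):
    """Simulate latents z_t = A z_{t-1} + noise and observations x_t = W z_t + noise."""
    n_grid, n_latents = W.shape
    z = np.zeros((n_steps, n_latents))
    z[0] = rng.normal(size=n_latents)
    for t in range(1, n_steps):
        z[t] = A @ z[t - 1] + noise_std * rng.normal(size=n_latents)
    x = z @ W.T + noise_std * rng.normal(size=(n_steps, n_grid))
    return z, x

rng = np.random.default_rng(0)
n_grid, n_latents = 40, 4
W, parent = make_single_parent_decoder(n_grid, n_latents, rng)

# Assumed latent graph: G[i, j] = 1 means latent j at time t-1 drives latent i at time t.
G = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
A = 0.4 * G  # small coefficients keep the latent dynamics stable

z, x = simulate(200, W, A, noise_std=0.1, rng=rng)
```

In this toy setting the grouping of grid points by `parent` plays the role of regions with shared variability, and the nonzero pattern of `G` plays the role of the causal links between them; a CRL method would have to recover both from `x` alone.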
We illustrate that, on surface fields of monthly pre-industrial climate model data, CDSD learns physically reasonable latent variables, but learning a robust causal graph between the latent variables remains a challenge. We evaluate our models on synthetic data that approximate the spatiotemporal structures observed in climate model data. By autoregressively rolling out the learned generative model, we can then generate an ensemble of future climate trajectories. We develop a Bayesian filter to maintain a constant spatial spectrum throughout the autoregressive rollout, and show that it leads to stable climate prediction. Finally, we explore approaches for including the effect of forcings such as greenhouse gases in the model.
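To make the idea of a spectrum-constrained autoregressive rollout concrete, the sketch below uses a deliberately simple stand-in: a deterministic rescaling of Fourier amplitudes to a fixed target after each step. This is not the Bayesian filter mentioned above, whose details are not given here. The one-dimensional periodic domain, the smoothing-plus-noise `toy_step` model, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def match_spectrum(field, target_amp, eps=1e-12):
    """Rescale the Fourier amplitudes of a 1-D field to target_amp, keeping phases."""
    coeffs = np.fft.rfft(field)
    amp = np.abs(coeffs)
    coeffs = coeffs * (target_amp / np.maximum(amp, eps))
    return np.fft.irfft(coeffs, n=field.size)

def toy_step(x, rng):
    """Hypothetical one-step model: spatial smoothing plus small noise.

    Applied repeatedly without correction, the smoothing drains power from
    small spatial scales, so the spatial spectrum drifts during a rollout.
    """
    smoothed = 0.5 * x + 0.25 * (np.roll(x, 1) + np.roll(x, -1))
    return smoothed + 0.01 * rng.normal(size=x.size)

def rollout(x0, step, n_steps, target_amp, rng):
    """Autoregressive rollout: each prediction is corrected, then fed back in."""
    traj = [x0]
    for _ in range(n_steps):
        x_next = step(traj[-1], rng)
        traj.append(match_spectrum(x_next, target_amp))
    return np.stack(traj)

rng = np.random.default_rng(0)
n_points = 64
x0 = rng.normal(size=n_points)
target_amp = np.abs(np.fft.rfft(x0))  # spectrum to hold fixed during the rollout

# Ensemble of trajectories: same initial state, different noise realizations.
ensemble = np.stack([
    rollout(x0, toy_step, 50, target_amp, np.random.default_rng(seed))
    for seed in range(5)
])  # shape: (members, time steps + 1, grid points)
```

Holding the amplitudes fixed while leaving the phases free is the crudest way to enforce a constant spatial spectrum; a probabilistic filter would instead weigh the model prediction against the spectral constraint rather than imposing it exactly.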
How to cite: Boussard, J., Hickman, S., Trajkovic, I., Kaltenborn, J., Gurwicz, Y., Nowack, P., and Rolnick, D.: Causal climate emulation, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13307, https://doi.org/10.5194/egusphere-egu25-13307, 2025.