EGU24-10759, updated on 08 Mar 2024
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Is linear regression all you need? Clarifying use-cases for deep learning in climate emulation

Björn Lütjens1, Noelle Selin1,6, Andre Souza1, Gosha Geogdzhayev2, Dava Newman3, Paolo Giani1, Claudia Tebaldi4, Duncan Watson-Parris5, and Raffaele Ferrari1
  • 1Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology
  • 2Department of Physics, Massachusetts Institute of Technology
  • 3MIT Media Lab
  • 4Pacific Northwest National Laboratory
  • 5Scripps Institution of Oceanography, University of California San Diego
  • 6Institute of Data, Systems and Society, Massachusetts Institute of Technology

Motivation. Climate models are so computationally expensive that each model is run for only a small, selected set of assumptions. In policy making, this computational cost makes it difficult to rapidly explore the comparative impact of climate policies, such as quantifying the projected difference in local climate impacts between a €30 and a €45 price on carbon (Lütjens, 2023). Recently, however, machine learning (ML) models have been used to emulate climate models; these emulators can rapidly interpolate within existing climate datasets.

Related Works. Several deep learning models have been developed to emulate the impact of greenhouse gas emissions on climate variables such as temperature and precipitation. Currently, the foundation model ClimaX, with O(100M-1B) parameters, is considered the best performer on the benchmark datasets ClimateSet and ClimateBenchv1.0 (Kaltenborn et al., 2023; Nguyen et al., 2023; Watson-Parris et al., 2022).

Results. We show that linear pattern scaling, a simple method with O(10K) parameters, is at least on par with the best models for some climate variables, as shown in Fig. 1. In particular, the annually averaged and locally resolved surface temperature, precipitation, and 90th-percentile precipitation in ClimateBenchv1.0 can be estimated well with linear pattern scaling. Our results reaffirm that temperature-dependent climate variables have a mostly linear relationship to cumulative CO2 emissions.
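
As a rough check on the O(10K) parameter count, the following sketch assumes that pattern scaling stores one slope and one intercept per grid cell plus one global slope and intercept, and that the 2.5° grid is regular in latitude and longitude (72 × 144 cells). Both are assumptions for illustration, not details stated in this abstract.

```latex
N_{\text{params}} = 2\,N_{\text{lat}}\,N_{\text{lon}} + 2
                  = 2 \cdot 72 \cdot 144 + 2
                  = 20{,}738 \approx 2 \times 10^{4}
```

Under these assumptions the count is on the order of 10^4, several orders of magnitude below the O(100M-1B) parameters quoted for ClimaX.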

As a next step, we will identify the complex climate emulation tasks that are not addressed by linear models and might benefit from deep learning research. To do so, we will plot the data complexity per climate variable and discuss the difficulties that multiple spatiotemporal scales, irreversible dynamics, and internal variability pose for ML. We will conclude with a list of tasks that demand more advanced ML models.

Conclusion. Most of the ML-based climate emulation efforts have focused on variables that can be well approximated by linear regression models. Our study reveals the solved and unsolved problems in climate emulation and provides guidance for future research directions.

Data and Methods. We use the ClimateBenchv1.0 dataset and will show additional results on ClimateSet and on a CMIP climate model with many ensemble members. Our model first fits one linear regression that maps cumulative CO2 emissions, co2(t), to globally and annually averaged surface temperature, tas(t). It then fits one linear regression per grid cell that maps tas(t) to local surface temperature at 2.5° resolution. The model is time-independent and uses only co2(t) as input. Our analysis will be available at
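
The two-stage fit described in Data and Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released analysis: the class name `PatternScaling`, the helper `fit_line`, the toy 4 × 6 grid, and the noise-free synthetic data are all hypothetical and chosen only so that the example runs on its own.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept.

    x has shape (T,). y has shape (T,) or (T, ...); each trailing
    element of y gets its own slope and intercept.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.stack([x, np.ones_like(x)], axis=1)        # (T, 2)
    coef, *_ = np.linalg.lstsq(design, y.reshape(len(x), -1), rcond=None)
    slope = coef[0].reshape(y.shape[1:])
    intercept = coef[1].reshape(y.shape[1:])
    return slope, intercept


class PatternScaling:
    """Hypothetical two-stage linear emulator: co2(t) -> tas(t) -> local tas."""

    def fit(self, co2, tas_global, tas_local):
        # Stage 1: one regression from cumulative CO2 to global-mean temperature.
        self.g_slope, self.g_icpt = fit_line(co2, tas_global)
        # Stage 2: one regression per grid cell from global-mean to local temperature.
        self.l_slope, self.l_icpt = fit_line(tas_global, tas_local)
        return self

    def predict(self, co2):
        # Only co2(t) is needed at prediction time; each year is treated independently.
        co2 = np.asarray(co2, dtype=float)
        tas_global = self.g_slope * co2 + self.g_icpt                        # (T,)
        tas_local = self.l_slope * tas_global[:, None, None] + self.l_icpt  # (T, lat, lon)
        return tas_global, tas_local


# Synthetic, noise-free toy data (assumed values, not climate model output).
rng = np.random.default_rng(0)
n_years, n_lat, n_lon = 50, 4, 6
co2 = np.linspace(0.0, 3000.0, n_years)             # cumulative emissions, arbitrary units
tas_global = 0.0005 * co2 + 0.1                      # global-mean warming
pattern = rng.uniform(0.5, 2.0, size=(n_lat, n_lon)) # local warming per unit global warming
offset = rng.uniform(-0.2, 0.2, size=(n_lat, n_lon))
tas_local = pattern * tas_global[:, None, None] + offset

model = PatternScaling().fit(co2, tas_global, tas_local)
pred_global, pred_local = model.predict(co2)
print("max abs error:", np.abs(pred_local - tas_local).max())
```

Because the toy data are exactly linear, the fit recovers the synthetic pattern up to floating-point error; with real model output the residual would reflect internal variability and any nonlinear response.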


Kaltenborn, J. et al. (2023). ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning, in NeurIPS Datasets and Benchmarks.

Lütjens, B. (2023). Deep Learning Emulators for Accessible Climate Projections, Thesis, Massachusetts Institute of Technology.

Nguyen, T. et al. (2023). ClimaX: A foundation model for weather and climate, in ICML.

Watson-Parris, D. et al. (2022). ClimateBenchv1.0: A Benchmark for Data-Driven Climate Projections, in JAMES.

How to cite: Lütjens, B., Selin, N., Souza, A., Geogdzhayev, G., Newman, D., Giani, P., Tebaldi, C., Watson-Parris, D., and Ferrari, R.: Is linear regression all you need? Clarifying use-cases for deep learning in climate emulation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10759, 2024.
