EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Data-Driven Cloud Cover Parameterizations

Arthur Grundner1,2, Tom Beucler3, Pierre Gentine2, Marco A. Giorgetta4, Fernando Iglesias-Suarez1, and Veronika Eyring1,5
  • 1Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR), Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany
  • 2Center for Learning the Earth with Artificial Intelligence And Physics (LEAP), Columbia University, New York, NY, USA
  • 3Institute of Earth Surface Dynamics, University of Lausanne, Lausanne, Switzerland
  • 4Max Planck Institute for Meteorology, Hamburg, Germany
  • 5University of Bremen, Institute of Environmental Physics (IUP), Bremen, Germany

A promising approach to improve cloud parameterizations within climate models, and thus climate projections, is to train machine learning algorithms on storm-resolving model (SRM) output. The ICOsahedral Non-hydrostatic (ICON) modeling framework permits simulations ranging from numerical weather prediction to climate projections, making it an ideal target for developing data-driven parameterizations of sub-grid scale processes. Here, we systematically derive and evaluate the first data-driven cloud cover parameterizations using coarse-grained data from ICON SRM simulations. These parameterizations range from simple analytic models and symbolic regression fits to neural networks (NNs), populating a performance × complexity plane. In most models, we enforce sparsity and discourage correlated features by sequentially selecting features based on the models' performance gains. Guided by a set of physical constraints, we use symbolic regression to find a novel equation to parameterize cloud cover. The equation represents a good compromise between performance and complexity, achieving the highest performance (R^2 > 0.9) for its complexity (13 trainable parameters). To model sub-grid scale cloud cover in its full complexity, we also develop three different types of NNs that differ in the degree of vertical locality they assume for diagnosing cloud cover from coarse-grained atmospheric state variables. Using the game-theory-based interpretability library SHapley Additive exPlanations (SHAP), we analyze our most non-local NN and identify an overemphasis on specific humidity and cloud ice as the reason why it cannot perfectly generalize from the global to the regional coarse-grained SRM data. The interpretability tool also helps visualize similarities and differences in feature importance between regionally and globally trained NNs, and reveals a local relationship between their cloud cover predictions and the thermodynamic environment.
Our results show the potential of deep learning and symbolic regression to derive accurate yet interpretable cloud cover parameterizations from SRMs.
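
The sequential feature selection mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear model, the synthetic data, the stopping threshold min_gain, and the function names (r2_score, fit_predict_linear, forward_select) are all assumptions chosen for illustration. The sketch greedily adds the feature that most improves validation R^2 and stops once the gain becomes negligible, which keeps the model sparse and tends to skip features that are strongly correlated with ones already selected.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_predict_linear(X_train, y_train, X_val):
    """Least-squares linear fit with intercept (stand-in for any model)."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    A_val = np.column_stack([X_val, np.ones(len(X_val))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_val @ coef


def forward_select(X_train, y_train, X_val, y_val, max_features, min_gain=1e-3):
    """Greedy forward selection driven by validation R^2 gains."""
    selected, scores = [], []
    best_so_far = 0.0  # roughly the R^2 of a constant mean predictor
    remaining = list(range(X_train.shape[1]))
    while remaining and len(selected) < max_features:
        # Score every candidate feature added to the current selection.
        trial = {}
        for j in remaining:
            cols = selected + [j]
            pred = fit_predict_linear(X_train[:, cols], y_train, X_val[:, cols])
            trial[j] = r2_score(y_val, pred)
        best_j = max(trial, key=trial.get)
        # Stop when the best candidate no longer improves performance enough.
        if trial[best_j] - best_so_far < min_gain:
            break
        selected.append(best_j)
        scores.append(trial[best_j])
        best_so_far = trial[best_j]
        remaining.remove(best_j)
    return selected, scores


# Synthetic data (hypothetical): the target depends on features 0 and 3;
# feature 1 is a noisy copy of feature 0, i.e. strongly correlated with it.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
X[:, 1] = X[:, 0] + 0.3 * rng.normal(size=2000)
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + 0.1 * rng.normal(size=2000)

X_train, X_val = X[:1500], X[1500:]
y_train, y_val = y[:1500], y[1500:]

selected, scores = forward_select(X_train, y_train, X_val, y_val, max_features=5)
print("selected features:", selected)
print("validation R^2 after each step:", np.round(scores, 3))
```

In this toy setup the correlated copy (feature 1) adds almost nothing once feature 0 is selected, so the gain threshold excludes it. The same loop applies to any model class by swapping out fit_predict_linear.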

How to cite: Grundner, A., Beucler, T., Gentine, P., Giorgetta, M. A., Iglesias-Suarez, F., and Eyring, V.: Data-Driven Cloud Cover Parameterizations, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-6306, 2023.