Causal discovery from equation discovery

Gustau Camps-Valls; Roger Guimerà; Gherardo Varando; Emiliano Diaz; Kai-Hendrik Cohrs; Marta Sales-Pardo

doi:https://doi.org/10.5194/egusphere-egu26-13632

Session ITS1.8/CL0.2

EGU26-13632, updated on 14 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Gustau Camps-Valls¹, Roger Guimerà², Gherardo Varando¹, Emiliano Diaz¹, Kai-Hendrik Cohrs¹, and Marta Sales-Pardo²

¹Image Processing Laboratory (IPL), Universitat de València, València, Spain (gustau.camps@uv.es)
²Universitat Rovira i Virgili, Tarragona, Spain

Reliable causal inference is a central challenge in Earth and climate sciences: observational records are limited, interventions are rare or impossible, and process representations in models rely on parametrizations that can introduce strong asymmetries between variables and the causal mechanisms [1,2]. Leveraging these asymmetries, rather than treating them as nuisances, can offer a principled route to causal discovery that is directly aligned with scientific modeling practice [2].

We address bivariate causal discovery from the standpoint of equation discovery using the Bayesian Machine Scientist (BMS) framework [3]. Our key contribution is to formalize the theoretical link between Symbolic Regression (SR) and Algorithmic Information Theory (AIT) via the Minimum Description Length (MDL) principle: the more plausible causal direction is the one that admits a shorter joint description in terms of a mechanism plus independent inputs [4]. Building on this connection, we characterize the mathematical properties of the resulting causal criterion, including identifiability and asymptotic consistency, and we analyze the role of core assumptions—most notably the Principle of Independent Causal Mechanisms (ICM)—in the context of geophysical data and climate-model parametrizations [5].
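As a rough illustration of the description-length comparison described above, the following Python sketch scores each direction as the code length of the putative cause plus the code length of the effect given the cause. It is not the BMS and not the criterion analyzed in this work: polynomial regression stands in for symbolic regression, a Gaussian likelihood with a BIC-style parameter cost stands in for the description length, and all function names (`gaussian_code_length`, `mechanism_code_length`, `infer_direction`) and the synthetic data are assumptions made for this example only.

```python
import numpy as np


def gaussian_code_length(values, n_params):
    """Approximate code length in nats: Gaussian negative log-likelihood at the
    fitted variance plus a BIC-style cost of 0.5*log(n) per parameter.
    Based on differential entropy, so the value can be negative."""
    n = values.size
    var = max(float(np.var(values)), 1e-12)  # guard against log(0)
    nll = 0.5 * n * (np.log(2.0 * np.pi * var) + 1.0)
    return nll + 0.5 * n_params * np.log(n)


def mechanism_code_length(cause, effect, max_degree=5):
    """Shortest code length of effect given cause over polynomial degrees.
    ASSUMPTION: polynomials replace the symbolic expressions a real
    equation-discovery method would search over."""
    best = np.inf
    for degree in range(1, max_degree + 1):
        coeffs = np.polyfit(cause, effect, degree)
        residuals = effect - np.polyval(coeffs, cause)
        # degree + 1 polynomial coefficients plus one noise variance
        best = min(best, gaussian_code_length(residuals, degree + 2))
    return best


def standardize(v):
    """Remove location and scale so that units do not drive the comparison."""
    return (v - v.mean()) / v.std()


def infer_direction(x, y, max_degree=5):
    """Compare L(X) + L(Y|X) against L(Y) + L(X|Y) and return the shorter one."""
    x = standardize(np.asarray(x, dtype=float))
    y = standardize(np.asarray(y, dtype=float))
    # With standardized data and a Gaussian marginal code, the two input terms
    # coincide, so the decision rests on the mechanism term in this toy setup.
    length_xy = gaussian_code_length(x, 2) + mechanism_code_length(x, y, max_degree)
    length_yx = gaussian_code_length(y, 2) + mechanism_code_length(y, x, max_degree)
    direction = "X -> Y" if length_xy < length_yx else "Y -> X"
    return direction, length_xy, length_yx


# Synthetic example (hypothetical data): X drives Y through a cubic mechanism.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=500)
y = x**3 + rng.normal(scale=0.1, size=500)

direction, length_xy, length_yx = infer_direction(x, y)
print(direction, round(length_xy, 1), round(length_yx, 1))
```

In this toy setting the forward mechanism is exactly a cubic polynomial, whereas the inverse (a noisy cube root) is poorly captured by a low-degree polynomial, so the forward direction receives the shorter description. A richer model class and a proper marginal code would be needed for the input term to play a role.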

We demonstrate the approach on simulated benchmarks and on real Earth-system examples covering both i.i.d. settings and time-series climate data. The results illustrate when and why asymmetric parametrizations help disambiguate causal direction, and they provide a practical pathway to turn discovered governing equations into testable causal hypotheses for Earth and climate science.

References

[1] Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of causal inference: foundations and learning algorithms. The MIT Press, 2017.

[2] Gustau Camps-Valls, Andreas Gerhardus, Urmi Ninad, Gherardo Varando, Georg Martius, Emili Balaguer-Ballester, Ricardo Vinuesa, Emiliano Diaz, Laure Zanna, and Jakob Runge. Discovering causal relations and equations from data. Physics Reports, 1044:1–68, 2023.

[3] Roger Guimerà, Ignasi Reichardt, Antoni Aguilar-Mogas, Francesco A. Massucci, Manuel Miranda, Jordi Pallares, and Marta Sales-Pardo. A Bayesian machine scientist to aid in the solution of challenging scientific problems. Science Advances, 6(5):eaav6971, 2020.

[4] Dominik Janzing, Joris Mooij, Kun Zhang, Jan Lemeire, Jakob Zscheischler, Povilas Daniūsis, Bastian Steudel, and Bernhard Schölkopf. Information-geometric approach to inferring causal directions. Artificial Intelligence, 182:1–31, 2012.

[5] Sascha Xu, Sarah Mameche, and Jilles Vreeken. Information-theoretic causal discovery in topological order. In The 28th International Conference on Artificial Intelligence and Statistics, 2025.

How to cite: Camps-Valls, G., Guimerà, R., Varando, G., Diaz, E., Cohrs, K.-H., and Sales-Pardo, M.: Causal discovery from equation discovery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13632, https://doi.org/10.5194/egusphere-egu26-13632, 2026.