ITS1.11/NP4.2 | Many shades of causality analysis in Earth Sciences: Methods, challenges and applications
EDI
Convener: Milan Palus | Co-conveners: Aditi Kathpalia, Marlene Kretschmer (ECS), Evgenia Galytska (ECS), Rebecca Herman (ECS), Fernando Iglesias-Suarez (ECS), Stéphane Vannitsem
Orals | Thu, 18 Apr, 16:15–18:00 (CEST) | Room N2
Posters on site | Attendance Thu, 18 Apr, 10:45–12:30 (CEST) | Display Thu, 18 Apr, 08:30–12:30 | Hall X3
Posters virtual | Attendance Thu, 18 Apr, 14:00–15:45 (CEST) | Display Thu, 18 Apr, 08:30–18:00 | vHall X3
Scientific disciplines strive to explain the causes of observed phenomena. In Earth sciences, in particular in climate research, the notion of causality is discussed and understood from several different points of view. Hannart et al. (BAMS, 2016), following Judea Pearl, state that “Causal counterfactual theory provides clear semantics and sound logic for causal reasoning and may help foster research on, and clarify dissemination of, weather and climate-related event attribution.” Changing focus from explanation of single events to understanding phenomena evolving in time, represented by time series, causality can be understood in terms of improved predictability, as proposed by Norbert Wiener and formulated for time series by C.W.J. Granger. Granger causality has been further generalized for nonlinear systems using methods rooted in information theory. Extensions from bivariate to multivariate time series can also point to indirect causations. X. S. Liang and R. Kleeman derive formulas for information flows based on dynamical equations. The Wiener-Granger concept of improved predictability has been translated into computer science as compressibility changes in effect data due to knowledge of cause data. The information-theoretic formulation of Granger causality and other methods have recently been adapted for complex systems with multiple time scales and/or heavy-tailed probability distributions and extreme events. Methods for turning multivariate data into causal graphs based on Bayesian reasoning and machine learning are also intensively applied in the Earth sciences.
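
The Wiener-Granger idea of causality as improved predictability, mentioned above, can be illustrated with a minimal linear sketch. It compares the residual variance of an autoregressive model of the effect with and without the past of the candidate cause. The function names, the model order `p` and the simulated data are illustrative assumptions, not part of any contribution to this session, and a real analysis would add a significance test.

```python
import numpy as np

def lagged_design(series_list, p):
    """Design matrix with an intercept and lags 1..p of every series."""
    n = len(series_list[0])
    cols = [np.ones(n - p)]
    for s in series_list:
        for lag in range(1, p + 1):
            cols.append(s[p - lag:n - lag])  # value at time t - lag for t = p..n-1
    return np.column_stack(cols)

def granger_log_ratio(cause, effect, p=2):
    """ln(residual variance without the cause / residual variance with it).

    Values near zero mean the past of `cause` does not improve the linear
    prediction of `effect`; larger values mean it does.
    """
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    target = effect[p:]
    restricted = lagged_design([effect], p)
    full = lagged_design([effect, cause], p)
    res_r = target - restricted @ np.linalg.lstsq(restricted, target, rcond=None)[0]
    res_f = target - full @ np.linalg.lstsq(full, target, rcond=None)[0]
    return float(np.log(np.var(res_r) / np.var(res_f)))

if __name__ == "__main__":
    # Hypothetical toy system in which x drives y with a one-step delay.
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.standard_normal()
    print("x -> y:", granger_log_ratio(x, y))
    print("y -> x:", granger_log_ratio(y, x))
```

This linear version captures only the simplest case; the nonlinear, information-theoretic generalizations mentioned in the description replace the variance ratio with conditional mutual information.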

The session welcomes contributions discussing these diverse approaches to causality analysis in Earth sciences, with an emphasis on comparative discussions. Learning causal relationships from Earth system data is vital for understanding complex dynamics, predicting changes, and informing strategies. This session invites innovative approaches and case studies employing causal inference techniques across Earth sciences, fostering interdisciplinary discussions and encouraging the development of robust causal analysis frameworks. Topics include causal discovery methods, causal effect estimation, applications of causal inference to climate change, causal modeling, network analysis, and addressing challenges and limitations in applying causal inference to Earth system science.

Orals: Thu, 18 Apr | Room N2

Chairpersons: Milan Palus, Evgenia Galytska, Rebecca Herman
16:15–16:20
16:20–16:30 | EGU24-20548 | ITS1.11/NP4.2 | solicited | Highlight | On-site presentation
Michael Ghil, Alberto Carrassi, and Olivier de Viron

Causal inference is at the heart of the scientific method as usually practiced. Still, Karl Popper (The Logic of Scientific Discovery, 1935/1959) tells us that a theory in the empirical sciences can never be proven: it can only be falsified, meaning that it can, and should, be scrutinized with decisive experiments. Even so, nobody that I know writes or publishes papers to disprove their own theory, only an opposing one. And the debate rages on.

At the heart of this session lies the question of whether, and how, one can prove, rather than just disprove, a causal link between phenomena in the empirical sciences. The session deals specifically with statistical, as opposed to dynamical, methods. These methods have the advantage that they are essentially indifferent to any laws of, or other accumulated heuristic ideas on, the field to which they are being applied: whether the time series one considers are from the environmental sciences, biology or medicine does not matter; only their length and accuracy do.

Judea Pearl (e.g., Stat. Surveys, 2009) made an important observation on how to transcend the saying that “Correlation is not causation” by pointing out that standard methods of statistical analysis rely on the stationarity hypothesis of the phenomena being examined. Crucial questions, however, like the causal role of anthropogenic forcing in climate change, deal precisely with the causes of nonstationarity. In particular, Pearl suggested counterfactual analysis as an essential approach in establishing criteria for the necessary and sufficient character of a given cause for a given phenomenon. Thus, the common approach of detection and attribution in the climate sciences only covers the sufficiency aspect of anthropogenic forcing, and more can be done (Hannart et al., BAMS, 2016; Clim. Change, 2016).

The present talk will cover four specific aspects of these broad issues: (i) the distinction between information transfer, including both linear correlations and nonlinear extensions thereof, and true causation; (ii) the divergent results of some widely, and not so widely, used methods of studying information transfer (Krakovska et al., PRE, 2018; Kossakowski et al., Psychol. Methods, 2021; Delforge et al., HESS, 2022); (iii) shared variability of climatic time series (De Viron, GRL, 2013); and (iv) the uses of data assimilation in applying counterfactual theory to nonstationary phenomena (Carrassi, QJRMS, 2017; Metref et al., QJRMS, 2019).

Conclusions will include the obvious one that statistical studies of causal inference have to be complemented by dynamical ones.

How to cite: Ghil, M., Carrassi, A., and de Viron, O.: Some Thoughts on Causal Inference, the Scientific Method, and Data Assimilation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20548, https://doi.org/10.5194/egusphere-egu24-20548, 2024.

16:30–16:40 | EGU24-2948 | ITS1.11/NP4.2 | solicited | Highlight | Virtual presentation
Sebastian Engelke

The talk discusses a critical topic in climate science: understanding how interventions on our climate system influence the likelihood of extreme events. The focus is on methodologies that enable causal attribution of such events to specific drivers, rather than merely predicting their occurrence. We discuss common practices and highlight the use of recent statistical methods that are applicable when only observational data are available, as opposed to model-based data. The talk defines the concept of a causal effect of a treatment (such as changes in flood infrastructure or increased CO2 emissions) on extreme outcomes (like a one-in-100-year flood). We also cover the estimation of these effects amidst confounding factors and the assessment of associated uncertainties. Finally, we discuss the inherent challenges of applying causal inference to extreme climate events.

How to cite: Engelke, S.: Causal methods for climate extremes, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2948, https://doi.org/10.5194/egusphere-egu24-2948, 2024.

16:40–16:50 | EGU24-6535 | ITS1.11/NP4.2 | On-site presentation
Rui A. P. Perdigão

The present communication provides a contribution to an overarching cross-methodological causality investigation, encompassing a methodological synergy among physical, analytical, information-theoretic and systems intelligence approaches to causal discovery and quantification in complex system dynamics. These efforts methodologically lead to the emergence of a broader causal framework, valid not only in classical recurrence-based dynamical systems, but also on the generalized information physics of non-ergodic coevolutionary spatiotemporal complexity.

This study begins with a comprehensive cross-examination of causality metrics derived from these diverse domains, by synthesizing causality insights from information theory, which enables the quantification of information flow among variables; differential geometry, which captures the curvature and structure of causal relationships; dynamical systems, which analyze the temporal evolution of systems and associated kinematic geometric properties; and fundamental physical metrics, which elucidate causal connections in the physical world from fundamental thermodynamic principles. Through this analysis, we aim to deepen our understanding of causality in complex systems, with physical process understanding and geophysical applications in mind.

Having laid out some of the key methodological flavours of causality, the present communication introduces new metrics further contributing to a broader non-Shannonian information-theoretic causality pool of methods, along with additional advances on quantum thermodynamical, nonlinear statistical mechanical, differential geometric and topological approaches to causality. Positioning ourselves in a broader nonlinear non-Gaussian non-ergodic setting to tackle far-from-equilibrium structural-functional coevolution and synergistic emergence in complex system dynamics, our derivations further contribute to a new generation of information-theoretic, dynamical systems and non-equilibrium thermodynamic causality approaches, along with their synergistic articulation towards a unified framework. This brings out further cross-methodological comparability, portability and complementary insights on dealing with the intricate causality of complex multiscale far-from-equilibrium Earth system dynamic phenomena.

By unveiling manifold flavours of causality and their interconnections, this study brings out their commonalities, synergies, and further potential synergistic applications across disciplines. This interdisciplinary approach not only enhances our theoretical understanding of causality but also provides practical implications for applications in fields such as data science, network theory, and complex systems analysis, with direct relevance across the Earth system sciences and beyond.

How to cite: Perdigão, R. A. P.: Unfolding the Manifold Flavours of Causality, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6535, https://doi.org/10.5194/egusphere-egu24-6535, 2024.

16:50–17:00 | EGU24-12204 | ITS1.11/NP4.2 | On-site presentation
András Telcs

Causal inference is a challenging endeavor, particularly when applied to observational studies of interacting systems. Pearl's theory, along with the PC algorithm on directed acyclic graphs and its extensions PCMCI and FCI, provides powerful tools. However, their application to time series is time-consuming, and they still struggle to distinguish Markov-equivalent scenarios.

In our talk, we will present some methods based on principles that are partly or fully different from those underlying the aforementioned tools. Due to time constraints, we will focus on the main principles that allow the discovery of causal relations between a pair of systems, including hidden common causes (referred to as common drivers or confounders in different schools of thought), and will not delve into the numerous technical challenges.

How to cite: Telcs, A.: Some alternative metods for causal discovery, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12204, https://doi.org/10.5194/egusphere-egu24-12204, 2024.

17:00–17:10 | EGU24-4315 | ITS1.11/NP4.2 | On-site presentation
Carlos Pires, Stéphane Vannitsem, and David Docquier

We present a general theory for computing and estimating Shannon entropy-based information transfer in nonlinear stochastic systems driven by deterministic forcings and additive and/or multiplicative noises, by extending the Liang-Kleeman framework of causality inference to nonlinear cases. The method provides effective and computable formulas for the rates of information transfer between sets of causal and consequential system variables, relying on the evaluation of conditional expectations of the deterministic and stochastic forcings (Causal Sensitivity Method: CSM). The CSM can work with a) ensemble model runs, b) system time series in ergodic conditions and c) time series without a priori knowledge of model equations. The CSM also allows one to express the parcels of information transfer that are attributable either to one-to-one interactions or to synergies across groups of variables, and to assess where in the state space the information is most relevant. The CSM is tested in two proof-of-concept low-order models: 1) a nonlinear model derived from a potential function and 2) the classical chaotic Lorenz model, both forced by additive and/or multiplicative noises. The CSM is also tested with a nonlinear regression model of the ice-cover time evolution, forced by radiation. The CSM estimation is much more robust and efficient than methods using the stochastic model’s full probability density function and its derivatives, whose estimation is rather unreliable when only short data records are available. The analysis also demonstrates that the CSM estimation is computationally cheap in the different experiments, providing evidence of the possibilities and generalizations offered by the method, thus opening new perspectives on real-world applications.
This work was funded by the Portuguese Fundação para a Ciência e a Tecnologia (FCT) I.P./MCTES through national funds (PIDDAC) – UIDB/50019/2020 (https://doi.org/10.54499/UIDB/50019/2020), UIDP/50019/2020 (https://doi.org/10.54499/UIDP/50019/2020) and LA/P/0068/2020 (https://doi.org/10.54499/LA/P/0068/2020), and the project JPIOCEANS/0001/2019 (ROADMAP).

 

How to cite: Pires, C., Vannitsem, S., and Docquier, D.: Evaluation of Shannon Entropy-based Information transfer in nonlinear systems , EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4315, https://doi.org/10.5194/egusphere-egu24-4315, 2024.

17:10–17:20 | EGU24-2618 | ITS1.11/NP4.2 | Highlight | Virtual presentation
X. San Liang

Causality analysis is an old and important problem lying at the heart of scientific research. Causality analysis based on data, however, is a relatively recent development. Traditionally, causal inference has been classified as a field of statistics. Motivated by the predictability problem in physical science, it has been found that causality in terms of information flow/transfer is actually a real notion in physics that can be derived ab initio, rather than axiomatically proposed as an ansatz, and, moreover, can be quantified. A comprehensive study with generic systems (both deterministic and stochastic) has recently been completed, with explicit formulas attained in closed form (Liang, 2016). These formulas are invariant upon nonlinear coordinate transformation, indicating that the so-obtained information flow should be an intrinsic physical property. The principle of nil causality, which states that an event is not causal to another if the evolution of the latter is independent of the former, and which all formalisms seek to verify in their respective applications, turns out to be a proven theorem here. In the linear limit, the maximum likelihood estimator of the information flow is concise in form, involving only commonly used statistics, i.e., sample covariances. An immediate corollary is that causation implies correlation, but the converse does not hold, expressing the long-standing philosophical debate ever since Berkeley (1710) in a transparent mathematical expression.
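
As a rough illustration of the linear-limit estimator built from sample covariances, the sketch below implements the bivariate formula as it is commonly quoted from Liang (2014), T(2→1) = (C11·C12·C2,d1 − C12²·C1,d1) / (C11²·C22 − C11·C12²), where d1 is a forward-difference approximation of dX1/dt. The function name, argument order and toy system are assumptions made for illustration; the formula should be checked against the cited paper before any serious use.

```python
import numpy as np

def liang_information_flow(target, source, dt=1.0, k=1):
    """Estimate the information flow rate from `source` to `target`.

    Bivariate linear-limit estimator written from sample covariances
    (formula as quoted from Liang, 2014; verify against the paper).
    """
    x1 = np.asarray(target, dtype=float)
    x2 = np.asarray(source, dtype=float)
    d1 = (x1[k:] - x1[:-k]) / (k * dt)       # forward-difference derivative of the target
    c = np.cov(np.vstack([x1[:-k], x2[:-k], d1]))
    c11, c12, c22 = c[0, 0], c[0, 1], c[1, 1]
    c1d1, c2d1 = c[0, 2], c[1, 2]
    numerator = c11 * c12 * c2d1 - c12 ** 2 * c1d1
    denominator = c11 ** 2 * c22 - c11 * c12 ** 2
    return float(numerator / denominator)

if __name__ == "__main__":
    # Hypothetical toy system: x2 drives x1, but x1 does not drive x2.
    rng = np.random.default_rng(0)
    n = 20000
    noise = rng.standard_normal((n, 2))
    x1, x2 = np.zeros(n), np.zeros(n)
    for t in range(n - 1):
        x2[t + 1] = 0.6 * x2[t] + noise[t, 1]
        x1[t + 1] = 0.5 * x1[t] + 0.5 * x2[t] + noise[t, 0]
    print("T(2->1):", liang_information_flow(x1, x2))
    print("T(1->2):", liang_information_flow(x2, x1))
```

In this toy system the flow in the uncoupled direction vanishes in the large-sample limit even though the two series are correlated, which is the asymmetry between causation and correlation that the abstract emphasizes.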

The above rigorous formalism has been validated with benchmark systems such as the baker transformation, the Hénon map, and a stochastic gradient system, and with causal networks in extreme situations such as those buried in heavy noise and those with nodes almost synchronized (e.g., Liang, 2021), to name a few. It has also been applied to real-world problems in diverse disciplines such as climate science, dynamic meteorology, turbulence, neuroscience, financial economics, and quantum mechanics, with interesting new findings. For example, Stips et al. (2016) found that, while CO2 emission does drive the recent global warming, on a paleoclimate scale it is global warming that drives the CO2 emission; PNA, a teleconnection pattern related to inclement weather in North America, may trace part of its origin to a rather limited local marginal sea far away in Asia. Besides, with the above causality analysis, pollution sourcing (particularly of PM2.5) may be conducted in a rather effective way via causal graph reconstruction. If time permits, I will also present an ongoing application to the development of causal AI algorithms to overcome the interpretability crisis, and a recent remarkable exercise with such an algorithm in the forecasting of El Niño Modoki, a climate mode linked to hazards in far-flung regions of the globe.

 

References:

Liang, 2014: Unraveling the cause-effect relation between time series. Phys. Rev. E, 90, 052150.

Liang, 2016: Information flow and causality as rigorous notions ab initio. Phys. Rev. E, 94, 052201.

Liang, 2021: Normalized multivariate time series causality analysis and causal graph reconstruction. Entropy, 23, 679.

Liang et al., 2021: El Niño Modoki can be mostly predicted more than 10 years ahead of time. Sci. Rep., 11, 17860.

 

How to cite: Liang, X. S.: Causality as a real physical notion ab initio, and its applications in Earth system sciences, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2618, https://doi.org/10.5194/egusphere-egu24-2618, 2024.

17:20–17:30 | EGU24-20353 | ITS1.11/NP4.2 | On-site presentation
Adolf Stips, San Liang, and Diego Macias-Moy

Stips et al. (2016) demonstrated the causal relationship between greenhouse gas (GHG) concentrations and Global Mean Surface Temperature (GMTA) based on the Information Flow (IF) methodology. Criticism of the application of the Information Flow concept as developed by Liang (2008, 2016) has focused on the underlying assumption of uncorrelated residuals (noise) between the time series. However, this assumption can only make sense for a system with two components, as for a multi-dimensional system unobserved noise may well exist. Fundamentally, there can be no such thing as correlated noise at all; it can only seem to appear because of some hidden process(es). To investigate this in detail, a multivariate information flow analysis has been developed. We will show that in our tests using processes with correlated noises, the preset causalities can be well reproduced. Further, it will be demonstrated that reducing autocorrelation within the time series by pre-whitening confirms the obtained causality directions. Finally, we question the validity of the alternative measure proposed by Goulet Coulombe and Goebel (2021), which uses forecast error variance decomposition based on vector autoregression, because in their method causal directions can be simply reversed by reordering the variables. A physically faithful causal measure should be generally independent of ordering.
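
The ordering dependence mentioned above can be seen in a generic textbook setting. The sketch below computes horizon-one forecast error variance shares from a Cholesky factor of an innovation covariance matrix and shows that the apparent direction flips when the two variables are swapped. This is a minimal illustration of Cholesky-based orthogonalization in general, not a reproduction of the cited authors' analysis; the covariance values and the labels `co2` and `temp` are hypothetical.

```python
import numpy as np

def one_step_fevd(sigma):
    """Horizon-one variance shares from a Cholesky orthogonalization.

    Entry [i, j] is the share of variable i's one-step forecast error
    variance attributed to orthogonalized shock j.
    """
    sigma = np.asarray(sigma, dtype=float)
    p = np.linalg.cholesky(sigma)             # lower triangular, sigma = p @ p.T
    return p ** 2 / np.diag(sigma)[:, None]

def cross_shares(sigma, names, order):
    """Cross-variable shares for a two-variable system under a given ordering."""
    idx = [names.index(v) for v in order]
    shares = one_step_fevd(np.asarray(sigma, dtype=float)[np.ix_(idx, idx)])
    first, second = order
    return {
        f"{first}->{second}": float(shares[1, 0]),  # first shock in second variable
        f"{second}->{first}": float(shares[0, 1]),  # always zero by construction
    }

if __name__ == "__main__":
    names = ["co2", "temp"]                   # hypothetical labels
    sigma = [[1.0, 0.8], [0.8, 1.0]]          # hypothetical innovation covariance
    print(cross_shares(sigma, names, ["co2", "temp"]))
    print(cross_shares(sigma, names, ["temp", "co2"]))
```

Because the Cholesky factor is lower triangular, whichever variable is placed first is never affected by the other at horizon one, so the attribution is fixed by the ordering rather than by the data.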

 

Coulombe, P. G. and Goebel, M. 2021. On Spurious Causality, CO2, and Global Temperature. Econometrics 9(3), 33.

Liang, X. S. 2008. Information Flow within Stochastic Dynamical System. Phys. Rev. E 78: 031113.

Liang, X. S. 2016. Information Flow and Causality as Rigorous Notions ab initio. Physical Review E 94: 052201.

Stips, A., D. Macias, C. Coughlan, E. Garcia-Gorriz, and X. S. Liang. 2016. On the Causal Structure between CO2 and Global Temperature. Scientific Reports 6: 21691.

How to cite: Stips, A., Liang, S., and Macias-Moy, D.: A Reply to “On Spurious Causality, CO2, and Global Temperature”, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20353, https://doi.org/10.5194/egusphere-egu24-20353, 2024.

17:30–17:40 | EGU24-17312 | ITS1.11/NP4.2 | ECS | On-site presentation
Jordi Cerdà-Bautista, José María Tárraga, Vasileios Sitokonstantinou, and Gustau Camps-Valls

In a world where climate change is rapidly accelerating, droughts are becoming more frequent and severe, posing a serious challenge to food security in the most vulnerable regions. The Horn of Africa has witnessed a rise in acute malnutrition, affecting 6.5 million people in 2022 [1]. Prolonged dry spells significantly contribute to this crisis [2], yet it is crucial to recognize that droughts are not the sole driver. Various factors, including hydrological conditions, food production capabilities, market access, insufficient humanitarian aid, conflicts, and displacement, play a significant role [3,4]. Understanding the underlying causes of food insecurity is pivotal for improving the effectiveness of humanitarian actions, yet studying them is complex, involving multiple variables, scales, and non-linear relationships. Predictive Machine Learning (ML) techniques are not, by default, suited to understanding causes or estimating causal effects [5,6]; this study therefore focuses on causal inference to quantify the impacts of climate and socioeconomic factors on food insecurity. Our key contributions involve discerning causal relationships within the intricate food security system, integrating a comprehensive database including socio-economic, weather and remote sensing data, and estimating the causal effect of humanitarian interventions on the food security index, the outcome of interest. The causal discovery task is performed via time series methods accounting for nonlinear and nonstationary relations, such as the PCMCI algorithm and nonlinear Granger causality [7,8], identifying the drivers in the data that are causally linked to the outcome. The causal effect estimation task is performed by estimating the Conditional Average Treatment Effect (CATE), which gives insight into the spatiotemporal heterogeneity of the impact of humanitarian interventions on the outcome [9].
Such endeavors are crucial for facilitating more efficient future interventions and policies, thereby improving transparency and accountability in humanitarian aid.
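
To make the notion of a Conditional Average Treatment Effect concrete, here is a minimal sketch of one common estimator, the so-called T-learner, which fits separate outcome models for treated and untreated units and takes the difference of their predictions. The abstract does not state which estimator the authors use, so this choice, the linear outcome models, and all names and simulated data are assumptions for illustration. The sketch is only valid when treatment assignment is unconfounded given the covariates.

```python
import numpy as np

def fit_linear(x, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, x):
    """Predictions of a model fitted by fit_linear."""
    return np.column_stack([np.ones(len(x)), x]) @ coef

def t_learner_cate(x, treated, y, x_new):
    """CATE estimate at x_new: treated-model prediction minus control-model prediction."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    x_new = np.asarray(x_new, dtype=float)
    coef_treated = fit_linear(x[treated], y[treated])
    coef_control = fit_linear(x[~treated], y[~treated])
    return predict_linear(coef_treated, x_new) - predict_linear(coef_control, x_new)

if __name__ == "__main__":
    # Hypothetical data: the effect of the intervention grows with covariate x.
    rng = np.random.default_rng(0)
    n = 4000
    x = rng.uniform(0.0, 1.0, size=(n, 1))
    treated = rng.random(n) < 0.5                      # randomized assignment
    effect = 1.0 + 2.0 * x[:, 0]                       # true CATE
    y = 0.5 * x[:, 0] + treated * effect + 0.1 * rng.standard_normal(n)
    print(t_learner_cate(x, treated, y, [[0.0], [1.0]]))
```

The heterogeneity of the estimated effect across covariate values is what allows statements about where and when an intervention has the largest impact.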

References

[1] WFP, “Impacts of the Cost of Inaction on WFP Food Assistance in Eastern Africa (2021 & 2022),” https://docs.wfp.org/api/documents/WFP-0000148305/download/, 2023.

[2] Coughlan de Perez E., et al, “From rain to famine: assessing the utility of rainfall observations and seasonal forecasts to anticipate food insecurity in East Africa,” Food Secur., vol. 11, no. 1, pp. 57–68, 2019.

[3] Maxwell D. et al, “Viewpoint: Determining famine: Multi-dimensional analysis for the twenty-first century,” Food Policy, vol. 92, 2020.

[4] Guy A. J. et al, “Climate, conflict and forced migration” Global Environmental Change, vol. 54, no. 4, 2019.

[5] Pearl J., “Causality: Models, reasoning, and inference,” Cambridge University Press, vol. 19, 2000.

[6] Peters J., Janzing D., and Schölkopf B., Elements of Causal Inference: Foundations and Learning Algorithms, The MIT Press, 2017.

[7] Runge, J. "Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets." Conference on Uncertainty in Artificial Intelligence. PMLR, 2020.

[8] Camps-Valls, G. et al, “Discovering causal relations and equations from data”, Physics Reports 1044: 1–68, 2023.

[9] Giannarakis, G. et al, (2022). Personalizing sustainable agriculture with causal machine learning. arXiv preprint arXiv:2211.03179.

How to cite: Cerdà-Bautista, J., Tárraga, J. M., Sitokonstantinou, V., and Camps-Valls, G.: Causal evaluation of humanitarian aid on food security, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17312, https://doi.org/10.5194/egusphere-egu24-17312, 2024.

17:40–17:50 | EGU24-19242 | ITS1.11/NP4.2 | ECS | On-site presentation
Germain Bénard, Marion Gehlen, and Mathieu Vrac

Time series of in situ observations and remote sensing data suggest variability in epipelagic ecosystems at seasonal to multiannual time scales. These go along with changes in physical-biogeochemical conditions. While a consensus exists on the proximate causes of observed ecosystem variability (e.g. mixed layer variability, availability of nutrients, grazing pressure), the role of large-scale drivers (e.g. natural climate modes) still needs to be better understood. Moreover, differences in the implementation of marine ecosystem processes exist among Earth System Models, and it is important to understand the uncertainty around the representation of specific interactions via inter-model comparison.

We use output from 5 multi-centennial Earth system model simulations under pre-industrial climate to identify modes of low-frequency biogeochemical properties and the importance of individual drivers. The study focuses on the North Atlantic subpolar gyre (NASPG), a region of high primary productivity and considerable observed natural variability in physical and biogeochemical conditions. We explore causality between modes of climate variability, ocean physics and biogeochemistry by applying a Knowledge-Data-Discovery method, PCMCI. This method enables causal links with a potential time lag to be established between different domains. It proposes a novel way for the comparison of differences between model dynamics.

First, six geographic subregions are identified, based on their physical-biogeochemical characteristics (e.g. deep convection zones, intensity of spring bloom), followed by the selection of physical and biogeochemical variables. These variables are the maximum winter mixed layer depth, chosen for its role in supplying nutrients to the surface and fueling the spring bloom; the North Atlantic Oscillation (NAO), a dominant natural mode of climate variability, chosen for its contribution to sea surface temperature (SST) and nutrient variability in the subpolar gyre; and the Gyre Strength, an index reflecting the response of the NASPG to wind forcing. We focus on one micronutrient (iron) and one macronutrient (nitrate), chosen because both can limit primary production in this region.

Next, PCMCI is applied to search for the temporal relationships (potentially lagged) between different regions and variables. These relationships are computed from partial correlations, which, for Gaussian-distributed data, are equivalent to causal links. The application of this method allows networks of causality to be identified, highlighting drivers of nutrient variability under varying natural climate forcing. The approach enables the quantification of intermodel differences either by analyzing one link after another or by looking directly at the entire causal graphs with a newly proposed method to quantify the dissimilarity between two models.
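
The partial correlations referred to above can be sketched as follows: regress both variables on the conditioning set and correlate the residuals. This is a minimal stand-alone illustration, not the PCMCI implementation used in the study; in a time-series setting `x` would be a lagged candidate driver and `z` the remaining lagged variables, and the function name and toy chain are assumptions.

```python
import numpy as np

def partial_correlation(x, y, z):
    """Correlation between x and y after linearly regressing out the columns of z."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float).reshape(len(x), -1)
    design = np.column_stack([np.ones(len(x)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])

if __name__ == "__main__":
    # Hypothetical chain x -> z -> y: x and y are correlated only through z.
    rng = np.random.default_rng(0)
    n = 20000
    x = rng.standard_normal(n)
    z = x + 0.5 * rng.standard_normal(n)
    y = z + 0.5 * rng.standard_normal(n)
    print("corr(x, y):        ", np.corrcoef(x, y)[0, 1])
    print("parcorr(x, y | z): ", partial_correlation(x, y, z))
```

A partial correlation close to zero given the mediator is what lets a conditional-independence-based method drop the indirect link from the graph.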

This method verified expected interactions such as the role of mixed layer depth for nutrient supply and quantified the strength of this interaction across the models. It also highlighted model-specific dynamics such as the role of temperature (via sea-ice formation) for iron in two biogeochemical models out of 5. 



 

How to cite: Bénard, G., Gehlen, M., and Vrac, M.: Multi-model comparison of causal relationships between atmospheric and marine biogeochemical variables, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19242, https://doi.org/10.5194/egusphere-egu24-19242, 2024.

17:50–18:00 | EGU24-21883 | ITS1.11/NP4.2 | Highlight | On-site presentation
Gustau Camps-Valls, Kai-Hendrik Cohrs, Emiliano Diaz, Vasileios Sitokonstantinou, and Gherardo Varando

Causality is essential for understanding complex systems like the Earth and climate, where a plethora of intertwined variables and processes happen in the wild. Constructing causal graphs often relies on either data-driven or expert-driven approaches, both fraught with challenges. The former methods, like the celebrated Peter-Clark (PC) algorithm, face issues with data requirements and assumptions of causal sufficiency, while the latter demand substantial time and expertise.

This work explores the capabilities of Large Language Models (LLMs) as an alternative to domain experts for causal graph generation. We frame conditional independence queries as prompts to LLMs and employ the PC algorithm with the answers. The performance of the LLM-based conditional independence oracle on systems with known causal graphs shows a high degree of variability. We improve the performance through a proposed statistically inspired voting scheme that allows control over false-positive and false-negative rates. We apply our chatPC algorithm to understand the causal relations between complex sets of variables (social, economic, conflicts, environmental, and climatic factors) in two pressing problems: population displacement and food insecurity in Africa. We find plausible graphs as corroborated by experts in the humanitarian sector, finding traces of causal reasoning in the model's answers. We posit that LLM-based causality is a new, promising, alternative avenue for automated causality, especially indicated for rapid response and data-scarce regimes.
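
The abstract does not spell out the voting scheme, so the following is only a generic sketch of how repeated noisy yes/no answers could be thresholded to bound a false-positive rate. It assumes, purely for illustration, that each query independently answers "dependent" with a known error probability `q` when the variables are in fact independent; the function names and parameters are hypothetical and are not taken from chatPC.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def min_votes_for_dependence(n, q, alpha):
    """Smallest vote count k whose false-positive probability is at most alpha.

    Assumes n independent queries, each wrongly answering 'dependent' with
    probability q when the variables are truly independent.
    """
    for k in range(n + 1):
        if binom_tail(n, k, q) <= alpha:
            return k
    return n + 1  # no threshold reaches alpha with n queries

def declare_dependent(answers, k):
    """answers: iterable of booleans, True meaning the model answered 'dependent'."""
    return sum(bool(a) for a in answers) >= k

if __name__ == "__main__":
    k = min_votes_for_dependence(n=10, q=0.2, alpha=0.05)
    print("votes required:", k)
    print(declare_dependent([True] * 6 + [False] * 4, k))
```

Raising the threshold lowers false positives at the cost of more false negatives, which is the trade-off a voting rule of this kind exposes.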

How to cite: Camps-Valls, G., Cohrs, K.-H., Diaz, E., Sitokonstantinou, V., and Varando, G.: Large Language Models for Causal Discovery in the Earth Sciences, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21883, https://doi.org/10.5194/egusphere-egu24-21883, 2024.

Posters on site: Thu, 18 Apr, 10:45–12:30 | Hall X3

Display time: Thu, 18 Apr, 08:30–12:30
Chairpersons: Marlene Kretschmer, Rebecca Herman, Stéphane Vannitsem
X3.16 | EGU24-8450 | ITS1.11/NP4.2
Milan Palus

Many approaches to infer causal relations from time series in Earth sciences have been proposed and applied in order to identify diverse interactions, such as the influence of large-scale circulation modes on local temperature and precipitation, variability of Eurasian winters due to changing Arctic sea ice cover, or interactions of solar activity and interplanetary medium conditions with the Earth’s magnetosphere-ionosphere systems. The methods usually depend on “dimensions” in which the understanding of underlying phenomena is located: the phenomena or processes can be linear or nonlinear, and deterministic or random. The third abstract “dimension” is the actual dimensionality of the problem, given either by the dimension of the state space of the underlying mechanism or the number of involved variables. We will conduct a short flight inside these “dimensions,” shedding light on some of the shades, comparing some of the causality inference methods using model and real data from the Earth sciences.

This study was supported by the Czech Academy of Sciences, Praemium Academiae awarded to M. Paluš and the Czech–Chinese Academies of Sciences Mobility Plus Project NSFC-23-08.

How to cite: Palus, M.: Many shades in three dimensions and parallel universes of causality analysis, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8450, https://doi.org/10.5194/egusphere-egu24-8450, 2024.

X3.17 | EGU24-4751 | ITS1.11/NP4.2
Assessing causal dependencies between climate indices using pseudo transfer entropy
(withdrawn)
Cristina Masoller, Riccardo Silini, Giulio Tirabassi, Marcelo Barreiro, and Laura Ferranti
X3.18 | EGU24-14429 | ITS1.11/NP4.2 | ECS
Aditi Kathpalia, Pouya Manshour, and Milan Paluš

Many approaches to time series causality exist and have been inspired by fields such as statistics, information theory, physics and topology. We have proposed a method called compression-complexity causality (CCC) [1] inspired by the field of data compression in computer science. It is based on the idea that the compressibility of the ‘effect’ time series changes when the ‘cause’ time series is considered in the evolution of the future dynamics of the effect. Compressibility is estimated using a compression-complexity estimator for time series called ‘effort-to-compress’, which employs a lossless data compression algorithm for complexity estimation. CCC makes minimal assumptions on given time series data and has been seen to work well for short data, irregularly sampled data as well as data with low temporal resolution. We have also introduced a multidimensional version of CCC, called Permutation CCC (PCCC) [2], which uses Takens’ embedding for appropriate high dimensional representation of time series. This representation is subsequently encoded using ordinal patterns before computation of CCC. The PCCC formulation retains the original robustness of CCC. This is demonstrated with its application on simulated multidimensional systems. We apply this formulation to infer causality between CO2 emissions – temperature recordings on three different time scales, El Niño–Southern Oscillation phenomena – South Asian Summer Monsoon on two different time scales, as well as North Atlantic Oscillations – European temperature recordings on two different time scales. These paleoclimate and climate datasets suffer from the issues of missing samples, low temporal resolution and short length, and so a reliable inference of these climatic interactions requires a robust causality estimator.
Finally, we explore another variation of CCC that can help to infer causality in multivariate cases. This variation helps to infer the existence of causal influences on a particular variable (from each of the other variables considered) while conditioning out the additional variables present. The presence of causal influences on each variable is decided by choosing the model of least compression-complexity that can explain the evolution of the future of that variable. If more than one model has the least complexity, the smallest model is chosen. We apply this formulation to understand interactions in the space-weather system, particularly the solar wind-magnetosphere-ionosphere interactions, which manifest as geomagnetic storms and substorms. We compare the performance of the CCC formulations with existing methods on simulations as well as real data applications.
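
As an illustration of the complexity estimator mentioned above, the following is a minimal sketch of an effort-to-compress style measure. It assumes the commonly described pair-substitution scheme: the most frequent adjacent pair of symbols is repeatedly replaced by a new symbol until the sequence becomes constant or has length one, and the number of passes is the complexity. The function name, the counting of overlapping pairs and the tie-breaking rule (first pair seen wins) are assumptions of this sketch, not details taken from the abstract, and the full CCC measure built on top of it is not reproduced here.

```python
from collections import Counter


def effort_to_compress(symbols):
    """Count pair-substitution passes until the sequence is constant.

    `symbols` is a sequence of non-negative integers (hypothetical input
    convention for this sketch).
    """
    seq = list(symbols)
    next_symbol = max(seq) + 1 if seq else 0
    passes = 0
    while len(seq) > 1 and len(set(seq)) > 1:
        # Count adjacent pairs; ties go to the pair encountered first.
        pairs = Counter(zip(seq, seq[1:]))
        target = max(pairs, key=pairs.get)
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_symbol += 1
        passes += 1
    return passes


constant = effort_to_compress([1, 1, 1, 1])
periodic = effort_to_compress([1, 2, 1, 2, 1, 2])
irregular = effort_to_compress([1, 1, 0, 1, 0, 0, 1, 0])
```

A constant sequence needs no passes, a strictly periodic one collapses quickly, and a less regular sequence needs more passes, which is the sense in which the measure tracks complexity.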

This study is supported by the Czech Academy of Sciences, Praemium Academiae awarded to M. Paluš.

References:
[1] Kathpalia, A., & Nagaraj, N. (2019). Data-based intervention approach for Complexity-Causality measure. PeerJ Computer Science, 5, e196.
[2] Kathpalia, A., Manshour, P., & Paluš, M. (2022). Compression complexity with ordinal patterns for robust causal inference in irregularly sampled time series. Scientific Reports, 12(1), 14170.

How to cite: Kathpalia, A., Manshour, P., and Paluš, M.: Compression-complexity based estimation of Causality: Applications in Earth and Climate Sciences, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14429, https://doi.org/10.5194/egusphere-egu24-14429, 2024.

X3.19
|
EGU24-20089
|
ITS1.11/NP4.2
|
ECS
Marcell Stippinger, Attila Bencze, Ádám Zlatniczki, Zoltán Somogyvári, and András Telcs

Exploring causal relationships among stochastic dynamic systems based solely on observed time series of their states poses a challenging problem. In this context, we present a novel method for causal discovery within stochastic dynamic systems, specifically designed to overcome the limitations of existing methods, particularly in detecting hidden and common drivers. Our proposed approach is based on a straightforward observation: a process generated by a stochastic dynamical system follows a Markov chain if and only if all external influences are independent and identically distributed (i.i.d.). Consequently, the primary tool in our proposed causal discovery scheme involves testing whether the process generates a Markov chain, as opposed to relying on the "classical" causal Markov property or d-separation.

Our method is nonparametric, requiring no intervention, and is built on a reasonably small number of assumptions. We tested our model both on simulated Markov chains of finite state space and structural vector autoregressive processes. To demonstrate the efficacy of our model, we apply it to weather data consisting of solar irradiation and daily average air temperature. Through our method, we successfully identify the ground truth, revealing that irradiation drives temperature. Furthermore, we adeptly pinpoint the true lag while eliminating spurious lags in the autocorrelation function.

How to cite: Stippinger, M., Bencze, A., Zlatniczki, Á., Somogyvári, Z., and Telcs, A.: Causal Discovery of Stochastic Dynamical Systems: A Markov Chain Approach, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20089, https://doi.org/10.5194/egusphere-egu24-20089, 2024.

X3.20
|
EGU24-4693
|
ITS1.11/NP4.2
|
ECS
Juraj Bodik

Granger causality plays a pivotal role in uncovering directional relationships among time-varying variables and enhancing decision-making in complex systems. While this notion gains heightened importance during extreme events in highly volatile periods, state-of-the-art methods primarily focus on causality within the body of the distribution. We introduce a new rigorous mathematical framework for “Granger causality in tail,” designed to evaluate whether an extreme event in one time series causes a corresponding extreme event in another. Moreover, we describe how we can quantify the magnitude of the causal impact of an extreme event on other variables.

We establish equivalences between our Granger causality in tail and other causal concepts, including “classical Granger causality,” “Sims causality,” and “structural causality.” By proving the key properties of Granger causality in tail, we assert its usefulness in high-dimensional complex systems with potential hidden confounders. Here, to model the tails of the variables, we utilize the “extreme value theory” framework. We also propose an inference method for detecting the presence of Granger causality in tail and provide insights into the asymptotic properties of our estimator within the framework of a stochastic recurrence equation (SRE) model.

How to cite: Bodik, J.: Granger causality in tail, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4693, https://doi.org/10.5194/egusphere-egu24-4693, 2024.

X3.21
|
EGU24-22158
|
ITS1.11/NP4.2
|
ECS
Rebecca Herman and Jakob Runge

Causal discovery and effect estimation for time series provide scientists with a way to extract causal information from observational studies when possible. But the high dimensionality of raw climate data causes computational problems for most analysis methods, and causal inference is no exception. To address this problem, climate scientists usually pre-process climate data using dimension reduction techniques (including seasonal and regional averaging and principal component analysis) that may result in the loss of valuable information before the true analysis even begins. For example, climate scientists often represent El Niño–Southern Oscillation (ENSO) variability using the uni-variate Nino3.4 index, which cannot distinguish between central Pacific and eastern Pacific El Niño events, which are believed to impact global climate variability in different ways. This study introduces a method for avoiding premature data dimension reduction in causal effect estimation, implemented in tigramite. The method allows the researcher to define multi-variate climate indices, reducing the dimensionality of the causal inference problem via the causal assumptions instead of losing information from the data itself. To investigate the performance of this approach on climate data, we examine the effect of ENSO on the North Atlantic Oscillation (NAO) in simulated data from the Coupled Model Intercomparison Project, phase 6. We choose this as our case study because different types of El Niño are believed to have very different effects on the NAO, to the extent that the impact may be completely undetectable in observations when no distinction between the types of ENSO is made. By comparing estimated effects using uni- and multi-variate climate indices, we demonstrate that this method retains valuable information that would be lost in uni-variate analysis, and make recommendations for best practices when using multi-variate climate indices in causal effect estimation.

How to cite: Herman, R. and Runge, J.: Spatiotemporal Causal Effect Estimation, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-22158, https://doi.org/10.5194/egusphere-egu24-22158, 2024.

X3.22
|
EGU24-1838
|
ITS1.11/NP4.2
David Docquier, Giorgia Di Capua, Reik V. Donner, Carlos A. L. Pires, Amélie Simon, and Stéphane Vannitsem

Correlation does not necessarily imply causation, and this is why causal methods have been developed to try to disentangle true causal links from spurious relationships. In our study, we use two causal methods, namely the Liang-Kleeman information flow (LKIF) and the Peter and Clark momentary conditional independence (PCMCI) algorithm, and apply them to four different artificial models of increasing complexity and one real-case study based on climate indices in the North Atlantic and North Pacific. We show that both methods are superior to the classical correlation analysis, especially in removing spurious links. LKIF and PCMCI display some strengths and weaknesses for the three simplest models, with LKIF performing better with a smaller number of variables, and PCMCI being best with a larger number of variables. Detecting causal links from the fourth model is more challenging as the system is nonlinear and chaotic. For the real-case study with climate indices, both methods present some similarities and differences at monthly time scale. One of the key differences is that LKIF identifies the Arctic Oscillation (AO) as the largest driver, while El Niño-Southern Oscillation (ENSO) is the main influencing variable for PCMCI. More research is needed to confirm these links, in particular including nonlinear causal methods.
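
To make the information-flow idea concrete, here is a minimal sketch of the bivariate estimator usually associated with the Liang-Kleeman information flow, written in terms of sample covariances and a forward difference of the target series. It assumes a linear model with additive noise and only two variables; the multivariate version used for larger systems and any significance testing are not covered. The function name and the toy system (in which x2 drives x1 but not the reverse) are assumptions of this sketch.

```python
import numpy as np


def liang_information_flow(x1, x2, dt=1.0):
    """Estimate T_{2->1}: rate of information flow from x2 to x1 (nats per unit time)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    dx1 = (x1[1:] - x1[:-1]) / dt           # Euler forward difference of the target
    c = np.cov(np.vstack([x1[:-1], x2[:-1], dx1]))
    c11, c12, c22 = c[0, 0], c[0, 1], c[1, 1]
    c1d1, c2d1 = c[0, 2], c[1, 2]            # covariances with the difference series
    return (c11 * c12 * c2d1 - c12**2 * c1d1) / (c11**2 * c22 - c11 * c12**2)


def simulate(n=20000, seed=0):
    """Toy linear system: x2 evolves on its own and drives x1."""
    rng = np.random.default_rng(seed)
    x1 = np.zeros(n)
    x2 = np.zeros(n)
    for t in range(n - 1):
        x2[t + 1] = 0.7 * x2[t] + rng.standard_normal()
        x1[t + 1] = 0.5 * x1[t] + 0.5 * x2[t] + rng.standard_normal()
    return x1, x2


x1, x2 = simulate()
t21 = liang_information_flow(x1, x2)   # flow from x2 to x1, clearly non-zero here
t12 = liang_information_flow(x2, x1)   # flow from x1 to x2, close to zero here
```

The two series are strongly correlated, yet the estimated flow is one-directional, which is the kind of asymmetry that a correlation analysis cannot provide.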

How to cite: Docquier, D., Di Capua, G., Donner, R. V., Pires, C. A. L., Simon, A., and Vannitsem, S.: A comparison of two causal methods in the context of climate analyses, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1838, https://doi.org/10.5194/egusphere-egu24-1838, 2024.

X3.23
|
EGU24-12884
|
ITS1.11/NP4.2
|
ECS
Yasir Latif and Milan Palus

In 2022, La Niña and negative Indian Ocean Dipole (IOD) coincided, causing abnormally warm sea surface conditions in the eastern Indian Ocean (near Indonesia). This provided additional moisture to feed monsoon depressions, resulting in heavy rainfall in Pakistan. El Niño-Southern Oscillation (ENSO) and IOD are two modes of sea surface temperature variability that can significantly impact precipitation in Pakistan's Upper Indus Basin (UIB). The current study used in situ observations and ERA5 reanalysis precipitation data to determine the causal influence of ENSO and IOD on precipitation variability using an information-theoretic generalization of Granger causality. The predicted causal effect and causal delay obtained using conditional mutual information, a.k.a. transfer entropy, were further validated using conditional means ("composites"), i.e. precipitation means computed for different ENSO states: El Niño (positive), La Niña (negative), and neutral. Uncovering the causal and delayed effects of ENSO and IOD, as well as associated mechanisms, on subsequent precipitation in the UIB could provide a stronger foundation for improving seasonal climate predictions with a longer lead time, as well as understanding how regional and large-scale drivers affect regional precipitation.
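
The conditional mutual information and causal delay mentioned above can be sketched as follows. This is a simple plug-in estimate of I(source_t ; target_{t+lag} | target_t) from equiquantal bins, scanned over lags to locate the delay at which the information transfer peaks. The binning scheme, the number of bins, the function names and the toy system with a built-in delay of three steps are all assumptions of this sketch; it is not the estimator or the data used in the study, and it omits surrogate-based significance testing.

```python
import numpy as np


def quantile_bins(x, n_bins):
    """Map samples to equiquantal bin labels 0..n_bins-1."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.searchsorted(edges, x)


def entropy(*columns):
    """Plug-in Shannon entropy (nats) of the joint distribution of discrete columns."""
    joint = np.stack(columns, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))


def transfer_entropy(source, target, lag=1, n_bins=4):
    """Binned estimate of I(source_t ; target_{t+lag} | target_t)."""
    s = quantile_bins(np.asarray(source, dtype=float), n_bins)
    y = quantile_bins(np.asarray(target, dtype=float), n_bins)
    s_now, y_now, y_fut = s[:-lag], y[:-lag], y[lag:]
    return (entropy(s_now, y_now) + entropy(y_fut, y_now)
            - entropy(s_now, y_fut, y_now) - entropy(y_now))


# Toy system: x influences y with a delay of three time steps.
rng = np.random.default_rng(0)
n, delay = 5000, 3
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(delay, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - delay] + 0.5 * rng.standard_normal()

lags = range(1, 7)
te_xy = [transfer_entropy(x, y, lag) for lag in lags]
te_yx = [transfer_entropy(y, x, lag) for lag in lags]
best_lag = lags[int(np.argmax(te_xy))]   # lag with the largest x -> y transfer
```

In this toy setting the x to y estimate peaks at the built-in delay while the reverse direction stays near the estimator's bias level; with real, autocorrelated climate indices the picture is less clean and needs significance testing.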

This study was supported by the Czech Academy of Sciences, Praemium Academiae awarded to M. Paluš and the Czech–Chinese Academies of Sciences Mobility Plus Project NSFC-23-08.

How to cite: Latif, Y. and Palus, M.: Causal information flow and information transfer delay from ENSO and IOD to precipitation variability in the Upper Indus Basin, Pakistan, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12884, https://doi.org/10.5194/egusphere-egu24-12884, 2024.

X3.24
|
EGU24-15830
|
ITS1.11/NP4.2
|
ECS
Emma Schultz, Dim Coumou, and Michael Massmann

The El Niño-Southern Oscillation (ENSO) stands out as the dominant driver of climate fluctuations on interannual timescales. As ENSO causes extreme weather events in the Pacific region and beyond, it has wide-ranging socio-economic impacts. Over the past decades, a strengthening of the temperature gradient between the Western and Eastern Pacific has been observed. However, climate model simulations do not depict this strengthening trend. Here we explore whether the Bjerknes feedback is well represented in climate models and, if not, whether this could explain the discrepancy between the observed and modeled trends. The Bjerknes feedback represents the dominant feedback processes between atmosphere and ocean that drive ENSO variability. A causal discovery method, the PCMCI algorithm, is used to construct causal networks of key variables in the Bjerknes feedback: near-surface temperatures, sea level pressure and trade winds across the Pacific Ocean. Causal networks are constructed for the time periods 1950-1982 and 1982-2014, based on both reanalysis data and climate model simulations. The observed changes between causal networks based on the early and later period are examined. The analysis reveals a strengthening causal influence of trade winds on sea level pressure and temperatures in networks based on reanalysis data. This significant strengthening trend is absent in networks based on climate model simulations. As an increased influence of the trade winds would have a cooling effect on the Central and Eastern Pacific, this might explain why there is no observed warming in the Central and Eastern Pacific over the past decades, and thus a strengthened temperature gradient. The lack of this strengthening causal influence of trade winds in climate models might thus explain why the models do show a warming over the Eastern Pacific, weakening the temperature gradient.

How to cite: Schultz, E., Coumou, D., and Massmann, M.: Validating ENSO Feedbacks in Climate Models Using a Causal Discovery Method, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15830, https://doi.org/10.5194/egusphere-egu24-15830, 2024.

X3.25
|
EGU24-15950
|
ITS1.11/NP4.2
|
ECS
Victoria M. H. Deman, Daniel F. T. Hagan, Damián Insua-Costa, Akash Koppa, and Diego G. Miralles

The semi-arid Sahel region has witnessed an increase in extreme weather conditions such as repeated drought cycles, desertification, heatwaves and floods in recent decades. These events pose existential threats to the already vulnerable population and natural ecosystem. Addressing the underexplored potential of subseasonal forecasting in the Sahel, data-driven models offer an alternative to traditional dynamical approaches. These models – distinguished by enhanced computational efficiency, reduced sensitivity to initial conditions, the ability to learn intricate relationships from data, and the ability to capture nonlinear dynamics – represent an asset in building resilience in the region. 

This study investigates the potential of employing a rigorous causality framework based on the Liang–Kleeman information flow for predictor selection. Previous research has underscored the pitfalls of using correlations for predictor selection when forecasting using machine learning models, as spurious correlations may lead to the selection of predictors without any physical connection. In response, our research investigates the potential of this information flow causality to select predictors within a vast array of predefined variables, including coupled ocean–atmospheric oscillation indices, sea-surface temperatures, vegetation indices and soil moisture. Subsequently, our focus is directed towards predicting summer maximum temperature extremes with lead times of 2, 4, 8 and 16 weeks using the selected predictors and a variety of deep learning techniques. Despite the challenge of predicting short-lived heatwaves in a region characterised by high baseline temperatures, our results indicate that the information flow causality effectively reduces dimensionality and enables a selection of features with causal relationships that facilitates subsequent forecasting. In a next step, the causal knowledge from the predictor selection step will be quantitatively transferred into the machine learning models themselves, thereby providing an interpretable framework for the prediction of the hot extremes in the region.

How to cite: Deman, V. M. H., Hagan, D. F. T., Insua-Costa, D., Koppa, A., and Miralles, D. G.: Leveraging Information Flow for Data-Driven Subseasonal Forecasting of Sahelian Hot Extremes, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15950, https://doi.org/10.5194/egusphere-egu24-15950, 2024.

X3.26
|
EGU24-14621
|
ITS1.11/NP4.2
Asher Samuel Bhatti, Daniel Fiifi Tawia Hagan, Guojie Wang, Waheed Ullah, Safi Ullah, Isaac K. Nooni, Solomon O. Y. Amankwah, and Feihong Zhou

Unveiling the complexities of Earth's climate system demands a thorough understanding of the interplay between soil moisture (SM) and 2 m air temperature anomalies (T). This study employs nonlinear Random Forest Granger Causality (nRFGC), Copula nonlinear Granger Causality (CnGC), nonlinear Kernel Granger Causality (nKGC) and traditional linear Granger Causality (GC) to unravel the causal relationship between SM and T. Through extensive experimentation on both hypothetical and real-world datasets from diverse sources, the research underscores the superior efficacy of nonlinear methodologies, especially nKGC, in identifying elevated Granger causal signals across spatial and temporal dimensions. These findings align consistently with both hypothetical and traditional hydrological models, underscoring the limitations of linear methodologies when dealing with nonlinear causation. Significantly, nonlinear Granger Causality (nGC) emerges as a potent tool capable of unveiling causal structures without succumbing to overfitting. The research also offers insight into the nonlinear dynamics inherent in SM-T interactions across different seasons. Overall, the study clarifies how to identify the nonlinear causal relationship between SM and T and emphasizes the need to go beyond conventional linear methods, which fall short in capturing the nuanced causal interactions between land and atmosphere. Embracing nonlinear methodologies is essential not only for enhancing predictions but also for gaining a deeper understanding of Earth's climate system.
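
The limitation of linear Granger causality under nonlinear coupling can be illustrated with a small sketch. It compares the residual sum of squares of an autoregressive model of the effect with and without lagged terms of the candidate cause, and reports the log ratio as a Granger index. A second call adds squared lag terms as a very simple nonlinear basis. This polynomial basis is a stand-in chosen for brevity and is not one of the random forest, copula or kernel methods named in the abstract; the function names and the toy system are assumptions of this sketch.

```python
import numpy as np


def lagged(s, order):
    """Columns s[t-1], ..., s[t-order] aligned with targets t = order..n-1."""
    n = len(s)
    return np.column_stack([s[order - k:n - k] for k in range(1, order + 1)])


def granger_index(cause, effect, order=2, basis=(lambda v: v,)):
    """log(RSS without cause lags / RSS with cause lags), fitted by least squares."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    y = effect[order:]
    ones = np.ones((len(y), 1))
    own = lagged(effect, order)
    other = np.column_stack([f(lagged(cause, order)) for f in basis])

    def rss(design):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return resid @ resid

    return np.log(rss(np.hstack([ones, own])) / rss(np.hstack([ones, own, other])))


# Toy system: y depends on the square of x, so x and y are linearly uncorrelated.
rng = np.random.default_rng(0)
n = 5000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(n - 1):
    y[t + 1] = 0.3 * y[t] + 0.8 * x[t] ** 2 + 0.5 * rng.standard_normal()

gc_linear = granger_index(x, y)                                        # close to zero
gc_quadratic = granger_index(x, y, basis=(lambda v: v, lambda v: v**2))  # large
```

The linear index misses the coupling entirely, while the version with squared terms detects it, which mirrors the argument for nonlinear Granger methods made in the abstract.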

How to cite: Bhatti, A. S., Hagan, D. F. T., Wang, G., Ullah, W., Ullah, S., Nooni, I. K., Amankwah, S. O. Y., and Zhou, F.: Unraveling the Complex Causal Relationship between Soil Moisture and Air Temperature Anomalies: A Comparative Analysis of Linear and Nonlinear Granger Causality, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14621, https://doi.org/10.5194/egusphere-egu24-14621, 2024.

X3.27
|
EGU24-971
|
ITS1.11/NP4.2
|
ECS
Vivek Kumar Yadav and Bramha Dutt Vishwakarma

The water availability in a region is driven by the water cycle, which is changing quickly in response to climate change and direct human interventions. The water cycle is defined and controlled by the variation in water fluxes such as precipitation (P), evapotranspiration (Et), runoff (R), and storage change (ΔS). Of these water fluxes, ΔS is a key variable for ecosystem habitability and surviving droughts. It is an important parameter in drafting water management policy, but owing to the lack of long and reliable data, the impact of climate change on ΔS is yet to be understood. The only global observations of terrestrial water storage (TWS) come from the GRACE satellite mission and have been available at monthly scale since 2002.

Although GRACE data have transformed hydrological science significantly, the short time series limits their use for climate change analysis of hydrological fluxes (closing the multidecadal water budget and sea level budget, understanding the spatiotemporal evolution of water availability, and so on). To tackle this, several studies have attempted to reconstruct ΔS prior to the GRACE period. These studies employ hydrological modelling of ΔS, statistical regression, or machine learning techniques. While machine learning methods have been assessed as superior, they suffer from issues such as a lack of explainability, failure to identify causal drivers of TWS change, and use of short time series for feature extraction and training, leading to poor or no representation of decadal natural variability.

Furthermore, in all studies to date, the representation of local human activities, such as groundwater extraction or reservoir operation, was either absent or assumed to be a linear trend. Here we revisit the reconstruction method of Humphrey et al. (2017) and show that these approximations have a considerable impact on the quality of the reconstruction. We then propose a multivariate regression model that relates selected hydrometeorological variables to TWS. These variables are identified from causal analysis of JULES model outputs. We show that temperature has a very weak relation with TWS compared to precipitation. The causal-inference-based model is able to capture realistic variability in reconstructed TWS. Our TWS reconstruction for the Ganges basin outperforms contemporary attempts and is able to identify the drivers of interannual changes in TWS. The results bring historical perspective to the current state of water resources in the basin and provide context for the design of future water resources policy.

How to cite: Yadav, V. K. and Vishwakarma, B. D.: Terrestrial Water Storage Reconstruction: A Causal Inference Approach, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-971, https://doi.org/10.5194/egusphere-egu24-971, 2024.

X3.28
|
EGU24-4191
|
ITS1.11/NP4.2
|
ECS
Wen Zhuo, Shibo Fang, Xinran Gao, Ricardo B. Lourenco, Yanru Yu, Jiahao Han, and Alemu Gonsamo

Soil moisture (SM) is undoubtedly a vital variable of the climate system. Understanding the interactions among atmosphere, climate, and soil is necessary for water resource management, drought monitoring, and disaster prevention. However, evaluation of those interactions has so far primarily relied on correlation analysis, which often fails to reveal causal relationships because of autocorrelation and high dimensionality in time series variables. Here, we used a data-driven causal inference method called PCMCI+ to discover causal relationships among teleconnection patterns (El Niño Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD)), climate variables (precipitation and temperature) and soil moisture during 1980-2022 over the Greater Horn of Africa (GHOA), a region susceptible to severe drought. Further, we quantitatively calculated the causal effects of teleconnection patterns on SM through different climate paths. Results suggest that, over most parts of GHOA, IOD generally has higher causal effects than ENSO on climate variables (temperature and precipitation) and on soil moisture through both the precipitation and temperature paths. Moreover, precipitation shows a shorter lag and a greater causal effect on soil moisture in GHOA. Our study provides the first attempt to quantitatively analyze the causal effects of teleconnection patterns on SM through both the precipitation path and the temperature path, and it highlights the causal relationships within atmosphere-climate-soil interactions, which could help to better understand the impact of climate change on drought over GHOA.
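
One simple way to think about causal effects "through different paths" is the linear path-coefficient rule: in a linear model with a known causal graph and no hidden confounders, the effect along a path is the product of the regression coefficients on its links. The sketch below illustrates this on synthetic data with a teleconnection index acting on soil moisture via precipitation and via temperature. The variable names, coefficients and the absence of time lags are assumptions of this sketch, and it is not necessarily how the study computed its path effects.

```python
import numpy as np


def ols(y, *regressors):
    """Least-squares slopes of y on the regressors (an intercept is fitted and dropped)."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]


# Synthetic, standardized-style data with known path coefficients.
rng = np.random.default_rng(1)
n = 5000
iod = rng.standard_normal(n)
precip = 0.6 * iod + rng.standard_normal(n)
temp = -0.4 * iod + rng.standard_normal(n)
soil = 0.7 * precip - 0.3 * temp + rng.standard_normal(n)

(a_p,) = ols(precip, iod)                    # link: index -> precipitation
(a_t,) = ols(temp, iod)                      # link: index -> temperature
b_p, b_t, direct = ols(soil, precip, temp, iod)

effect_via_precip = a_p * b_p                # true value 0.6 * 0.7 = 0.42
effect_via_temp = a_t * b_t                  # true value (-0.4) * (-0.3) = 0.12
(total,) = ols(soil, iod)                    # total effect of the index on soil moisture
```

In this linear setting the total effect equals the direct effect plus the two path effects, so the decomposition shows how much of the influence travels through each climate variable.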

How to cite: Zhuo, W., Fang, S., Gao, X., Lourenco, R. B., Yu, Y., Han, J., and Gonsamo, A.: Causal effects of teleconnection patterns on soil moisture through different climate paths over the Greater Horn of Africa, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4191, https://doi.org/10.5194/egusphere-egu24-4191, 2024.

X3.29
|
EGU24-6584
|
ITS1.11/NP4.2
Katerina Schindlerova (Hlavackova-Schindler), Kejsi Hoxhallari, Luis Caumel Morales, Irene Schicker, and Claudia Plant

Using ERA5 meteorological reanalysis data from 2000 to 2020 [1], we investigate temporal effects of ten wind-related processes in time intervals of extreme wind speed values, extracted and corrected towards wind turbine locations for a wind farm in Andau, Austria. We approach the problem in two ways: by Granger causal inference, namely the heterogeneous Graphical Granger model (HMML) [2], and by clustering, namely interactive k-means clustering (IKM) [3].

We investigate six scenarios based on the hydrological half-year, a moderate wind speed and time intervals of low or high extreme wind speed at the farm. With HMML, we discover causal variables and their values for each scenario. IKM is used with three clusters (for moderate wind speed and for low and high extreme wind speed) to find coefficient representations of each interacting variable with respect to the wind speed in each of the six scenarios. We compare the results of both methods in terms of the values of the causal variables and of the representation coefficients, and evaluate the interpretability of the discovered causal connections against expert meteorological knowledge.

[1] https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-pressure-levels?tab=overview

[2] Hlaváčková-Schindler, K., & Plant, C. (2020). Heterogeneous graphical Granger causality by minimum message length. Entropy, 22, 1400.

[3] Plant, C., Zherdin, A., Sorg, C., Meyer-Baese, A., & Wohlschläger, A. M. (2014). Mining interaction patterns among brain regions by clustering. IEEE Transactions on Knowledge and Data Engineering, 26(9), 2237–2249.

How to cite: Schindlerova (Hlavackova-Schindler), K., Hoxhallari, K., Caumel Morales, L., Schicker, I., and Plant, C.: Causal discovery among wind-related variables in a wind farm under extreme wind speed scenarios: Comparison of results using Granger causality and interactive k-means clustering, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-6584, https://doi.org/10.5194/egusphere-egu24-6584, 2024.

X3.30
|
EGU24-7546
|
ITS1.11/NP4.2
Lisa Bock, Adrian McDonalds, Axel Lauer, and Jakob Runge

As a key component of the hydrological cycle and the Earth’s radiation budget, clouds play an important role in both weather and climate. Our incomplete understanding of clouds and their role in cloud-climate feedbacks leads to large uncertainties in climate simulations. Using causal discovery as an unsupervised machine learning method we aim to systematically analyse and quantify causal interdependencies and dynamical links between cloud properties and their controlling factors. This approach goes beyond correlation-based measures by systematically excluding common drivers and indirect links. By estimating the causal effect of each of the cloud controlling factors for different cloud regimes we expect to be able to better understand the dominant processes which determine the micro- and macro-physical properties of clouds.

Specifically, causal inference is used to investigate the links between cloud properties such as cloud cover, cloud water path, cloud top height and cloud radiative effects and so-called cloud controlling factors, i.e., quantities that impact cloud formation and temporal evolution of the cloud (e.g., sea surface temperature, water vapour path and lower tropospheric stability). For this, causal networks are calculated from time series of these variables from satellite and reanalysis datasets averaged over different geographical regions and cloud regimes in order to quantify the strength of the individual links in the resulting causal graph by applying causal effect estimation.

How to cite: Bock, L., McDonalds, A., Lauer, A., and Runge, J.: Quantifying the influence of cloud controlling factors with causal inference, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7546, https://doi.org/10.5194/egusphere-egu24-7546, 2024.

X3.31
|
EGU24-10448
|
ITS1.11/NP4.2
Myrthe Leijnse, Marc F.P. Bierkens, and Niko Wanders

Water scarcity is driven by diverse natural and anthropogenic factors and represents a critical global challenge. Structural Causal Models are powerful tools to reveal the intricate interactions among social, ecological and hydrological components within human-water systems affected by water scarcity. This study integrates causal thinking into statistical and data-driven hydrological modelling, offering a different perspective on understanding system dynamics affecting water resources in water-scarce regions, the so-called water scarcity hotspots.

In this work we apply causal discovery methods to independent time series of sectoral water demand, socio-economic variables, meteorological drivers and groundwater depletion to obtain a causal network representing human-water system interactions at global water scarcity hotspots. To derive this network we use global datasets and advanced causal network learning algorithms, specifically (Joint-)PCMCI (Runge et al., 2023). Recognizing the importance of large data sample sizes for a robust global causal network, we further extend our approach to construct a causal network specific to one of the water scarcity hotspots (California), using more detailed local data. Our framework therefore provides a comprehensive understanding of water scarcity dynamics at both global and local scales. Through a comparative analysis of network outcomes derived from global datasets with those specific to California, we evaluate the effectiveness of our causal inference modelling framework.

After constructing and evaluating the causal networks at global and local scales, we apply methods from structural causal modelling and statistical machine learning to estimate causal effects of anthropogenic or natural system changes on water availability at water scarcity hotspots. This framework allows us to answer important (counterfactual) questions, such as how the rate of unsustainable groundwater abstraction is affected by shifts in water management practices, e.g. a reduction in irrigated cropland area.

As such, this work contributes to understanding how causal inference methods can be valuable for modelling water scarcity, ultimately providing input to informed decision-making in water resource management and to strategies that mitigate water scarcity impacts.

Runge, J., Gerhardus, A., Varando, G., Eyring, V., & Camps-Valls, G. (2023). Causal inference for time series. Nature Reviews Earth & Environment, 4(7), 487–505.

How to cite: Leijnse, M., Bierkens, M. F. P., and Wanders, N.: Exploring Global and Local Water Scarcity Dynamics through Causal Discovery and Structural Causal Models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-10448, https://doi.org/10.5194/egusphere-egu24-10448, 2024.

X3.32
|
EGU24-11714
|
ITS1.11/NP4.2
|
ECS
Cas Decancq, Daniel Hagan, Victoria Deman, Akash Koppa, and Diego Miralles

Subseasonal prediction of heatwaves, although highly valuable for risk reduction, is challenging because heatwave onsets and propagation are complex processes with both fast and slow drivers from local to global scale. Traditionally, subseasonal forecasting relies heavily on dynamical model ensembles, which are complex and computationally costly. As an alternative, machine learning provides potentially performant solutions that may match or even outperform these physics-based models. Transformers, in particular, are the current state-of-the-art deep learning architectures, and their multi-head attention allows them to keep track of long-term complex dependencies in time series data. However, to better forecast heatwaves subseasonally, it is essential to move beyond purely predictor-to-target associative measures when identifying the sources of predictability. Such endeavours require causal frameworks that provide directionality and explanatory power for the predictor-to-target relationships.

This study seeks to implement the PCMCI+ (Runge, 2020) framework to identify causal drivers of heatwaves on the Iberian Peninsula on a subseasonal scale. Causally selected predictors are employed to forecast the occurrence of heatwaves up to six weeks in advance using transformer networks, both for different seasons and for different sub-regions of the Iberian Peninsula. Preliminary results reveal that heatwaves can be predicted with reasonable accuracy with a forecast window of six weeks, particularly in water-limited regions, using causality-based machine learning.


Reference:

Runge, J. (2020). Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets. In Conference on Uncertainty in Artificial Intelligence, pages 1388–1397. PMLR.

How to cite: Decancq, C., Hagan, D., Deman, V., Koppa, A., and Miralles, D.: Subseasonal prediction of heatwaves in the Iberian Peninsula using causality-based transformer networks, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11714, https://doi.org/10.5194/egusphere-egu24-11714, 2024.

X3.33
|
EGU24-12694
|
ITS1.11/NP4.2
|
ECS
Estimating the causal effect of temperature on ozone air pollution
(withdrawn)
Sebastian Hickman, Paul Griffiths, Peer Nowack, and Alex Archibald
X3.34
|
EGU24-13220
|
ITS1.11/NP4.2
|
François G. Schmitt

In 3D turbulence there is a flux of energy from large to small scales in the inertial range, associated with irreversibility, i.e. a breaking of the time reversal symmetry (Pumir, 2016). Such turbulent flows are characterized by scaling properties and we consider here how irreversibility depends on the scale. Two indicators of irreversibility for time series are tested, both involving triple correlations in a non-symmetric way. The first one, proposed by Pomeau (1982, 2004), is: Po(r) = <X(t)X(t+r)X(t+3r)> - <X(t)X(t+2r)X(t+3r)>, where r is a time increment and X(t) is the turbulent velocity, which is stationary with zero mean. The second indicator was proposed in the finance literature (Ramsey and Rothman, 1996), where it was called the symmetric bicovariance function: γ(r) = <X²(t)X(t+r)> - <X(t)X²(t+r)>. For time reversible processes, both indicators are zero, whereas their departure from 0 is an indicator of irreversibility.
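As an illustration only, the two indicators defined above can be estimated from a sampled time series by replacing the ensemble averages with time averages over all admissible start indices. The sketch below is not the code used in the study: the function names are hypothetical, the lag r is assumed to be an integer number of samples, and the input is assumed to be (approximately) stationary.

```python
def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def center(x):
    """Remove the sample mean, since the indicators assume a zero-mean signal."""
    m = mean(x)
    return [v - m for v in x]


def pomeau_indicator(x, r):
    """Time-average estimate of Po(r) = <X(t)X(t+r)X(t+3r)> - <X(t)X(t+2r)X(t+3r)>."""
    n = len(x) - 3 * r
    if r < 1 or n < 1:
        raise ValueError("need r >= 1 and len(x) > 3 * r")
    # Both triple products share X(t) and X(t+3r), so they are factored out.
    return mean(x[t] * (x[t + r] - x[t + 2 * r]) * x[t + 3 * r] for t in range(n))


def bicovariance_indicator(x, r):
    """Time-average estimate of gamma(r) = <X^2(t)X(t+r)> - <X(t)X^2(t+r)>."""
    n = len(x) - r
    if r < 1 or n < 1:
        raise ValueError("need r >= 1 and len(x) > r")
    return mean(x[t] * x[t + r] * (x[t] - x[t + r]) for t in range(n))


if __name__ == "__main__":
    # A palindromic series looks the same forwards and backwards.
    palindrome = [1, 2, -3, 0, 4, 0, -3, 2, 1]
    print(pomeau_indicator(palindrome, 1), bicovariance_indicator(palindrome, 1))

    # A sawtooth (slow rise, abrupt drop) is a toy time-asymmetric signal.
    sawtooth = center([0, 1, 2, 3] * 50)
    print(bicovariance_indicator(sawtooth, 1))
```

Evaluating either function over a range of lags r and plotting the absolute value against r on log-log axes would be one way to look for the scaling behaviour discussed in the abstract; the toy signals here only check that a time-symmetric series gives zero and a sawtooth gives a nonzero γ(r).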

We study these indicators applied to fully developed turbulence time series, from flume tank, wind tunnel and atmospheric turbulence databases. It is found that irreversibility occurs in the inertial range and has scaling properties with slopes close to one. A maximum value is found around the injection scale. This confirms that the irreversibility is associated with the turbulent cascade in the inertial range and shows that the irreversibility is maximal at the injection scale, the largest scale of the turbulent cascade.

This is published in Schmitt, F.G., Scaling analysis of time-reversal asymmetries in fully developed turbulence, Fractal and Fractional, 7(8), 630, 2023. https://doi.org/10.3390/fractalfract7080630

Cited references: Pumir et al., Phys. Rev. Lett. 116, 124502 (2016); Pomeau, J. de Physique 43, 859 (1982); Pomeau, Lect. Notes Phys. 644, 425 (2004); Ramsey and Rothman, J. Money Credit Bank. 28, 1 (1996).

How to cite: Schmitt, F. G.: Scaling properties of irreversibility indices in turbulence, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13220, https://doi.org/10.5194/egusphere-egu24-13220, 2024.

Posters virtual: Thu, 18 Apr, 14:00–15:45 | vHall X3

Display time: Thu, 18 Apr 08:30–Thu, 18 Apr 18:00
Chairpersons: Aditi Kathpalia, Fernando Iglesias-Suarez, Evgenia Galytska
vX3.1
|
EGU24-16028
|
ITS1.11/NP4.2
Yongxiang Huang and Francois Schmitt

When the wind blows over the ocean surface, it generates ripples which grow into wind-waves. These wind-waves can then propagate great distances as swell waves. As a result, measured ocean surface waves are often a combination of local wind-waves and swell waves from distant storms. While it's natural to assume that the wind is the cause and the waves are the effect (one-way causality), the presence of waves can actually modify the ocean surface roughness, altering the wind itself. This creates a feedback loop, introducing mutual causality between the wind and waves.

This study examines this interplay by analyzing wind and wave data simultaneously measured by the CFOSAT satellite. Using the Liang-Kleeman information flow test, a normalized causality index is calculated to quantify the cause-effect relationship. Analysis of a single CFOSAT track confirms the one-way causality assumption, with the wind driving the waves. Further analysis reveals the global distribution of the causality index. Interestingly, higher wind-wave causality (T21) is observed in mid-latitude regions (30°S to 30°N), while it weakens in higher latitudes. Additionally, the wave-wind index remains significantly less than 1, further supporting the dominance of one-way causality in most regions.
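For readers unfamiliar with the method, the following is a minimal sketch of the bivariate Liang information flow estimator in its linear, maximum-likelihood form, T(2→1) = (C11·C12·C2,d1 − C12²·C1,d1) / (C11²·C22 − C11·C12²), where Cij are sample covariances and Ci,d1 is the covariance of series i with the forward-difference derivative of series 1. It is not the authors' processing chain: the normalization step of Liang (2015) used for the causality index is omitted, and the function names and the synthetic test system are assumptions made for illustration.

```python
import random


def _cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def liang_information_flow(x1, x2, dt=1.0, k=1):
    """Estimate the information flow from series x2 to series x1 (nats per unit time)."""
    n = len(x1) - k
    if len(x1) != len(x2) or n < 2:
        raise ValueError("series must have equal length greater than k + 1")
    # Euler forward difference approximating dX1/dt.
    d1 = [(x1[i + k] - x1[i]) / (k * dt) for i in range(n)]
    a, b = x1[:n], x2[:n]
    c11, c22, c12 = _cov(a, a), _cov(b, b), _cov(a, b)
    c1d1, c2d1 = _cov(a, d1), _cov(b, d1)
    denom = c11 ** 2 * c22 - c11 * c12 ** 2
    return (c11 * c12 * c2d1 - c12 ** 2 * c1d1) / denom


def simulate_one_way(n, seed=0):
    """Hypothetical toy system in which x2 drives x1 but x1 does not drive x2."""
    rng = random.Random(seed)
    x1, x2 = [0.0], [0.0]
    for _ in range(n - 1):
        a, b = x1[-1], x2[-1]
        x1.append(0.5 * a + 0.6 * b + rng.gauss(0.0, 1.0))
        x2.append(0.7 * b + rng.gauss(0.0, 1.0))
    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate_one_way(5000)
    print("T(2->1):", liang_information_flow(x1, x2))
    print("T(1->2):", liang_information_flow(x2, x1))
```

In the toy system the estimated flow from x2 to x1 is clearly nonzero while the reverse flow stays close to zero, which is the kind of asymmetry that a one-way causal relationship would be expected to produce. Applying the estimator to real along-track data would additionally require a significance test and the normalization described in Liang (2015).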

 

Ref.

Gao, Y., Schmitt, F.G., Hu, J.Y. & Huang, Y.X. (2021) Scaling analysis of the China France Oceanography Satellite along-track wind and wave data. J. Geophys. Res. Oceans, 126:e2020JC017119

Liang, X. San (2015) Normalizing the causality between time series. Phys. Rev. E 92, 022126.

Gao, Y., Schmitt, F.G., Hu, J.Y. & Huang, Y.X. (2023) Probability-based wind-wave relation. Front. Mar. Sci., 9:1085340

How to cite: Huang, Y. and Schmitt, F.: Global distribution of Wind-Wave Causality Index, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16028, https://doi.org/10.5194/egusphere-egu24-16028, 2024.