NP5.2
Inverse problems, Predictability, and Uncertainty Quantification in Geosciences using data assimilation and its combination with machine learning

Convener: Javier Amezcua | Co-conveners: Harrie-Jan Hendricks Franssen, Lars Nerger, Yvonne Ruckstuhl, Olivier Talagrand, Tijana Janjic, Natale Alberto Carrassi
Presentations | Wed, 25 May, 15:10–18:30 (CEST) | Room 0.94/95

Chairpersons: Javier Amezcua, Lars Nerger
15:10–15:12
Data assimilation and machine learning
15:12–15:22 | EGU22-5692 | ECS | solicited | Virtual presentation
Model error correction with data assimilation and machine learning
Alban Farchi, Marc Bocquet, Patrick Laloyaux, Massimo Bonavita, Marcin Chrust, and Quentin Malartic

The idea of using machine learning (ML) methods to reconstruct the dynamics of a system is the topic of recent studies in the geosciences, in which the key output is a surrogate model meant to emulate the dynamical model. In order to treat sparse and noisy observations in a rigorous way, ML can be combined with data assimilation (DA). This yields a class of iterative methods in which, at each iteration, a DA step estimates the system's state and alternates with an ML step that learns the system's dynamics from the DA analysis.

This framework can be used to correct the error of an existing physical model. The resulting surrogate model is hybrid, with a physical and a statistical part. In practice, the correction can be added as an integrated term (i.e. in the model resolvent) or directly inside the tendencies of the physical model. The resolvent correction is easy to implement but is not suited for short-term predictions. The tendency correction is more technical, since it requires the adjoint of the physical model, but it is also more flexible and can be used for any forecast lead time.
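The two options can be written schematically as follows. The notation is an assumption made for illustration and does not come from the abstract: $\mathcal{M}$ is the resolvent of the physical model over one assimilation window, $\phi$ denotes its tendencies, and $f_\theta$ is a trainable statistical correction with parameters $\theta$.

```latex
% Resolvent correction: added after the physical model has been integrated over one window
\mathbf{x}_{k+1} = \mathcal{M}(\mathbf{x}_k) + f_\theta(\mathbf{x}_k)

% Tendency correction: added to the right-hand side before time integration
\frac{\mathrm{d}\mathbf{x}}{\mathrm{d}t} = \phi(\mathbf{x}) + f_\theta(\mathbf{x})
```

In this reading, the resolvent form is only defined for the window length it was trained on, whereas the corrected tendencies can be integrated to any lead time.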

In this presentation, we start with a proof of concept for the use of joint DA and ML tools to correct model error. We use the resolvent correction with simple neural networks to correct the error of a two-dimensional, two-layer quasi-geostrophic model. The difference between the resolvent and the tendency correction is then illustrated with the two-scale Lorenz model. Finally, we show that the tendency correction opens the possibility of online model error correction, i.e. improving the model progressively as new observations become available. We compare online and offline learning using the same twin experiment with the two-scale Lorenz model.

 

How to cite: Farchi, A., Bocquet, M., Laloyaux, P., Bonavita, M., Chrust, M., and Malartic, Q.: Model error correction with data assimilation and machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5692, https://doi.org/10.5194/egusphere-egu22-5692, 2022.

15:22–15:28 | EGU22-2838 | ECS | Virtual presentation
Bridging linear state estimation and machine learning
Hristo G. Chipilski

Recent years have seen active efforts within the geophysical community to combine traditional Data Assimilation (DA) methods with emerging Machine Learning (ML) techniques. However, most of this past theoretical work has been centered on variational DA approaches due to their similarity with ML in terms of how the underlying optimization problem is formulated and solved. Here I will present a new and completely general nonlinear estimation theory that retains the flexibility of advanced sampling-based methods (e.g., the particle filter) and the analytical tractability of linear estimation algorithms (e.g., the ensemble Kalman filter). In particular, an alternative state space model will be constructed whose filtering and smoothing distributions remain closed under a wide class of nonlinear functions. Since these nonlinear functions are only required to be bijective and continuously differentiable, the new estimation theory serves as an ideal framework for rigorously incorporating invertible neural networks in the DA design. There are two additional properties which make the proposed framework especially appealing. First, linear estimation results follow immediately upon substituting the invertible neural networks with the identity transformation. Second, the prior and posterior belong to the same distribution family, which implies that the correlation structure and the corresponding dynamical balances in the model state are preserved following the analysis step. During the upcoming EGU meeting, I will discuss the motivation behind the new estimation framework, place it in the context of existing nonlinear DA techniques and demonstrate some of its benefits through idealized numerical examples.

How to cite: Chipilski, H. G.: Bridging linear state estimation and machine learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2838, https://doi.org/10.5194/egusphere-egu22-2838, 2022.

15:28–15:34 | EGU22-1046 | Virtual presentation
Empirical Mode Modeling: A data-driven approach to recover and forecast nonlinear dynamics from noisy data
(withdrawn)
Joseph Park, Gerald Pao, Erik Stabenau, George Sugihara, and Thomas Lorimer
15:34–15:40 | EGU22-12361 | ECS | On-site presentation
Prediction of Subsurface Fluid Flow via Physics Informed Neural Networks
Linus Walter, Francesco Parisio, and Víctor Vilarrasa

Geoenergies such as underground gas storage and energy storage, geothermal energy and geologic carbon storage are key technologies on the way to the foreseeable energy transition. Reservoir characterization in these projects remains challenging, since predictive modeling approaches face limitations in identifying the spatial distribution of distinct lithologies and their hydro-mechanical properties from downwell testing procedures. Pumping tests are usually carried out to infer permeability, but they offer only a few observation points in space and require large extrapolation through inversion. The application of Physics Informed Neural Networks (PINN) offers a promising solution which can seamlessly incorporate field data while enforcing consistency with physical laws in the domain of study. This concept is implemented via two distinct terms in the loss function of an Artificial Neural Network (ANN), one for the physical constraints and one for the observational data. The physics-informed loss term contains a mass balance equation consisting of a storage and a diffusion component. The process is considered to be purely hydraulic and Darcy flow is assumed. The observational loss term compares the output of the ANN to a set of training data. This set consists of the system's initial fluid pressure, as well as a fluid pressure time series at the domain boundaries and at the borehole location. Preliminary results suggest that our PINN model is able to forecast the spatiotemporal fluid pressure distribution in a 2D domain for a variety of pumping test schemes. In this way, we give a first impression of the opportunities that PINN applications offer in the field of reservoir modeling.
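A composite loss of this kind could take the following form. This is a sketch under stated assumptions, not the authors' formulation: $p_\theta$ is the network's pressure output, $S$ a storage coefficient, $k$ the permeability, $\mu$ the fluid viscosity, $\lambda_{\mathrm{phys}}$ and $\lambda_{\mathrm{obs}}$ weighting factors, and source terms are ignored. None of these symbols appear in the abstract.

```latex
\mathcal{L}(\theta) = \lambda_{\mathrm{phys}}\,\mathcal{L}_{\mathrm{phys}}(\theta) + \lambda_{\mathrm{obs}}\,\mathcal{L}_{\mathrm{obs}}(\theta)

% Physics term: residual of the storage-diffusion mass balance at N_c collocation points
\mathcal{L}_{\mathrm{phys}}(\theta) = \frac{1}{N_c}\sum_{i=1}^{N_c}
  \left| S\,\frac{\partial p_\theta}{\partial t}(\mathbf{x}_i,t_i)
  - \nabla\cdot\!\left(\frac{k}{\mu}\,\nabla p_\theta\right)(\mathbf{x}_i,t_i) \right|^2

% Observation term: misfit to the N_o pressure data (initial state, boundaries, borehole)
\mathcal{L}_{\mathrm{obs}}(\theta) = \frac{1}{N_o}\sum_{j=1}^{N_o}
  \left| p_\theta(\mathbf{x}_j,t_j) - p_j^{\mathrm{obs}} \right|^2
```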

How to cite: Walter, L., Parisio, F., and Vilarrasa, V.: Prediction of Subsurface Fluid Flow via Physics Informed Neural Networks, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12361, https://doi.org/10.5194/egusphere-egu22-12361, 2022.

Methodology
15:40–15:46 | EGU22-3065 | ECS | Virtual presentation
A multi-model ensemble Kalman filter for forecasting and data assimilation
Eviatar Bach and Michael Ghil

Data assimilation (DA) aims to optimally combine model forecasts and noisy observations. Multi-model DA generalizes the variational or Bayesian formulation of the Kalman filter, and we prove here that it is also the minimum variance linear unbiased estimator. However, previous implementations of this approach have not estimated the model error, and have therefore not been able to correctly weight the separate models and the observations. Here, we show how multiple models can be combined for both forecasting and DA by using an ensemble Kalman filter with adaptive model error estimation. This methodology is applied to the Lorenz-96 model and it results in significant error reductions compared to the best model and to an unweighted multi-model ensemble.
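To illustrate why correct weights matter, the sketch below shows the simplest scalar case of a minimum variance linear unbiased estimate: independent, unbiased estimates with known error variances are combined by inverse-variance weighting. It is not the multi-model ensemble Kalman filter of the abstract, which estimates the model errors adaptively; the function name `blue_combine` and all numbers are hypothetical.

```python
import numpy as np

def blue_combine(estimates, variances):
    """Combine independent unbiased scalar estimates by inverse-variance weighting."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances
    weights = precisions / precisions.sum()      # weights sum to one (unbiasedness)
    combined = float(np.sum(weights * estimates))
    combined_variance = float(1.0 / precisions.sum())
    return combined, combined_variance, weights

# Two model forecasts and one observation of the same scalar quantity
# (made-up values and error variances, for illustration only).
value, variance, weights = blue_combine([1.0, 3.0, 2.0], [4.0, 4.0, 1.0])
```

The combined variance is smaller than that of any single input, which is only achieved when the variances used for the weights are correct.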

How to cite: Bach, E. and Ghil, M.: A multi-model ensemble Kalman filter for forecasting and data assimilation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3065, https://doi.org/10.5194/egusphere-egu22-3065, 2022.

15:46–15:52 | EGU22-1218 | ECS | Virtual presentation
A fast, single-iteration ensemble Kalman smoother for sequential data assimilation
Colin Grudzien and Marc Bocquet

Ensemble-variational methods form the basis of the state-of-the-art for nonlinear, scalable data assimilation, yet current designs may not be cost-effective for reducing prediction error in online, short-range forecast systems. We propose a novel, outer-loop optimization of the Bayesian maximum a posteriori formalism for ensemble-variational smoothing in applications for which the forecast error dynamics are weakly nonlinear, such as synoptic meteorology. In addition to providing a rigorous mathematical derivation of our technique, we systematically develop and inter-compare a variety of ensemble-variational schemes in the Lorenz-96 model using the open-source Julia package DataAssimilationBenchmarks.jl. This high-performance numerical framework, supporting our mathematical results, produces extensive benchmarks that demonstrate the significant performance advantages of our proposed technique versus several similar estimator designs. In particular, our single-iteration ensemble Kalman smoother (SIEnKS) is shown both to improve prediction / posterior accuracy and to simultaneously reduce the leading order cost of iterative, sequential smoothers in a variety of relevant test cases for operational short-range forecasts. These results are currently in open review in Geoscientific Model Development (Preprint gmd-2021-306) and the Journal of Open Source Software (Preprint #3976).

How to cite: Grudzien, C. and Bocquet, M.: A fast, single-iteration ensemble Kalman smoother for sequential data assimilation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1218, https://doi.org/10.5194/egusphere-egu22-1218, 2022.

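15:52–15:58 | EGU22-2604 | Virtual presentation
Koopman eigenfunctions estimation from reproducing kernel Hilbert space manifold, and ensemble data assimilation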
Gilles Tissot, Etienne Mémin, and Bérenger Hug

This study proposes a new framework for ensemble-based estimation of dynamical trajectories of a geophysical fluid flow system. For efficient estimation, the ensemble members are embedded in a set of evolving reproducing kernel Hilbert spaces (RKHS) defining a manifold of spaces, which we nickname Wonderland because of its analytical properties.

The method proposed here is designed to deal with very large-scale systems such as oceanic or meteorological flows, where it is out of the question to explore the whole attractor or to run very long simulations. Instead, we propose to learn the system locally, in phase space, from an ensemble of trajectories.

The novelty of the present work lies in the fact that the feature maps between the native space and the RKHS manifold are transported by the dynamical system. This creates, at any time, an isometry between the tangent RKHS at time t and the initial conditions. This has several important consequences. First, the kernel evaluations are constant along trajectories instead of being attached to a system state. As a result, a new ensemble member embedded in the RKHS manifold at the initial time can be very simply estimated at a later time. This framework displays striking properties. The Koopman and Perron-Frobenius operators on such an RKHS manifold are unitary, even though the system might be non-invertible. They are furthermore uniformly continuous (with bounded generators) and diagonalizable. As such, they can be rigorously expanded in exponential form.

This set of analytical properties enables us to provide a practical estimation of the Koopman eigenfunctions. In the proposed strategy, evaluations of these Koopman eigenfunctions at the ensemble members are exact. To perform robust estimations, the finite-time Lyapunov exponents associated with each Koopman eigenfunction (which are easily accessible on the RKHS manifold as well) are determined. On this basis, we are able to filter the kernel by removing contributions of the Koopman modes that exceed the predictability time. We show that this leads to robust estimations of new unknown trajectories. This framework allows us to write an ensemble-based data assimilation problem, where constant-in-time linear combination coefficients between ensemble members are sought in order to estimate the quasi-geostrophic (QG) flow based on noisy swath observations.

The methodology is demonstrated on a barotropic quasi-geostrophic model of a double gyre. After comparing various kernels and providing guidelines to adapt the kernel to the spread of the ensemble, we show isometry and Koopman-filtered reconstructions. Finally, the data assimilation is presented.

How to cite: Tissot, G., Mémin, E., and Hug, B.: Koopman eigenfunctions estimation from reproducing kernel Hilbert space manifold, and ensemble data assimilation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2604, https://doi.org/10.5194/egusphere-egu22-2604, 2022.

15:58–16:04 | EGU22-1497 | ECS | Virtual presentation
A 4D-Localized Particle Filter Method for Regional Data Assimilation at DWD
Nora Schenk, Anne Walter, and Roland Potthast

Nonlinear ensemble data assimilation methods like particle filters aim to improve the numerical weather prediction and the uncertainty quantification in a non-Gaussian setting. The localized adaptive particle filter (LAPF), introduced by R. Potthast, A. Walter and A. Rhodin in 2019, overcomes filter collapse in a high-dimensional framework. This particle filter was further developed by Walter et al. (2021) to the local mixture coefficients particle filter (LMCPF) which was tested within the global ICON model. In the LMCPF method the background distribution is approximated by Gaussian mixtures. After a classical resampling step, Bayes' formula is carried out explicitly under the assumption of a Gaussian distributed observation error. Furthermore, the particle uncertainty can be adjusted which affects the strength of the shift of the particles toward the observation. Lastly, Gaussian resampling is employed to increase the ensemble variability. All steps are carried out in ensemble space and observation localization is applied in the method.

Following a study by Kotsuki et al. (2021), we recently substituted the approximated particle weights in the LMCPF method with the exact Gaussian mixture weights, which leads to an increase of the effective ensemble size. Using the exact weights, Kotsuki et al. (2021) detected an improvement of the stability of the LMCPF method with respect to the inflation parameters within the SPEEDY model.

Furthermore, we explore the potential of the LMCPF with the exact particle weights in the kilometre-scale ensemble data assimilation (KENDA) system with the limited area mode of the ICON model (ICON-LAM) and compare the particle filter method to the localized ensemble transform Kalman filter (LETKF) which is operationally used at the German Meteorological Service (DWD). Both methods describe four-dimensional data assimilation schemes if the observation operators are applied during the model forward integration at the exact observation times and not only at analysis time. This leads to four-dimensional background error covariance matrices at times and locations of the observations which are employed to derive the analysis ensemble.

In addition to a mathematical introduction of the LMCPF method, we present experimental results for the LMCPF in comparison with the LETKF method in KENDA used at DWD for the ICON-LAM model. Moreover, we explore the improvements of the LMCPF with exact particle weights over the method with approximated weights.

How to cite: Schenk, N., Walter, A., and Potthast, R.: A 4D-Localized Particle Filter Method for Regional Data Assimilation at DWD, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1497, https://doi.org/10.5194/egusphere-egu22-1497, 2022.

Error estimation
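16:04–16:10 | EGU22-336 | ECS | Virtual presentation
Multilevel Monte Carlo estimation of background error covariances in ensemble variational data assimilation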
Mayeul Destouches, Paul Mycek, Jérémy Briant, Selime Gürol, Anthony Weaver, Serge Gratton, and Ehouarn Simon

In ensemble variational (EnVar) data assimilation systems, background error covariances are sampled from an ensemble of forecasts evolving with time. One possible way of generating this ensemble is by running an Ensemble of Data Assimilations (EDA) that samples all possible error sources (initial condition errors, boundary condition errors, model errors). Large ensemble sizes are desirable to minimize sampling errors, but generating a single ensemble member is usually expensive due to the cost of integrating the physical model. In practice, ensembles with coarser spatial resolutions are sometimes used, allowing for cheaper generation of individual members, and thus larger ensemble sizes.

Multilevel Monte Carlo (MLMC) methods propose to go beyond this usual trade-off between grid resolution and ensemble size, by expressing a fine-grid estimator as an astute combination of estimators computed on a hierarchy of spatial grids. Starting from a Monte Carlo covariance estimator on a coarse grid but with a large ensemble size, correction terms are added to form a quasi-telescopic sum. The correction terms come from EDAs of increasing spatial resolutions and decreasing ensemble sizes, with a pairwise stochastic coupling between EDAs of two successive resolutions. The expectation of this MLMC estimator is equal to the expectation of the Monte Carlo estimator on the finest grid, so that no bias is introduced by the coarse resolution forecasts. Without increasing the computational cost, MLMC effectively reduces the variance of the covariance estimator, i.e. reduces the sampling noise on covariances.
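The telescoping idea can be sketched with a two-level toy example. It is deliberately simplified: it estimates a scalar mean rather than a covariance matrix, and the coupling between EDAs is replaced by evaluating a cheap and an expensive function on the same random inputs. The functions `coarse` and `fine` are hypothetical stand-ins for low- and high-resolution forecasts.

```python
import numpy as np

def mlmc_two_level(coarse, fine, sampler, n_coarse, n_fine, rng):
    """Two-level MLMC estimate of E[fine(X)].

    Level 0: many cheap coarse evaluations.
    Level 1: few (fine - coarse) corrections, coupled by sharing the same inputs.
    """
    x0 = sampler(rng, n_coarse)
    level0 = np.mean(coarse(x0))
    x1 = sampler(rng, n_fine)                 # drawn independently of x0
    level1 = np.mean(fine(x1) - coarse(x1))   # shared inputs keep the correction variance small
    return float(level0 + level1)

def sampler(rng, n):
    return rng.normal(size=n)

def coarse(x):
    # Cheap, biased surrogate: E[coarse(X)] = 1.0 for standard normal X
    return x**2

def fine(x):
    # Expensive reference: E[fine(X)] = 1.1 for standard normal X
    return 1.1 * x**2

rng = np.random.default_rng(0)
estimate = mlmc_two_level(coarse, fine, sampler, 200_000, 2_000, rng)
```

The expectation of the sum equals the fine-level expectation, so the coarse bias cancels, while most samples are spent on the cheap level.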

We first present the theoretical basis of MLMC and how it can apply to the estimation of covariance matrices. An illustration with a quasi-geostrophic model is then presented. For a given computational budget, we compare three equal-cost methods to estimate background error covariances: (1) the usual single-resolution ensemble estimate, (2) a combination of estimates of various resolutions based on Bayesian Model Averaging and (3) the MLMC estimate. The methods are compared in terms of mean square error of the covariance estimators, and in terms of quality of the resulting analyses for one assimilation cycle. The role of covariance localization in each case is also briefly discussed.

This work is partially supported by 3IA Artificial and Natural Intelligence Toulouse Institute, French "Investing for the Future- PIA3" program under the Grant agreement ANR-19-PI3A-0004.

This project has received financial support from the CNRS through the 80Prime program.

How to cite: Destouches, M., Mycek, P., Briant, J., Gürol, S., Weaver, A., Gratton, S., and Simon, E.: Multilevel Monte Carlo estimation of background error covariances in ensemble variational data assimilation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-336, https://doi.org/10.5194/egusphere-egu22-336, 2022.

16:10–16:16 | EGU22-8918 | Virtual presentation
The Relationship between Desroziers and Three-Cornered Hat Methods
Ricardo Todling, Noureddine Semane, Rick Anthes, and Sean Healy

This note examines the relationship between two seemingly unrelated methods for estimating second-order statistics of relevance to data assimilation. The first method is due to Desroziers et al. (2005) and relies on residual statistics readily available from data assimilation applications. The second, the three-cornered hat (3CH) method due to Gray and Allan (1974), has only recently made its appearance in atmospheric sciences; it is generally formulated to use three data sets and seems in principle equally capable of deriving estimates of observation, background and analysis errors. The usefulness of either method lies in not requiring knowledge of the true value of the quantities at play. The Desroziers method derives its results by relying explicitly on the constraints associated with the data assimilation minimization problem; the 3CH method is general and its estimates hold as long as random errors in the three data sets of choice are independent. Establishing the relationship between the methods amounts to identifying the data sets of 3CH with the observation, background, and analysis associated with Desroziers. The choice of observation and background for two of the data sets of 3CH is acceptable under the typical assumption of independence in their errors. Specifying the third data set of 3CH as the analysis seems unreasonable, for analysis errors are by construction dependent on errors in both observations and background. This note finds that when the assumption of optimality required of Desroziers is applied to 3CH, the latter method recovers the Desroziers error estimates for observation and background. More interestingly, in contrast with the Desroziers estimate of errors in the analysis, the remaining corner of 3CH obtains the negative of the analysis error variance. An illustration of this finding is provided by deriving various uncertainties in bending angle.
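For reference, the standard three-cornered hat (3CH) estimate for three collocated data sets with mutually independent errors can be sketched as below. The synthetic data, error levels and function name are assumptions made for illustration; the bending-angle application of the abstract is not reproduced.

```python
import numpy as np

def three_cornered_hat(x, y, z):
    """Error variance estimates for three data sets with mutually independent errors."""
    var_xy = np.var(x - y)
    var_xz = np.var(x - z)
    var_yz = np.var(y - z)
    # Each pairwise difference variance is the sum of two error variances,
    # so each individual error variance follows from a signed half-sum.
    return (
        0.5 * (var_xy + var_xz - var_yz),
        0.5 * (var_xy + var_yz - var_xz),
        0.5 * (var_xz + var_yz - var_xy),
    )

rng = np.random.default_rng(0)
truth = rng.normal(size=200_000)
x = truth + rng.normal(scale=1.0, size=truth.size)   # error variance 1.0
y = truth + rng.normal(scale=0.5, size=truth.size)   # error variance 0.25
z = truth + rng.normal(scale=2.0, size=truth.size)   # error variance 4.0
est_x, est_y, est_z = three_cornered_hat(x, y, z)
```

The truth never enters the estimator, only pairwise differences. The formulas rely on independent errors, which is exactly the assumption that fails when one of the three data sets is an analysis built from the other two.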

 

How to cite: Todling, R., Semane, N., Anthes, R., and Healy, S.: The Relationship between Desroziers and Three-Cornered Hat Methods, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8918, https://doi.org/10.5194/egusphere-egu22-8918, 2022.

16:16–16:22 | EGU22-7448 | ECS | Presentation form not yet defined
Model error covariance estimation in observation space weak-constraint 4DVar
Anne Pein and Peter Jan Van Leeuwen
Weak-constraint 4DVar (WC-4DVar) not only takes errors in initial conditions into account but also assumes that the physical model itself is erroneous. As model errors, arising e.g. from unresolved processes, can be substantial in geoscience applications, the weak-constraint formulation yields more accurate results compared to its strong-constraint counterpart. Furthermore, accuracy in forecasting should be improved since the algorithm produces an optimal solution at the end of the assimilation window, instead of revised initial conditions. Finally, WC-4DVar allows for longer assimilation windows because of reduced sensitivity to initial conditions. 
 
However, for complex high-dimensional models, it is not simple to estimate the model error covariances, as needed in the WC-4DVar algorithm. A promising approach to address this challenge might look as follows: We start with a first-guess model error covariance, e.g. a scaled-down (in amplitude and length-scale) initial state (background) covariance (the so-called B-matrix) with added time correlation, and perform a WC-4DVar assimilation step. This yields, besides an optimised solution at the end of the assimilation window, estimates for the model errors. We then use these model errors to derive new model error covariances with which we perform the next assimilation step. This procedure is iterated.
 
In this talk, we present initial results of this approach applied to Burgers' equation, using an observation-space WC-4DVar algorithm (sometimes called PSAS). We outline the procedure, demonstrate its feasibility, and discuss extensions to real-world systems.

How to cite: Pein, A. and Van Leeuwen, P. J.: Model error covariance estimation in observation space weak-constraint 4DVar, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7448, https://doi.org/10.5194/egusphere-egu22-7448, 2022.

Dynamics
16:22–16:28 | EGU22-1164 | ECS | On-site presentation
On the physical nudging equations
Giovanni Conti, Ali Aydoğdu, Silvio Gualdi, Antonio Navarra, and Joe Tribbia

In this work we show how it is possible to derive a new set of nudging equations, a tool still used in many data assimilation problems, starting from statistical physics considerations and making use of stochastic parameterizations that take into account unresolved interactions. The fluctuations used are modelled as Gaussian white noise with zero mean. The derivation is based on the conditioned Langevin dynamics technique. Exploiting the relation between the Fokker–Planck and the Langevin equations, the nudging equations are derived for a maximally observed system that converges towards the observations in finite time. The new nudging term found is the analog of the so-called quantum potential of Bohmian mechanics. To make the new nudging equations feasible for practical computations, two approximations are developed and used as a basis for extending this tool to imperfectly observed systems. By means of a physical framework, in the zero-noise limit, all the physical nudging parameters are fixed by the model under study and there is no need to tune other free ad hoc variables. The zero-noise limit shows that, for the classical nudging equations as well, it is necessary to use dynamical information to correct the typical relaxation term. A comparison of these approximations with a 3DVar scheme that uses conjugate gradient minimization is then shown in a series of four twin experiments that exploit low-order chaotic models.

How to cite: Conti, G., Aydoğdu, A., Gualdi, S., Navarra, A., and Tribbia, J.: On the physical nudging equations, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1164, https://doi.org/10.5194/egusphere-egu22-1164, 2022.

16:28–16:34 | EGU22-8659 | Presentation form not yet defined
Asymptotic behavior of the forecast-assimilation process with unstable dynamics
Dan Crisan and Michael Ghil

Extensive numerical evidence for real and/or simulated data shows that the assimilation of observations has a stabilizing effect on unstable dynamics in numerical weather prediction and elsewhere. In this talk, I will discuss mathematically rigorous considerations showing why this is so. In particular, we prove that the expected value of the Wasserstein distance between the forecast-assimilation (FA) process starting from the true initial conditions and the FA process wrongly initialized can be controlled uniformly in time. Under suitable circumstances, the number of observations required to achieve this stabilization can be much smaller than the number of model variables. In particular, it suffices to observe the model's unstable degrees of freedom.

How to cite: Crisan, D. and Ghil, M.: Asymptotic behavior of the forecast-assimilation process with unstable dynamics, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8659, https://doi.org/10.5194/egusphere-egu22-8659, 2022.

16:34–16:40 | EGU22-1664 | ECS | Virtual presentation
Inferring the instability of a dynamical system from the skill of data assimilation exercises
Yumeng Chen, Alberto Carrassi, and Valerio Lucarini

Data assimilation (DA) aims at optimally merging observational data and model outputs to create a coherent statistical and dynamical picture of the system under investigation. Indeed, DA aims at minimizing the effect of observational and model error and at distilling the correct ingredients of its dynamics. DA is of critical importance for the analysis of systems featuring sensitive dependence on the initial conditions, as chaos wins over any finitely accurate knowledge of the state of the system, even in the absence of model error. Clearly, the skill of DA is guided by the properties of the dynamical system under investigation, as optimally merging observational data and model outputs is harder when strong instabilities are present. In this paper we reverse the usual angle on the problem and show that it is indeed possible to use the skill of DA to infer some basic properties of the tangent space of the system, which may be hard to compute in very high-dimensional systems. Here, we focus our attention on the first Lyapunov exponent and the Kolmogorov–Sinai entropy and perform numerical experiments on the Vissio–Lucarini 2020 model, a recently proposed generalization of the Lorenz 1996 model that is able to describe in a simple yet meaningful way the interplay between dynamical and thermodynamical variables.

How to cite: Chen, Y., Carrassi, A., and Lucarini, V.: Inferring the instability of a dynamical system from the skill of data assimilation exercises, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1664, https://doi.org/10.5194/egusphere-egu22-1664, 2022.

Coffee break
Chairpersons: Javier Amezcua, Lars Nerger
Coupled data assimilation
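17:00–17:10 | EGU22-4845 | ECS | solicited | Virtual presentation
Ensemble data assimilation of screen-level observations across the atmosphere-land interface enhanced by fingerprint operators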
Tobias Finn, Gernot Geppert, and Felix Ament

The sensible heat flux and evapotranspiration couple the atmospheric boundary layer and the land surface together. It has been shown that screen-level observations like the 2-metre-temperature contain information about land surface parameters such as the soil moisture. As model biases and parametrizations normally cause problems in operational land surface data assimilation, such screen-level observations are assimilated into the soil moisture with simplified data assimilation methods. Here, we take another point of view on the problem and show the potential of advanced ensemble data assimilation methods.

We ask what would happen if we had a perfect model and favorable conditions. With the limited-area TerrSysMP modelling framework, in a COSMO-CLM configuration, we perform idealized twin experiments for a seven-day period, where all differences between runs are only due to initial soil conditions or data assimilation. We assimilate sparsely distributed, synthetic 2-metre-temperature observations from a nature run into the soil moisture. In these idealized experiments, we are able to prove that a localized ensemble transform Kalman filter, similar to the one used for operational data assimilation in the mesoscale, can directly assimilate hourly instantaneous screen-level observations without the need for an additional optimal interpolation step. Here, we improve the soil moisture analysis by up to 50% compared to our open-loop run without data assimilation. Furthermore, taking temporal dependencies within a 24-hour window during the correction step into account and using a 4DEnVar-like localized ensemble Kalman smoother improves the analysis by a further 10%.

The approximation of the vertical covariances by the ensemble can nevertheless induce an overconfidence of the analysis, especially in ensemble smoothers where more observations are assimilated at once. Then, the potential of the observations cannot be fully used. An idea to circumvent such problems is to assimilate observational features instead of the raw observations to make the data assimilation problem simpler. We can explicitly construct such features by making use of characteristic fingerprints within the observations that point towards errors within the variable of interest; we term them fingerprint operators. Here, we will show two fingerprint operators for the 2-metre-temperature: the averaged temperature between 6 UTC and 18 UTC and the amplitude of a sine curve, fitted to 2-metre-temperature observations in a 24-hour window. These fingerprints represent that the soil moisture influences the daytime temperature and the diurnal cycle of the 2-metre-temperature. With these features, we retain useful information about the soil moisture and obtain similar results to the localized ensemble Kalman smoother. As our idealized experiments have by construction favorable conditions for ensemble Kalman smoothers, these results indicate a potential for fingerprint operators in coupled data assimilation across the atmosphere-land interface.
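The two fingerprint operators described above could be computed along the following lines. This is a minimal sketch, assuming hourly samples, an inclusive 6 to 18 UTC window and a linear least-squares fit of a 24-hour sinusoid; the function names and the synthetic temperature series are hypothetical and not taken from the study.

```python
import numpy as np

def daytime_mean(hours_utc, t2m):
    """Average 2-metre temperature over samples with 6 <= hour <= 18 UTC."""
    mask = (hours_utc >= 6) & (hours_utc <= 18)
    return float(np.mean(t2m[mask]))

def diurnal_amplitude(hours_utc, t2m):
    """Amplitude of a 24-hour sine curve fitted by linear least squares."""
    omega = 2.0 * np.pi / 24.0
    # Writing the sine with unknown phase as a*sin + b*cos makes the fit linear.
    design = np.column_stack([
        np.ones_like(hours_utc),
        np.sin(omega * hours_utc),
        np.cos(omega * hours_utc),
    ])
    coeffs, *_ = np.linalg.lstsq(design, t2m, rcond=None)
    return float(np.hypot(coeffs[1], coeffs[2]))

# Synthetic 24-hour window: 285 K mean with a 4 K diurnal cycle (illustrative values)
hours = np.arange(24, dtype=float)
t2m = 285.0 + 4.0 * np.sin(2.0 * np.pi * (hours - 9.0) / 24.0)

mean_6_18 = daytime_mean(hours, t2m)
amplitude = diurnal_amplitude(hours, t2m)
```

Each operator reduces 24 hourly observations to a single scalar feature, which is what makes the assimilation problem smaller than in the smoother that uses all raw observations.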

How to cite: Finn, T., Geppert, G., and Ament, F.: Ensemble data assimilation of screen-level observations across the atmosphere-land interface enhanced by fingerprint operators, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4845, https://doi.org/10.5194/egusphere-egu22-4845, 2022.

17:10–17:16 | EGU22-2698 | Virtual presentation
Quality Control Methods in Ocean-Sea ice Coupled Data Assimilation
Shastri Paturi, Alexandra Bozec, Eric Chassignet, Zulema Garraffo, Avichal Mehra, and Daryl Kleist

The purpose of employing data assimilation methods in operational ocean forecasting systems is to provide a good initialization for the models, which depends on good-quality ocean observations being assimilated. Using or accepting erroneous data can result in an inaccurate analysis; conversely, rejecting extreme but valid data can result in missing important events.

In this study, two ocean-sea ice coupled systems are considered: HYCOM-CICE4 and MOM6-CICE6, at ¼-deg horizontal resolution and with 41 vertical layers. The two coupled models are initialized from the World Ocean Atlas 2018 (WOA) temperature and salinity climatology for a period of 20 years. Both models are forced with GEFS (Global Ensemble Forecast System created by the National Centers for Environmental Prediction: NCEP). The data assimilation is performed on a 24-hr cycle using RTOFS-DA (Real-time Ocean Forecasting system-DA; 3DVAR) for HYCOM-CICE4 and SOCA (Sea ice Ocean Coupled Assimilation; 3DVAR) for MOM6-CICE6 to compare the data Quality Control (QC) methods. The ocean data being assimilated include satellite sea surface temperature (SST) and sea surface salinity (SSS), in-situ temperature and salinity, absolute dynamic topography (ADT), and sea ice concentration.

The QC in RTOFS-DA and SOCA is fully automated and is performed by applying various filters (e.g., a land-sea area fraction filter to eliminate satellite data near the coast, elimination of temperature inversions in in-situ profile data, etc.). The various QC methods in both DA systems are described. The results of the analysis and the 24-hr forecast are compared against independent observations, and statistics of the data accepted and rejected by the two DA systems are presented and discussed.

How to cite: Paturi, S., Bozec, A., Chassignet, E., Garraffo, Z., Mehra, A., and Kleist, D.: Quality Control Methods in Ocean-Sea ice Coupled Data Assimilation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2698, https://doi.org/10.5194/egusphere-egu22-2698, 2022.

17:16–17:22
|
EGU22-3616
|
On-site presentation
|
Iuliia Polkova, Guokun Lyu, Detlef Stammer, Silke Schubert, Frank Lunkeit, and Armin Köhl

The demand for reliable climate predictions is growing across various socio-economic sectors. Although predictability studies show that Earth System Models (ESMs) can predict important climate variables, the predictions suffer from model errors and initialization shocks that limit predictability. The magnitude of this effect is difficult to quantify unless one can perform experiments in dynamically consistent and model-consistent settings that contrast the various sources of initialization shocks against each other. This is the idea of our study, which concerns quantifying and understanding the impact of deriving dynamically balanced initial conditions on the decadal prediction skill.

Among a variety of coupled data assimilation (CDA) methods, the adjoint method is one of the most promising because its result is dynamically consistent with the ESM equations; however, the method might also be one of the most demanding to design and maintain. Here, we use the coupled adjoint model developed for the ESM of intermediate complexity CESAM (Centrum für Erdsystemforschung und Nachhaltigkeit Erdsystem Assimilations-Modell) to produce a coupled ocean-atmosphere reanalysis. So far, we have prepared and tested the forward and adjoint CESAM for upcoming decadal climate prediction experiments. We present the performance of the forward CESAM in terms of the 20th-century historical simulations, which are typically used as the benchmark for comparing initialized versus uninitialized climate simulations. We also present the setup for the adjoint CESAM as well as the initial CDA experiments. In a next step, these CDA experiments will serve as a source of initial conditions for ensembles of retrospective decadal predictions. In a model-consistent approach, the study will compare initialization based on the coupled ocean-atmosphere reanalysis with the strategy, widespread in decadal prediction studies, of nudging toward ocean and atmosphere reanalyses, which are usually external to the prediction system and uncoupled. The results of this study aim to guide future initialization developments for comprehensive ESMs.

How to cite: Polkova, I., Lyu, G., Stammer, D., Schubert, S., Lunkeit, F., and Köhl, A.: A design of the optimal setup for a coupled data assimilation for decadal climate predictions, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3616, https://doi.org/10.5194/egusphere-egu22-3616, 2022.

17:22–17:28
|
EGU22-7395
|
ECS
|
On-site presentation
|
Bastian Waldowski, Insa Neuweiler, and Emilio Sánchez-León

We test the improvement of flux predictions with data assimilation (DA) in a coupled land surface/subsurface model. We present results of DA experiments in an idealized test case with an extent of 1 km x 5 km x 50 m. Our model considers multiple heterogeneous soil units, different plant functional types and a sophisticated topographical design chosen to induce lateral flow and rivers in specific areas. We use TSMP-PDAF to couple the land-surface model CLM and the subsurface/surface flow model ParFlow with the DA framework PDAF. We use a Localized Ensemble Kalman Filter (LEnKF) with an ensemble of 93 members. We account for uncertainty in the atmosphere, soil properties and initial conditions through different atmospheric forcings, distinct heterogeneous soil parameter distributions and an individual spinup for each ensemble member. The ensemble, which has a horizontal grid resolution of 40 m, is updated with virtual measurements from a high-resolution (10 m) reference model.
In the scope of this work, we address the impact of updating different state variables (soil moisture and pressure head) on groundwater recharge, lateral subsurface flow, surface runoff, and evapotranspiration. While surface runoff and evapotranspiration directly depend on pressure head and soil moisture, subsurface flow depends on pressure head gradients. For groundwater recharge, our estimate depends on groundwater storage changes (which can directly be enforced by the updates during DA) as well as subsurface flow. To investigate if DA can directly improve these fluxes, we run multiple experiments with different observation frequencies and localization radii. Further, we investigate if there are improvements in the fluxes during open loop forecasting periods subsequent to DA.

How to cite: Waldowski, B., Neuweiler, I., and Sánchez-León, E.: Effects of data assimilation on different fluxes of a fully coupled land surface/subsurface model, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7395, https://doi.org/10.5194/egusphere-egu22-7395, 2022.

17:28–17:34
|
EGU22-8869
|
ECS
|
Virtual presentation
|
Tabea Gleiter, Tijana Janjic, and Nan Chen

The Madden-Julian oscillation (MJO) is the dominant component of tropical intraseasonal variability, with wide-reaching impacts even on extratropical weather and climate patterns. However, predicting the MJO is challenging. One reason is the suboptimal state estimates obtained with standard data assimilation (DA) approaches. These are typically based on filtering methods with Gaussian approximations and do not consider physical properties that are specifically important for the MJO.

In our recent paper (Gleiter et al. 2022), a constrained ensemble DA method is applied to study the impact of different physical constraints on the state estimation and prediction of the MJO with the Skeleton model. The quadratic programming ensemble (QPEns) algorithm used here extends the standard stochastic ensemble Kalman filter (EnKF) with specifiable constraints on the updates of all ensemble members. This makes it possible to recover physically more consistent states and to respect possible associated non-Gaussian statistics. Our results demonstrate an overall improvement in the filtering and forecast skill when the model's total energy is conserved in the initial condition. The degree of benefit is found to depend on the observational setup and the strength of the model's nonlinear dynamics. It is also shown that, even in cases where the statistical error in some waves remains comparable to that of the stochastic EnKF during the DA stage, their prediction is remarkably improved when using the initial state resulting from the QPEns.

Gleiter, T., T. Janjic, N. Chen, 2022, Ensemble Kalman Filter based Data Assimilation for Tropical Waves in the MJO Skeleton Model, QJR Meteorol Soc., https://doi.org/10.1002/qj.4245

How to cite: Gleiter, T., Janjic, T., and Chen, N.: Ensemble Kalman Filter based Data Assimilation for Tropical Waves in the MJO Skeleton Model, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8869, https://doi.org/10.5194/egusphere-egu22-8869, 2022.

Novel applications
17:34–17:40
|
EGU22-12279
|
Highlight
|
Presentation form not yet defined
|
Arundhuti Banerjee, Ylona van Dinther, and Femke Vossepoel

Forecasting earthquake occurrence is a challenging endeavor, which will ultimately require a combination of observations and physics-based models. Data assimilation may help to combine these and their uncertainties in a statistically sound manner. To understand the potential of ensemble data assimilation, we investigate whether the fault stress state can be estimated and forecasted in the presence of a bias in a friction parameter. In a perfect model test, we introduce different degrees of bias in the rate-and-state parameter b, which describes the evolution of frictional strength with fault slip velocity and thus impacts earthquake slip and the subsequent recurrence interval. Our forward model is a simplified, zero-dimensional (0D) Burridge-Knopoff spring-block system with a rate- and state-dependent friction formulation using a ‘slip law’. We assimilate synthetic observations of fault shear stress and slip rate, together with their large uncertainties. We compare state estimation with joint state-parameter estimation using a sequential importance resampling particle filter by evaluating the quality of the estimated fault stress probability density functions (PDFs).

The results of the study indicate that state estimation works well for systems with low (3%) to intermediate (15%) bias. The performance for the case of intermediate bias can be improved by increasing the model error combined with double resampling in the particle filter. For a large friction-parameter bias (42%), we show that state-parameter estimation is the only way to correct the bias. This is an important result, because it shows that state-parameter estimation is able to identify trade-offs and separate error contributions coming from stress state and friction parameters. Furthermore, the results of this study can be applied to other data assimilation applications involving models that are particularly vulnerable to parameter biases.
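
The idea of joint state-parameter estimation with a sequential importance resampling (SIR) particle filter can be sketched as follows. This is a generic illustration under strong assumptions, not the authors' setup: the forward model is a toy linear relation rather than a Burridge-Knopoff system, the observation error is Gaussian, and the names `sir_step`, `observe` and `theta` are hypothetical. Each particle carries an augmented vector of state and parameter, so resampling on the observed state also updates the unobserved parameter.

```python
import numpy as np


def sir_step(particles, weights, obs, obs_std, observe, rng):
    """One SIR update: reweight by a Gaussian likelihood, then resample."""
    residual = obs - observe(particles)
    log_w = np.log(weights) - 0.5 * (residual / obs_std) ** 2
    log_w -= log_w.max()                      # guard against underflow
    w = np.exp(log_w)
    w /= w.sum()
    n = len(particles)
    idx = rng.choice(n, size=n, p=w)          # multinomial resampling
    return particles[idx], np.full(n, 1.0 / n)


rng = np.random.default_rng(0)
n_particles = 2000

# Augmented particles [state x, parameter theta]; the true parameter is 1.0.
theta = rng.uniform(0.5, 1.5, size=n_particles)                 # uncertain parameter
x = 10.0 * theta + rng.normal(0.0, 0.1, size=n_particles)       # toy forward model
particles = np.column_stack([x, theta])
weights = np.full(n_particles, 1.0 / n_particles)

# Only the state is observed; the parameter is corrected through its
# correlation with the state inside the particle set.
posterior, weights = sir_step(
    particles, weights, obs=10.0, obs_std=0.5,
    observe=lambda p: p[:, 0], rng=rng,
)
theta_estimate = posterior[:, 1].mean()
```

In practice, resampled parameters are usually perturbed slightly to avoid particle collapse; that step is omitted here for brevity.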

How to cite: Banerjee, A., van Dinther, Y., and Vossepoel, F.: Estimating states and parameters in earthquake sequence models in the presence of a parameter bias, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12279, https://doi.org/10.5194/egusphere-egu22-12279, 2022.

17:40–17:46
|
EGU22-8578
|
ECS
|
On-site presentation
Pierre Le Bras, Pierre Ailliot, Noémie Le Carrer, Juan Ruiz, Florian Sévellec, and Pierre Tandeo

The multi-model ensemble approach is applied in geosciences to provide better predictions or projections, by weighting the outputs from different dynamical models. Basically, the weighting procedure relies on the choice of a performance metric to measure the closeness of individual model outputs to actual observations. The highest weight is then given to the model that best matches the observations, and so forth. Model weights can be used to constrain both the mean and the uncertainty in future projections of climate models.

In this study, we seek to combine different parameterizations of an idealized three-dimensional chaotic model of the Atlantic Meridional Overturning Circulation. One of the parameterizations plays the role of the observations. Each parameterization is evaluated online in a data assimilation framework using the EnKF by comparing the forecasts with the observations.

Traditional data assimilation procedures require access to the model equations, resulting in significant computational costs to run multiple model simulations to obtain forecasts at each time step. Here, a machine learning approach is implemented to provide the forecasts (i.e., analog forecasting). For each parameterization, the classical way of producing the forecasts is, in our case, replaced by an already existing catalog of trajectory time evolutions (e.g., long-term simulations), which allows the model dynamics to be emulated statistically. This data-driven methodology retains the benefits of the classical EnKF (i.e., optimal initial conditions and the consideration of uncertainties) at low computational cost. For each model parameterization, a local performance metric (namely, the contextual model evidence) is computed at each time step in order to compare observations and model forecasts. This metric, based on the innovation likelihood, is sensitive to differences in the model dynamics and takes into account the uncertainties of both the forecasts and the observations. To validate the methodology, different case studies are performed with various sensitivity tests (e.g., changing the parameterization used for the observations).
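
A metric based on the innovation likelihood can be sketched as below. The sketch assumes a linear observation operator `H`, Gaussian forecast and observation errors, and a forecast covariance estimated from the ensemble; the function name `innovation_log_likelihood`, the softmax-style conversion to weights and all numerical values are illustrative assumptions, not the definition used in the study.

```python
import numpy as np


def innovation_log_likelihood(y, ensemble_forecast, H, R):
    """Gaussian log-likelihood of the innovation d = y - H x_f.

    ensemble_forecast has shape (n_members, n_state); the innovation
    covariance is S = H P_f H^T + R, with P_f the ensemble covariance.
    """
    x_mean = ensemble_forecast.mean(axis=0)
    P_f = np.atleast_2d(np.cov(ensemble_forecast, rowvar=False))
    d = y - H @ x_mean
    S = H @ P_f @ H.T + R
    _, logdet = np.linalg.slogdet(S)
    quad = d @ np.linalg.solve(S, d)
    return -0.5 * (quad + logdet + d.size * np.log(2.0 * np.pi))


rng = np.random.default_rng(1)
H = np.eye(3)                      # fully observed three-variable state
R = 0.1 * np.eye(3)                # observation-error covariance
y = np.array([1.0, 0.0, -1.0])     # observation at one time step

# Two hypothetical parameterizations: one forecasts close to y, one is biased.
good = y + rng.normal(0.0, 0.3, size=(50, 3))
poor = y + 2.0 + rng.normal(0.0, 0.3, size=(50, 3))

log_evidence = np.array(
    [innovation_log_likelihood(y, ens, H, R) for ens in (good, poor)]
)
# Normalized weights from the log-evidences (shifted for numerical stability).
model_weights = np.exp(log_evidence - log_evidence.max())
model_weights /= model_weights.sum()
```

Because the innovation covariance includes both the forecast spread and the observation error, a confident but wrong forecast is penalized more heavily than an uncertain one with the same mean error.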

The results of the proposed weighting scheme on projections are discussed using different quality metrics and compared with benchmark methodologies. These include the equal-weighting approach (also called “model democracy”) and the direct comparison between the climatological probability distributions of simulations and observations.

How to cite: Le Bras, P., Ailliot, P., Le Carrer, N., Ruiz, J., Sévellec, F., and Tandeo, P.: Data-driven data assimilation to better characterize both accuracy and uncertainty of climate projections: a case study with an idealized chaotic AMOC model, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8578, https://doi.org/10.5194/egusphere-egu22-8578, 2022.

17:46–17:52
|
EGU22-10384
|
ECS
|
Presentation form not yet defined
Emilie Rouzies, Claire Lauvernet, and Arthur Vidard

Assessing pesticide transfers and fate in agricultural catchments is a major challenge for protecting water resources and aquatic organisms. To do so, physically based, spatialized hydrological models are useful tools, as they can be used to set up relevant mitigation strategies. The PESHMELBA model (Rouzies et al. 2019) is one such model; it focuses on accurately simulating water and pesticide transfers in both the surface and the subsurface compartments of the soil. This model also aims at explicitly integrating and assessing the impact of landscape structures such as hedges, vegetative filter strips or ditches on transfers. To this end, the PESHMELBA model is characterized by a highly modular structure that relies on various code units standing for different physical processes in the different soil compartments. These code units are coupled in a dedicated framework to reach a complete representation of the catchment with interacting processes. The resulting structure is quite complex and makes it difficult to quantify and reduce the uncertainties associated with the simulation outputs.

In this study, we aim at setting up a relevant data assimilation framework to reduce the uncertainty in the PESHMELBA coupled surface/subsurface water flow and reactive solute transport model. To do so, we test several data assimilation methods on hydrological and pesticide variables describing the catchment behavior. At first, these methods are implemented by combining the PESHMELBA model and surface moisture satellite images at the small catchment scale. Different stochastic filtering and smoothing assimilation methods are explored: the Ensemble Kalman Filter, the Ensemble Smoother with Multiple Data Assimilation and the iterative Ensemble Kalman Smoother. Their abilities to retrieve moisture and pesticide concentration in the observed surface compartment, but also in the deeper soil, which is not observed, are assessed. Furthermore, the experiments also aim at retrieving some input parameters that characterize these different soil compartments.

Preliminary results on this part show that all tested methods succeed only in retrieving surface moisture. The Ensemble Smoother is shown to particularly outperform the other methods, as it fully integrates the system dynamics. However, its performance is much more limited when retrieving moisture and input parameters in the deeper compartment, due to poor correlations between the surface and the subsurface compartments. To overcome this limitation, other sources of data are gradually integrated in the DA framework. The process proves successful, and we explore how the corrections from the DA process can propagate to other compartments, such as the river streamflow and pesticide-related variables.

How to cite: Rouzies, E., Lauvernet, C., and Vidard, A.: Which data assimilation method and data source for a multi-compartment hydrology/water quality model? Application on the PESHMELBA model in a small agricultural catchment, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10384, https://doi.org/10.5194/egusphere-egu22-10384, 2022.

17:52–17:58
|
EGU22-1560
|
ECS
|
Highlight
|
Virtual presentation
|
Yue Ying, Jeffrey Anderson, and Laurent Bertino

Position errors in coherent features have been a challenging problem for data assimilation (DA) due to their high nonlinearity. To effectively reduce position errors, a multiscale alignment (MSA) method was introduced to compute ensemble Kalman filter (EnKF) updates on a sequence of model states at low to high resolutions (large to small scales). The large-scale state has less nonlinearity due to position errors; therefore, linear EnKF updates are optimal. The large-scale analysis increments are then used to compute the displacement vectors that warp the model grid, reduce position errors and precondition the state at smaller scales before the EnKF update is computed again. This study further tests the performance of the MSA method in an idealized vortex model. The asymptotic behavior is documented for a multiscale solution as the number of scales (Ns) increases. We show that the optimal Ns depends on the degree of nonlinearity caused by the position errors. When feature-based observations (such as the vortex position) are used, the MSA performs well with Ns  3 no matter how large the position errors are. A challenging scenario is identified for the MSA method, when the large-scale background flow is incoherent with the small-scale vortex position error (a deviation from the coherence assumption). In cycling DA experiments, the MSA performs better than the traditional EnKF at equal cost (using a decreased ensemble size for the MSA to compensate for its increased cost when Ns > 1), showing good scalability for real applications and potential for improving prediction skill in many multiscale Earth systems.

How to cite: Ying, Y., Anderson, J., and Bertino, L.: Performance of the multiscale alignment ensemble filter in reducing vortex position errors, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1560, https://doi.org/10.5194/egusphere-egu22-1560, 2022.

17:58–18:04
|
EGU22-10717
|
ECS
|
On-site presentation
Amol Patil, Benjamin Fersch, Harrie-Jan Hendricks Franssen, and Harald Kunstmann

The Cosmic-Ray Neutron Sensing (CRNS) technology determines soil moisture for a few tens of hectares in a non-invasive way. These measurements, however, can be used to extend soil moisture characterization to regional scales using data assimilation. In the present study, we deployed the Ensemble Adjustment Kalman Filter (EAKF) to assimilate the CRNS neutron counts in order to update the spatial soil moisture, soil infiltration, and evapotranspiration parameters of the Noah-MP land surface model, which is also part of the WRF-Hydro modelling system. The study was conducted in the southern part of Germany, which includes the Rott and Ammer catchments within the TERENO Pre-Alpine observatory. The assimilation was carried out for both a Noah-MP standalone scenario with observed rainfall as input and a coupled WRF-Hydro scenario with simulated rainfall, to fully evaluate the added value of the assimilation. The assimilation performance was analysed at local and regional scales using independent soil moisture observations across the modelling domain. During the assimilation period, the Noah-MP standalone findings demonstrate a significant improvement in field-scale soil moisture characterisation. The RMSE of simulated soil moisture was decreased by up to 66 % at field scale and up to 23 % at catchment scale. Additionally, the spatial patterns in the field-scale soil moisture showed improvement, with a reduction in spatial bias of 0.025 cm³/cm³. The initial results from the coupled WRF-Hydro scenario demonstrate that the soil moisture and parameter estimation experiment had a significant impact on estimated soil moisture, humidity, and evapotranspiration at regional scale. These findings support the use of the CRNS technique to improve land surface and coupled hydro-atmospheric modelling.

How to cite: Patil, A., Fersch, B., Hendricks Franssen, H.-J., and Kunstmann, H.: Improved soil moisture-atmospheric boundary layer interactions by assimilation of Cosmic-Ray Neutron counts, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10717, https://doi.org/10.5194/egusphere-egu22-10717, 2022.

18:04–18:10
|
EGU22-5424
|
ECS
|
Highlight
|
On-site presentation
|
Ondřej Tichý, Nikolaos Evangeliou, and Václav Šmídl
The goal of this contribution is to explore a two-stage inversion algorithm for spatio-temporal emission estimation (2D and time) from deposition measurements of microplastics and microfibers in the Western USA. We consider the linear inversion model formulated as y = Mx, where y is the measurement vector, M is the source-receptor sensitivity matrix computed using the Lagrangian particle dispersion model FLEXPART, and x is the unknown emission vector from a given spatial element. The inverse problem is typically ill-conditioned due to the sparsity of the measurements; hence, we propose a two-stage algorithm for inversions of this type. First, we run the inversion algorithm for the whole spatial domain, thus obtaining an averaged emission for each spatial element of the considered domain. Second, we use the estimated emission from the first step (common to all spatial elements) as a prior emission in the second step, where the inversion problem is considered for each spatial element separately. We demonstrate that this approach regularizes the inversion problem of spatio-temporal emission from sparse measurements, specifically for microplastics and microfibers emission estimation in the Western USA.
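
The two-stage idea can be sketched as follows. This is one possible reading of the description above, under assumptions that are not in the abstract: both stages use a simple Tikhonov (ridge) penalty, the columns of M are ordered element by element, and in the second stage each element is solved with the others held at the first-stage estimate. The function names `ridge_with_prior` and `two_stage_inversion`, the regularization weights and the synthetic data are hypothetical; non-negativity of the emissions is not enforced.

```python
import numpy as np


def ridge_with_prior(M, y, prior, lam):
    """Closed-form solution of min ||y - M x||^2 + lam * ||x - prior||^2."""
    A = M.T @ M + lam * np.eye(M.shape[1])
    b = M.T @ y + lam * prior
    return np.linalg.solve(A, b)


def two_stage_inversion(M, y, n_space, n_time, lam1=1e-2, lam2=1.0):
    """Stage 1: one emission profile shared by all elements.
    Stage 2: per-element inversion with the stage-1 profile as prior."""
    # Columns of M are assumed ordered as [element 0 over time, element 1, ...].
    blocks = M.reshape(M.shape[0], n_space, n_time)

    # Stage 1: whole-domain inversion for a common temporal profile.
    M_shared = blocks.sum(axis=1)
    shared = ridge_with_prior(M_shared, y, np.zeros(n_time), lam1)

    # Stage 2: each spatial element separately, others fixed at the prior.
    x = np.tile(shared, (n_space, 1))
    prior_signal = M_shared @ shared
    for i in range(n_space):
        others = prior_signal - blocks[:, i, :] @ shared
        x[i] = ridge_with_prior(blocks[:, i, :], y - others, shared, lam2)
    return shared, x


# Synthetic check: every element emits the same profile, noise-free data.
rng = np.random.default_rng(2)
n_obs, n_space, n_time = 40, 3, 4
M = rng.uniform(0.0, 1.0, size=(n_obs, n_space * n_time))
profile_true = np.array([1.0, 3.0, 2.0, 1.5])
y = M @ np.tile(profile_true, n_space)

shared, x_est = two_stage_inversion(M, y, n_space, n_time)
```

The second-stage penalty pulls each element towards the common profile, so elements that are poorly constrained by the sparse measurements fall back on the domain-averaged estimate.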

How to cite: Tichý, O., Evangeliou, N., and Šmídl, V.: Two stage inversion method for microplastics emission estimation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5424, https://doi.org/10.5194/egusphere-egu22-5424, 2022.

18:10–18:30