Displays

NP5.1

Data Assimilation, Predictability, Error Identification and Uncertainty Quantification in Geosciences

Inverse Problems are encountered in many fields of geosciences. One class of inverse problems, in the context of predictability, is assimilation of observations in dynamical models of the system under study. Furthermore, objective quantification of the uncertainty on the results obtained is the object of growing concern and interest.

This session will be devoted to the presentation and discussion of methods for inverse problems, data assimilation and associated uncertainty quantification, in ocean and atmosphere dynamics, atmospheric chemistry, hydrology, climate science, solid earth geophysics and, more generally, in all fields of geosciences.

We encourage presentations on advanced methods, and related mathematical developments, suitable for situations in which local linear and Gaussian hypotheses are not valid and/or for situations in which significant model
errors are present. We also welcome contributions dealing with algorithmic aspects and numerical implementation of the solution of inverse problems and quantification of the associated uncertainty, as well as novel methodologies at the crossroad between data assimilation and purely data-driven, machine-learning-type algorithms.

Invited speakers:
Luca Cantarello (University of Leeds)
Jean-Michel Brankart (University of Grenoble)

Public information:
In the session we will encourage all participants to present their work. These brief presentations will last about 5 minutes.

Share:

Convener: Javier Amezcua | Co-conveners: Natale Alberto Carrassi, Tijana Janjic, Olivier Talagrand

Displays

| Attendance Tue, 05 May, 08:30–10:15 (CEST)

Files for download

Download all presentations (34MB)

Chat time: Tuesday, 5 May 2020, 08:30–10:15

Chairperson: Javier Amezcua

D2841 |

EGU2020-332

| Highlight

Idealised satellite data assimilation experiments with clouds and precipitation

Luca Cantarello, Onno Bokhove, Gordon Inverarity, Stefano Migliorini, and Steve Tobias

Operational data assimilation (DA) schemes rely significantly on satellite observations with much research aimed at their optimisation, leading to a great deal of progress. Here, we investigate the impact of the spatial-temporal variability of satellite observations for DA: is there a case for concentrating effort into the assimilation of small-scale convective features over the large-scale dynamics, or vice versa?

We conduct our study in an isentropic one-and-a-half layer model that mimics convection and precipitation, a revised and more realistic version of the idealised model based on the shallow water equations in [1,2]. Forecast-assimilation experiments are performed by means of a twin-setting configuration, in which pseudo-observations from a high-resolution nature run are combined with lower-resolution forecasts. The DA algorithm used is the deterministic Ensemble Kalman Filter (see [3]). We focus our research on polar-orbit satellites regarding emitted microwave radiation.

We have developed a new observation operator and a representative observing system in which both ground and satellite observations can be assimilated. The convection thresholds in the model are used as a proxy for cloud formation, clouds, and precipitation. To imitate the use of weighting functions in real satellite applications, radiance values are computed as a weighted sum with contributions from both layers. In the presence of clouds and/or precipitation, we model the response of passive microwave radiation to either precipitating or non-precipitating clouds. The horizontal resolution of satellite observations can be varied to investigate the impact of scale-dependency on the analysis.

New, preliminary results from experiments including both transverse jets and rotation in a periodic domain will be reported and discussed.

References:

[1] Kent, T., Bokhove, O., & Tobias, S. (2017). Dynamics of an idealized fluid model for investigating convective-scale data assimilation. Tellus A: Dynamic Meteorology and Oceanography, 69(1), 1369332.

[2] Kent, T. (2016). An idealised fluid model for convective-scale NWP: dynamics and data assimilation (Doctoral dissertation, PhD Thesis, University of Leeds).

[3] Sakov, P., & Oke, P. R. (2008). A deterministic formulation of the ensemble Kalman filter: an alternative to ensemble square root filters. Tellus A: Dynamic Meteorology and Oceanography, 60(2), 361-371.

How to cite: Cantarello, L., Bokhove, O., Inverarity, G., Migliorini, S., and Tobias, S.: Idealised satellite data assimilation experiments with clouds and precipitation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-332, https://doi.org/10.5194/egusphere-egu2020-332, 2020.

D2842 |

EGU2020-2182

| Highlight

Implicitly localized MCMC sampler to cope with nonlocal/nonlinear data constraints in large-size inverse problems

Jean-Michel Brankart

Many practical applications involve the resolution of large-size inverse problems, without providing more than a moderate-size sample to describe the prior probability distribution. In this situation, additional information must be supplied to augment the effective dimension of the available sample, for instance using a covariance localization approach. In this study, it is suggested that covariance localization can be efficiently applied to an approximate variant of the Metropolis/Hastings algorithm, by modulating the ensemble members by the large-scale patterns of other members. Modulation is used to design a (global) proposal probability distribution (i) that can be sampled at a very low cost, (ii) that automatically accounts for a localized prior covariance, and (iii) that leads to an efficient sampler for the augmented prior probability distribution or for the posterior probability distribution. The resulting algorithm is applied to an academic example, illustrating (i) the effectiveness of covariance localization, (ii) the ability of the method to deal with nonlocal/nonlinear observation operators and non-Gaussian observation errors, (iii) the reliability, resolution and optimality of the updated ensemble, using probabilistic scores appropriate to a non-Gaussian posterior distribution, and (iv) the scalability of the algorithm as a function of the size of the problem. The codes are openly available from github.com/brankart/ensdam.

How to cite: Brankart, J.-M.: Implicitly localized MCMC sampler to cope with nonlocal/nonlinear data constraints in large-size inverse problems, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-2182, https://doi.org/10.5194/egusphere-egu2020-2182, 2020.

D2843 |

EGU2020-19831

| Highlight

Development of a nonlinear ensemble data assimilation method with global state-space covariance localization

not presented

Milija Zupanski

High-dimensional ensemble data assimilation applications require error covariance localization in order to address the problem of insufficient degrees of freedom, typically accomplished using the observation-space covariance localization. However, this creates a challenge for vertically integrated observations, such as satellite radiances, aerosol optical depth, etc., since the exact observation location in vertical does not exist. For nonlinear problems, there is an implied inconsistency in iterative minimization due to using observation-space localization which effectively prevents finding the optimal global minimizing solution. Using state-space localization, however, in principal resolves both issues associated with observation space localization.

In this work we present a new nonlinear ensemble data assimilation method that employs covariance localization in state space and finds an optimal analysis solution. The new method resembles “modified ensembles” in the sense that ensemble size is increased in the analysis, but it differs in methodology used to create ensemble modifications, calculate the analysis error covariance, and define the initial ensemble perturbations for data assimilation cycling. From a practical point of view, the new method is considerably more efficient and potentially applicable to realistic high-dimensional data assimilation problems. A distinct characteristic of the new algorithm is that the localized error covariance and minimization are global, i.e. explicitly defined over all state points. The presentation will focus on examining feasible options for estimating the analysis error covariance and for defining the initial ensemble perturbations.

How to cite: Zupanski, M.: Development of a nonlinear ensemble data assimilation method with global state-space covariance localization, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19831, https://doi.org/10.5194/egusphere-egu2020-19831, 2020.

D2844 |

EGU2020-22521

| Highlight

A framework for causality under data assimilation

Nachiketa Chakraborty, Peter Jan van Leeuwen, Michael de Caria, and Manuel Pulido

D2845 |

EGU2020-8979

Spatio-temporal Inversion using the Selection Kalman Model

not presented

Maxime Conjard and Henning Omre

The challenge in data assimilation for models representing spatio-temporal phenomena is made harder when the spatial histogram of the variable of interest appears with multiple modes. Pollution source identification constitutes one example where the pollution release represents an extreme event in a fairly homogeneous background. Consequently, our prior belief is that the spatial histogram is bimodal. The traditional Kalman model is based on a Gaussian initial distribution and Gauss-linear dynamic and observation models. This model is contained in the class of Gaussian distribution and is therefore analytically tractable. These properties that make its strenght also render it unsuitable for representing multimodality. To address the issue, we define the selection Kalman model. It is based on a selection-Gaussian initial distribution and Gauss-linear dynamic and observation models. The selection-Gaussian distribution may represent multimodality, skewness and peakedness. It can be seen as a generalization of the Gaussian distribution. The proposed selection Kalman model is contained in the class of selection-Gaussian distributions and therefore analytically tractable. The recursive algorithm used for assessing the selection Kalman model is specified. We present a synthetic case study of spatio-temporal inversion of an initial state containing an extreme event. The study is inspired by pollution monitoring. The results suggest that the use of the selection Kalman model offers significant improvements compared to the traditional Kalman model when reconstructing discontinuous initial states.

How to cite: Conjard, M. and Omre, H.: Spatio-temporal Inversion using the Selection Kalman Model, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8979, https://doi.org/10.5194/egusphere-egu2020-8979, 2020.

D2846 |

EGU2020-5772

p-norm regularization in variational data assimilation

Antoine Bernigaud, Serge Gratton, Flavia Lenti, Ehouarn Simon, and Oumaima Sohab

We introduce a new formulation of the 4DVAR objective function by using as a penalty term a p-norm with 1 < p < 2. So far, only the 2-norm, the 1-norm or a mixed of both have been considered as regularization term. This approach is motivated by the nature of the problems encountered in data assimilation, for which such a norm may be more suited to tackle the distribution of the variables. It also aims at making a compromise between the 2-norm that tends to oversmooth the solution or produce Gibbs oscillations, and the 1-norm that tends to "oversparcify" it, in addition to making the problem non-smooth.

The performance of the proposed technique are assessed for different p-values by twin experiments on a linear advection equation. The experiments are then conducted using two different true states in order to assess the performances of the p-norm regularized 4DVAR algorithm in sparse (rectangular function) and "almost" sparse cases (rectangular function with a smoother slope). In this setup, the background and the measurements noise covariance are known.

In order to minimize the 4DVAR objective function with a p-norm as a regularization term we use a gradient descent algorithm that requires the use of duality operators to work on a non-euclidean space. Indeed, Rn together with the p-norm (1 < p < 2) is a Banach space. Finally, to tune the regularization parameter appearing in the formulation of the objective function, we use the Morozov's discrepancy principle.

How to cite: Bernigaud, A., Gratton, S., Lenti, F., Simon, E., and Sohab, O.: p-norm regularization in variational data assimilation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5772, https://doi.org/10.5194/egusphere-egu2020-5772, 2020.

D2847 |

EGU2020-285

| Highlight

Learning missing part of physics-based models within a variational data assimilation scheme

Arthur Filoche, Julien Brajard, Anastase Charantonis, and Dominique Béréziat

The analogy between data assimilation and machine learning has already been shown and is still being investigated to address the problem of improving physics-based models. Even though both techniques learn from data, machine learning focuses on inferring model parameters while data assimilation concentrates on hidden system state estimation with the help of a dynamical model.

Also, neural networks and more precisely ResNet-like architectures can be seen as dynamical systems and numerical schemes, respectively. They are now considered state of the art in a vast amount of tasks involving spatio-temporal forecasting. But to train such networks, one needs dense and representative data which is rarely the case in earth sciences. At the same time, data assimilation offers a proper Bayesian framework allowing to learn from partial, noisy and indirect observations. Thus, each of this field can profit from the other by providing either a learnable class of dynamical models or dense data sets.

In this work, we benefit from powerful and flexible tools provided by the deep learning community based on automatic differentiation that are clearly suitable for variational data assimilation, avoiding explicit adjoint modelling. We use a hybrid model divided into 2 terms. The first term is a numerical scheme that comes from the discretisation of physics-based equations, the second is a convolutional neural network that represents the unresolved part of the dynamics. From the Data Assimilation point of view, our network can be seen as a particular parametrisation of the model error. We then jointly learn this parameterisation and estimate hidden system states within a variational data assimilation scheme. Indirectly, the issue of incorporating physical knowledge into machine learning models is also addressed.

We show that the hybrid model improves forecast skill compared to traditional data assimilation techniques. The generalisation of the method on different models and data will also be discussed.

How to cite: Filoche, A., Brajard, J., Charantonis, A., and Béréziat, D.: Learning missing part of physics-based models within a variational data assimilation scheme, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-285, https://doi.org/10.5194/egusphere-egu2020-285, 2020.

D2848 |

EGU2020-559

Effect of inaccurate specification of time-correlated model error in an Ensemble Smoother

Haonan Ren, Peter Jan Van Leeuwen, and Javier Amezcua

Data assimilation has been often performed under the perfect model assumption known as the strong-constraint setting. There is an increasing number of researches accounting for the model errors, the weak-constrain setting, but often with different degrees of approximation or simplification without knowing their impact on the data assimilation results. We investigate what effect inaccurate model errors, in particular, the an inaccurate time correlation, can have on data assimilation results, with a Kalman Smoother and the Ensemble Kalman Smoother.
We choose a linear auto-regressive model for the experiment. We assume the true state of the system has the correct and fixed correlation time-scale ω_r in the model errors, and the prior or the background generated by the model contains the model error with the fixed, guessed time-scale ω_g which differs from the correct one and is also used in the data assimilation process. There are 10 variables in the system and we separate the simulation period into multiple time-windows. And we use a fairly large ensemble size (up to 200 ensemble members) to improve the accuracy of the data assimilation results. In order to evaluate the performance of the EnKS with auto-correlated model errors, we calculate the ratio of root-mean-square error over the spread of all ensemble members.
The results with a single observation at the end of the simulation time-window show that, using an underestimated correlation time-scale leads to overestimated spread of the ensemble, and with an overestimated time-scale, the results show underestimation in the ensemble spread. However, with very dense observation frequency, observing every time-step for instance, the results are completely opposite to the results with a single observation. In order to understand the results, we derive the expression for the true posterior state covariance and the posterior covariance using the incorrect decorrelation time-scale. We do this for a Kalman Smoother to avoid the sampling uncertainties. The results are richer than expected and highly dependent on the observation frequency. From the analytical solution of the analysis, we find that the RMSE is a function of both ω_rand ω_g, and the spread or the variance only depends on ω_g. We also find that the analyzed variance is not always a monotonically increasing function of ω_g, and it also depends on the observation frequency. In general, the results show the effect of the correlated model error and the incorrect correlation time-scale on data assimilation result, which is also affected by the observation frequency.

How to cite: Ren, H., Van Leeuwen, P. J., and Amezcua, J.: Effect of inaccurate specification of time-correlated model error in an Ensemble Smoother, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-559, https://doi.org/10.5194/egusphere-egu2020-559, 2020.

D2849 |

EGU2020-10752

Data assimilation framework around the LPJ-GUESS model for the optimised simulation of CH4 emission from Northern wetlands

Jalisha Theanutti Kallingal, Marko Scholze, Janne Rinne, and Johan Lindstrom

Wetlands in the boreal zone are a significant source of atmospheric methane, and hence they have been intensively studied with mechanistic models for the assessment of methane dynamics. The arctic-enabled dynamic global vegetation model LPJ-GUESS is one of the models that allow quantification and understanding of the natural methane fluxes at various scales ranging from local to regional and global, but with several uncertainties. Complexity in the underlying environmental processes, warming driven alternative paths of meteorological phenomena and changes in hydrological and vegetation conditions are exigent for a calibrated and optimised LPJ-GUESS. In this study, we used the Markov chain Monte Carlo (using Metropolis-Hastings formula) algorithm to quantify the uncertainties of LPJ-GUESS. Application of this method allows greater search of the posterior distribution, leading to a more complete characterisation of the posterior distribution with reduced risk of sample impoverishment. We will present first results from an assimilation experiment optimising LPJ-GUESS model process parameters using the flux measurement data from 2005 to 2015 from the Siikaneva wetlands in southern Finland. We analyse the parameter efficiency of LPJ-GUESS by looking into the posterior parameter distributions, parameter correlations, and the interconnections of the processes they control. As a part of this work, knowledge about how the methane data can constrain the parameters and processes is derived.

How to cite: Theanutti Kallingal, J., Scholze, M., Rinne, J., and Lindstrom, J.: Data assimilation framework around the LPJ-GUESS model for the optimised simulation of CH4 emission from Northern wetlands, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-10752, https://doi.org/10.5194/egusphere-egu2020-10752, 2020.

D2850 |

EGU2020-5717

Reducing the memory requirements of parameter estimation using model order reduction

Martin Verlaan, Xiaohui Wang, and Hai Xiang Lin

Previous development of a parameter estimation scheme for a Global Tide and Surge Model (GTSM) showed that accurate estimation of the parameters is currently limited by the memory use of the analysis step and the computational demand. Because the estimation algorithm solver requires storage of the model output matching each observation for each parameter (or ensemble member), the requirement of memory storage gets out of control as the model simulation time increases, the model output and observation matrix become too large. The popular approach of localization does not work here because the tides propagate all over the globe in days, while parameter estimation requires weeks at least. Proper Orthogonal Decomposition (POD) is a useful technique to reduce the high dimension system with a smaller linear subspace. Singular values decomposition (SVD) is one of the methods to derive the POD modes, which is generally applied for space patterns. In this study, we focus on the application of POD in time patterns by using SVD to reduce the dimension in time patterns. As expected, the time patterns show a strong resemblance to the tidal constituents, but the same method is likely to work for a wider range of problems, which indicate that the memory requirements can be reduced dramatically by projection the model output and observations onto the time-POD patterns.

How to cite: Verlaan, M., Wang, X., and Lin, H. X.: Reducing the memory requirements of parameter estimation using model order reduction, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5717, https://doi.org/10.5194/egusphere-egu2020-5717, 2020.

Discussion

D2851 |

EGU2020-3128

| Highlight

The Data Assimilation Research Testbed: Nonlinear Algorithms and Novel Applications for Community Ensemble Data Assimilation

not presented

Jeffrey Anderson, Nancy Collins, Moha El Gharamti, Timothy Hoar, Kevin Raeder, Frederic Castruccio, Jingjing LIang, John Lin, James McCreight, Seongjin Noh, Brett Raczka, and Arezoo Arezoo Rfieeinasab

The Data Assimilation Research Testbed (DART) is a community facility for ensemble data assimilation developed and maintained by the National Center for Atmospheric Research (NCAR). DART provides ensemble data assimilation capabilities for NCAR community earth system models and many other prediction models. It is straightforward to add interfaces for new models and new observations to DART.

DART provides traditional ensemble data assimilation algorithms that implicitly assume Gaussianity and linearity. Traditional algorithms can still work when these assumptions are violated. However, it is possible to greatly improve results by extending ensemble algorithms to explicitly account for aspects of nonlinearity and non-Gaussianity. Two new algorithms have been added to DART. 1). Anamorphosis transforms variables to make the assimilation problem more linear and Gaussian before transforming posterior estimates back to the original model variables; 2). The marginal correction rank histogram filter (MCRHF) directly represents arbitrary non-Gaussian distributions. These methods are particularly valuable for data assimilation for bounded quantities like tracers or streamflow.

DART is being applied to a number of novel applications. Examples in the poster include 1). An eddy-resolving global ocean ensemble reanalysis with the POP ocean model and an ensemble optimal interpolation; 2). The WRF-Hydro/DART system now includes a multi-parametric ensemble, anamorphosis, and spatially-correlated noise for the forcing fields. 3). Results from the Carbon Monitoring System over Mountains using CLM5 to assimilate remotely-sensed observations (LAI, biomass, and SIF) for a field site in Colorado; 4). Assimilation of MODIS snow cover fraction and daily GRACE total water storage data and its impact on soil moisture using the DART/NOAH-MP system. 5). An ensemble atmospheric reanalysis using the CAM general circulation model.

How to cite: Anderson, J., Collins, N., El Gharamti, M., Hoar, T., Raeder, K., Castruccio, F., LIang, J., Lin, J., McCreight, J., Noh, S., Raczka, B., and Arezoo Rfieeinasab, A.: The Data Assimilation Research Testbed: Nonlinear Algorithms and Novel Applications for Community Ensemble Data Assimilation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3128, https://doi.org/10.5194/egusphere-egu2020-3128, 2020.

D2852 |

EGU2020-6121

Development of Ensemble-based Assimilation System for Aerosol Forecasting and Reanalysis at NOAA

Mariusz Pagowski, Cory Martin, Bo Huang, Daryl Kleist, and Shobha Kondragunta

D2853 |

EGU2020-6366

Optimal Error Analysis of MJO Prediction Associated with Uncertainties in Sea Surface Temperature over Indian Ocean

not presented

Xiaojing Li and Youmin Tang

In this study, the predictability of the Madden-Julian Oscillation (MJO) is investigated using the coupled Community Earth System Model (CESM) and the climatically relevant singular vector (CSV) method. The CSV method is an ensemble-based strategy to calculate the optimal growth of the initial error on the climate scale. We focus on the CSV analysis of MJO initialized at phase II, facilitating the investigation of the effect of the initial errors of the sea surface temperature (SST) in the Indian Ocean on it. Six different MJO events are chosen as the study cases to ensure the robustness of the results.

The results indicate that for all the study cases, the optimal perturbation structure of the SST, denoted by the leading mode of the singular vectors (SVs), is a meridional dipole-like pattern between the Bay of Bengal and the southern central Indian Ocean. The MJO signal tends to be more converged and significant in the Eastern Hemisphere while the model is perturbed by leading SV. The moist static energy analysis results indicate that the eastward propagation is much more evident in the terms of vertical advection and radiation flux than others. Therefore, the SV perturbation can strengthen and converge the MJO signal mostly by increasing the vertical advection of the moist static energy.

Further, the sensitivity studies indicate that the structure of the leading SV is not sensitive to the initial states, which suggests that we might not need to calculate SVs for each initial time in constructing the ensemble prediction, significantly saving computational time in the operational forecast systems.

How to cite: Li, X. and Tang, Y.: Optimal Error Analysis of MJO Prediction Associated with Uncertainties in Sea Surface Temperature over Indian Ocean, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-6366, https://doi.org/10.5194/egusphere-egu2020-6366, 2020.

D2854 |

EGU2020-15879

Towards non-linear inverse problem for atmospheric source term determination

Ondřej Tichý and Václav Šmídl

The basic linear inverse problem of atmospheric release can be formulated as y = M x + e , where y is the measurement vector which is typically in the form of gamma dose rates or concentrations, M is the source-receptor-sensitivity (SRS) matrix, x is the unknown source term to be estimated, and e is the model residue. The SRS matrix M is computed using an atmospheric transport model coupled with meteorological reanalyses. The inverse problem is typically ill-conditioned due to number of uncertainties, hence, the estimation of the source term is not straightforward and additional information, e.g. in the form of regularization or the prior source term, is often needed. Besides, traditional techniques rely on assumption that the SRS matrix is correct which is not realistic due to the number of approximations made during its computation. Therefore, we propose relaxation of the inverse model using introduction of the term Δ_M such as y = ( M+ Δ_M ) x + e leading to non-linear inverse problem formulation, where Δ_M can be, as an example, parametric perturbation of the SRS matrix M in the spatial or temporal domain. We estimate parameters of this perturbation together with solving the inverse problem using variational Bayes procedure. The method will be validated on synthetic dataset as well as demonstrated on real case scenario such as the controlled tracer experiment ETEX or episode of ruthenium-106 release over the Europe on the fall of 2017.

How to cite: Tichý, O. and Šmídl, V.: Towards non-linear inverse problem for atmospheric source term determination, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-15879, https://doi.org/10.5194/egusphere-egu2020-15879, 2020.

D2855 |

EGU2020-18771

LOTOS-EUROS 4DEnVar Data Assimilation using TROPOMI data for Colombia

not presented

Andres Yarce, Santiago Lopez, Diego Acosta, Olga Lucia Quintero, Nicolas Pinel, Arjo Segers, and Arnold Heemink

Chemical Transport Models (CTMs) simulate the emission, transformation, and transport of atmospheric chemical species, providing concentration and deposition estimates. While greatly sophisticated, these are still imperfect representations of reality. Data Assimilation (DA), a technique whereby observations are integrated into the simulations, helps alleviate the models' weaknesses, improving their simulation outputs and enabling parameter and state estimation. The variational DA method is an efficient approach for large-scale parameter and state estimation, but it is not straightforward to implement due to the need for a tangent linear matrix of the adjoint model forecast operator. To circumvent this difficulty, the ensemble-based 4DEnVar DA technique was used in this work.

Daily NO₂ observations from the TROPOspheric Monitoring Instrument (TROPOMI) at resolutions of 3x5 km were acquired for 2019 and assimilated into the LOTOS-EUROS CTM. Due to the scarcity of ground-based monitoring stations for atmospheric gases in Colombia, especially outside urban areas, satellite data provide an attractive alternative for DA.

The 4DEnVar DA was first evaluated via the Design of Experiments (DOE) methodology with the Lorenz96 model assimilating synthetic data. Different parameters were changed (ensemble number, spread, forcing factor and width of the assimilation time window) according to a complete 2⁴ factorial design followed by a Box Behnken design, providing an empirical model that guided the selection about how to modify those tuning parameters. The evaluation criteria used to test the 4DEnVar DA performance was the Root-Mean-Square (RMS) error between the analysis step and the synthetic data. Once this methodology was implemented, it was scaled up to the high-dimensional LOTOS-EUROS experiment.

The setup for the LOTOS-EUROS DA experiment was simplified in terms of domain area, chemical species of interest, dominant dynamics and considerations about how to perturb the parameters or initial conditions. A range of ensemble-members generated from perturbed parameters or input initial states were studied in conjunction with ensemble inflation experiments and Singular Value Decomposition projections, characterizing the degeneracy of the Gaussian assumption through the time propagation of the ensemble. Additionally, a complimentary analysis of this Gaussian ensemble degeneration was performed using the Shapiro-Wilk and Kolmogorov-Smirnov normality tests, which permitted a rational selection of the spin-up time of the model before the start of the assimilation window and the DA window size.

The assimilation of satellite NO₂ observations into LOTOS-EUROS made possible the estimation of parameters and states. Before the DA, the non-assimilated model overestimated the magnitude of the observation, this technique improves the simulation in the sense that the analysis result approaches the observations reducing the RMS. Through this methodology, it was possible to circumvent the absence of an adjoint model associated with the chemical components of this CTM. To our knowledge, this is the first application of ensemble variational DA on a CTM for the Northwestern South America region.

How to cite: Yarce, A., Lopez, S., Acosta, D., Quintero, O. L., Pinel, N., Segers, A., and Heemink, A.: LOTOS-EUROS 4DEnVar Data Assimilation using TROPOMI data for Colombia, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18771, https://doi.org/10.5194/egusphere-egu2020-18771, 2020.

D2856 |

EGU2020-18684

Quantifying uncertainty in the ESA Ocean Colour – Climate Change Initiative dataset for assimilation of total chlorophyll and phytoplankton functional types

not presented

Alison Fowler, Jozef Skákala, and Stefano Ciavatta

Monitoring biogeochemistry in shelf seas is of great significance for the economy, ecosystems understanding and climate studies. Data assimilation can aid the realism of marine biogeochemistry models by incorporating information from observations. An important source of information about phytoplankton groups and total chlorophyll is available from the ESA OC-CCI (ocean colour - climate change initiative) dataset.

For any assimilation system to be successful it is important to accurately represent all sources of data uncertainty. For the ocean colour product, the propagation of errors throughout the ocean colour algorithm makes the characterisation of the uncertainty challenging. However, the problem can be simplified by assuming that the uncertainty is a function of optical water type (OWT), which characterises the water column of each observed pixel in terms of their reflectance properties.

Within this work we apply the well-known Desroziers et al. (2005) consistency diagnostics to the Met Office’s NEMOVAR 3D-VAR DA system used to create daily biogeochemistry forecasts on the North-West European Shelf. The derived estimates of monthly ocean colour error covariances stratified by OWT are compared to previously derived estimates of the root mean square errors and biases using in-situ data match ups (Brewin et al. 2017). It is found that the agreement between the two estimates of the error variances have a strong seasonal and OWT dependence. The error correlations (which can only be estimated with the Desroziers’ method) in some instances are found to be significant out to a few 100km particularly for more turbid waters during the spring bloom. The reliability and limitation of these two estimates of the ocean colour uncertainty are discussed along with the implications for the future assimilation of ocean colour products and for ecosystem and climate studies.

How to cite: Fowler, A., Skákala, J., and Ciavatta, S.: Quantifying uncertainty in the ESA Ocean Colour – Climate Change Initiative dataset for assimilation of total chlorophyll and phytoplankton functional types, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18684, https://doi.org/10.5194/egusphere-egu2020-18684, 2020.

D2857 |

EGU2020-20271

Accounting for model error in atmospheric forecasts

William Crawford, Sergey Frolov, Justin McLay, Carolyn Reynolds, Craig Bishop, Benjamin Ruston, and Neil Barton

D2858 |

EGU2020-7163

Combined state-parameter estimation with the LETKF for convective-scale weather forecasting

Yvonne Ruckstuhl and Tijana Janjic

We investigate the feasibility of addressing model error by perturbing and estimating uncertain static model parameters using the localized ensemble transform Kalman filter. In particular we use the augmented state approach, where parameters are updated by observations via their correlation with observed state variables. This online approach offers a flexible, yet consistent way to better fit model variables affected by the chosen parameters to observations, while ensuring feasible model states. We show in a nearly-operational convection-permitting configuration that the prediction of clouds and precipitation with the COSMO-DE model is improved if the two dimensional roughness length parameter is estimated with the augmented state approach. Here, the targeted model error is the roughness length itself and the surface fluxes, which influence the initiation of convection. At analysis time, Gaussian noise with a specified correlation matrix is added to the roughness length to regulate the parameter spread. In the northern part of the COSMO-DE domain, where the terrain is mostly flat and assimilated surface wind measurements are dense, estimating the roughness length led to improved forecasts of up to six hours of clouds and precipitation. In the southern part of the domain, the parameter estimation was detrimental unless the correlation length scale of the Gaussian noise that is added to the roughness length is increased. The impact of the parameter estimation was found to be larger when synoptic forcing is weak and the model output is more sensitive to the roughness length.

How to cite: Ruckstuhl, Y. and Janjic, T.: Combined state-parameter estimation with the LETKF for convective-scale weather forecasting, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-7163, https://doi.org/10.5194/egusphere-egu2020-7163, 2020.

Discussion

D2859 |

EGU2020-15517

Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning

Marc Bocquet, Julien Brajard, Alberto Carrassi, and Laurent Bertino

The reconstruction from observations of the dynamics of high-dimensional chaotic models such as geophysical fluids is hampered by (i) the inevitably partial and noisy observations that can realistically be obtained, (ii) the need and difficulty to learn from long time series of data, and (iii) the unstable nature of the dynamics. To achieve such inference from the observations over long time series, it has recently been suggested to combine data assimilation and machine learning in several ways. We first rigorously show how to unify these approaches from a Bayesian perspective, yielding a non-trivial loss function.

Existing techniques to optimize the loss function (or simplified variants thereof) are re-interpreted here as coordinate descent schemes. The expectation-maximization (EM) method is used to estimate jointly the most likely model and model error statistics. The main algorithm alternates two steps: first, a posterior ensemble is derived using a traditional data assimilation step using an ensemble Kalman smoother (EnKS); second, both the surrogate model and the model error are updated using machine learning tools, a quasi-Newton optimizer, and analytical formula. In our case, the spatially extended surrogate model is formalized as a neural network with convolutional layers leveraging on the locality of the dynamics.

This scheme has been successfully tested on two low-order chaotic models with distinct identifiability, namely the 40-variable and the two-scale Lorenz models. Additionally, an approximate algorithm is tested to mitigate the numerical cost, yielding similar performances. Using indicators that probe short-term and asymptotic properties of the surrogate model, we investigate the sensitivity of the inference to the length of the training window, to the observation error magnitude, to the density of the monitoring network, and to the lag of the EnKS. In these iterative schemes, model error statistics are automatically adjusted to the improvement of the surrogate model dynamics. The outcome of the minimization is not only a deterministic surrogate model but also its associated stochastic correction, representative of the uncertainty attached to the deterministic part and which accounts for residual model errors.

How to cite: Bocquet, M., Brajard, J., Carrassi, A., and Bertino, L.: Bayesian inference of dynamics from partial and noisy observations using data assimilation and machine learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-15517, https://doi.org/10.5194/egusphere-egu2020-15517, 2020.

D2860 |

EGU2020-2686

A new method for computing horizontally anisotropic background error covariance matrices for data assimilation in ocean models.

not presented

Jose M Gonzalez-Ondina, Lewis Sampson, and Georgy Shapiro

Current operational ocean modelling systems often use variational data assimilation (DA) to improve the skill of the ocean predictions by combining the numerical model with observational data. Many modern methods are derivatives of objective (optimal) interpolation techniques developed by L. S. Gandin in the 1950s, which requires computation of the background error covariance matrix (BECM), and much research has been devoted into overcoming the difficulties surrounding its calculation and improving its accuracy. In practice, due to time and memory constraints, the BECM is never fully computed. Instead, a simplified model is used, where the correlation at each point is modelled using a simple function while the variance and length scales are computed using error estimation methods such as the Hollingsworth-Lonnberg or the NMC (National Meteorological Centre). Usually, the correlation is assumed to be horizontally isotropic, or to have a predefined anisotropy based on latitude. However, observations indicate that horizontal diffusion is sometimes anisotropic, hence this has to be propagated into BECM. It is suggested that including these anisotropies would improve the accuracy of the model predictions.

We present a new method to compute the BECM which allows to extract horizontal anisotropic components from observational data. Our method, unlike current techniques, is fundamentally multidimensional and can be applied to 2D or 3D sets of un-binned data. It also works better than other methods when observations are sparse, so there is no penalty when trying to extract the additional anisotropic components from the data.

Data Assimilation tools like NEMOVar use a matrix decomposition technique for the BECM in order to minimise the cost function. Our method is well suited to work with this type of decomposition, producing the different components of the decomposition which can be readily used by NEMOVar.

We have been able to show the spatial stability of our method to quantify anisotropy in areas of sparse observations. While also demonstrating the importance of including anisotropic representation within the background error. Using the coastal regions of the Arabian Sea, it is possible to analyse where improvements to diffusion can be included. Further extensions of this method could lead to a fully anisotropic diffusion operator for the calculation of BECM in NEMOVar. However further testing and optimization are needed to correctly implement this into operational assimilation systems.

How to cite: Gonzalez-Ondina, J. M., Sampson, L., and Shapiro, G.: A new method for computing horizontally anisotropic background error covariance matrices for data assimilation in ocean models. , EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-2686, https://doi.org/10.5194/egusphere-egu2020-2686, 2020.

D2861 |

EGU2020-6241

On the modification of operational oceanography forecasting system for South China Sea in National Marine Environmental Forecasting Center of China

Ziqing Zu, Xueming Zhu, and Hui Wang

D2862 |

EGU2020-6596

A Flow-dependent Targeted Observation Method

Youmin Tang and Yaling Wu