G1.3


This session aims to showcase novel applications of data science and machine learning methods in geodesy.

In recent years, the amount of data from geodetic observation techniques has increased dramatically. Innovative approaches are required to efficiently handle and harness the vast amount of geodetic data available nowadays for scientific purposes. In particular, Global Navigation Satellite System (GNSS) and Interferometric Synthetic Aperture Radar (InSAR) are facing challenges, but also opportunities, related to the expansive data collection (“big data”). Similarly, numerical weather models and other geophysical models important for geodesy come with ever growing resolutions and dimensions. Strategies and methodologies from the fields of data science and machine learning have shown great potential not only in this context, but also when applied to more limited data sets to solve complex non-linear problems in geodesy.

We invite contributions related to various aspects of applying methods from data science and machine learning (including both shallow and deep learning techniques) to geodetic problems and data sets. We welcome investigations related to (but not limited to): more efficient and automated processing of geodetic data, pattern and anomaly detection in geodetic time series, images or higher-dimensional data sets, improved predictions of geodetic parameters into the future, combination and extraction of information from multiple inhomogeneous data sets (multi-temporal, multi-sensor, multi-modal fusion), feature selection, super-sampling of geodetic data, and improvements of large-scale simulations. Especially encouraged are contributions that discuss the uncertainty quantification, interpretability and explainability of results from machine learning algorithms, as well as the integration of physical modeling into data-driven frameworks.

Convener: Benedikt Soja | Co-conveners: Kyriakos Balidakis (ECS), Maria Kaselimi (ECS), Randa Natras (ECS), Mattia Crespi
Presentations: Fri, 27 May, 08:30–10:00 (CEST) | Room -2.16


Presentations: Fri, 27 May | Room -2.16

Chairpersons: Benedikt Soja, Randa Natras, Kyriakos Balidakis
08:30–08:33
Machine learning theory and methodology
08:33–08:39 | EGU22-7272 | ECS | Presentation form not yet defined
Alireza Amiri-Simkooei

Deep learning (DL), a specific family of machine learning algorithms successfully applied in several application areas, is a relatively new methodology that is receiving much attention. DL has been widely applied to problems such as email filtering, image and speech recognition, and language processing, but is only beginning to have an impact on geoscience problems. On the other hand, the standard least-squares (SLS) theory of linear models has been widely used in many earth science areas. This theory connects the explanatory variables to the predicted variables, called observations, through a linear(ized) model in which the unknowns of this relation are estimated using the least-squares method. The design matrix, containing the explanatory variables of a set of objects, is usually linearly related to the predicted variables. There are, however, applications in which the predicted variables are unknown (nonlinear) functions of the explanatory variables, and hence such a design matrix is not known a priori. We present a methodology that formulates the deep learning problem in the least-squares framework of linear models. As a supervised method, a network is trained to construct an appropriate design matrix, an essential element of the linear model. The entries of this design matrix, as nonlinear functions of the explanatory variables, are trained in an iterative manner using descent optimization methods. Such a design matrix makes it possible to apply the existing knowledge of least-squares theory to DL applications. A few examples are presented to demonstrate the theory.

How to cite: Amiri-Simkooei, A.: Least-squares-based formulation of deep learning: Theory and applications to geoscience data analytics, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7272, https://doi.org/10.5194/egusphere-egu22-7272, 2022.

08:39–08:45 | EGU22-4039 | ECS | On-site presentation
Nhung Le, Benjamin Männel, Randa Natras, Pierre Sakic, Zhiguo Deng, and Harald Schuh

Abstract:

In Machine Learning (ML), one of the crucial tasks is understanding data characteristics in order to extract exactly the relevant information, while noise contained in the data can cause misleading estimations and decrease the generalizability of ML-based prediction models. So far, only a few previous studies have applied noise filtering techniques when building forecast models. Hence, their efficiency in ML-based forecasts has not yet been comprehensively demonstrated. Therefore, we aim to determine optimal noise filters to enhance the forecast performance of Total Electron Content (TEC), crustal motion, and Earth's polar motion. We investigate six noise filtering algorithms (Moving Mean, Moving Median, Gaussian, Lowess, Loess, and Savitzky–Golay) on forecast models to select the best-suited filters. Five ML algorithms are applied to train the forecast models: Support Vector Machine (SVM), Regression Trees, Linear Regression (LR), Ensembles of Trees, and Gaussian Process Regression (GPR). The findings show that the Savitzky–Golay algorithm is the most effective for the ML-based forecast models, followed by the Loess and Gaussian filters, while Moving Mean is the least sensitive. Noise filters are more sensitive for forecast models based on SVM and LR than for Ensembles of Trees and GPR. Applying the Savitzky–Golay filter to the optimal SVM and LR models can enhance the prediction accuracy by up to 14.0 %, 16.1 % and 89.5 % for forecasting TEC, crustal motion, and Earth's polar motion, respectively, while the improvements for Ensembles and GPR range only from approximately 3.0 % to 27.0 %. Overall, using noise filters is a practical solution to improve prediction performance. They can also be used to smooth time series with variable characteristics and to generalize high-rate data.

Keywords:

Machine Learning, Noise filters, Savitzky–Golay filter, TEC forecast, Crustal motion, Earth’s polar motion.

How to cite: Le, N., Männel, B., Natras, R., Sakic, P., Deng, Z., and Schuh, H.: Apply noise filters for better forecast performance in Machine Learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4039, https://doi.org/10.5194/egusphere-egu22-4039, 2022.
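The Savitzky–Golay filter highlighted in the abstract fits a low-order polynomial over a sliding window, which smooths noise while preserving signal shape. A minimal sketch using SciPy's `savgol_filter`; the sinusoid, noise level and window settings are invented stand-ins for a TEC-like series, not the authors' data.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(1)

# Stand-in for a periodic geophysical series corrupted by observation noise.
t = np.linspace(0, 10, 500)
signal = np.sin(2 * np.pi * t / 5)
noisy = signal + 0.3 * rng.standard_normal(t.size)

# Savitzky-Golay: local cubic fit over a 31-sample window.
smoothed = savgol_filter(noisy, window_length=31, polyorder=3)

rms_before = np.sqrt(np.mean((noisy - signal) ** 2))
rms_after = np.sqrt(np.mean((smoothed - signal) ** 2))
```

In a forecasting pipeline the smoothed series, rather than the raw one, would then be fed to the ML model for training.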

08:45–08:51 | EGU22-1101 | ECS | On-site presentation
Mostafa Kiani Shahvandi, Matthias Schartner, and Benedikt Soja

Nowadays, many applications such as Global Navigation Satellite Systems (GNSS) or spacecraft tracking require a rapid determination, or even predictions, of the Earth Orientation Parameters (EOP). However, due to the measurement techniques utilized to estimate EOP, the latency can be considerably longer than required, which especially hinders real-time applications, resulting in a need for accurate EOP prediction methods.

With the resurgence of machine learning in the last decade, time series prediction is increasingly studied in this context. We propose a learning algorithm for the prediction of polar motion components (xp, yp). The algorithm is based on the concept of Ordinary Differential Equation (ODE) fitting. Within this investigation, a general formula for ODE fitting based on multivariate time series is proposed, with special focus on second order ODEs. The mathematical relations are derived and presented in both linear and non-linear forms, particularly with LSTM and Elman neural networks. In addition, a sensitivity analysis framework is proposed for the linear case, which is used for the determination of the importance of features. 

We compared the prediction performance of our method with those from three different studies. First, the conditions of the first Earth Orientation Prediction Comparison Campaign (EOPPCC) are followed. In this case, the ultra-short-term predictions (up to 10 days) can be improved on average by 62.5% and 45.6% for xp and yp, respectively, compared to the best-performing EOPPCC method. Second, the prediction performance in long-term prediction (up to one year) is compared against Multichannel Singular Spectrum Analysis (MSSA). In this case, the prediction performance is improved on average for xp and yp by 40.9% and 66.4%, respectively. Finally, comparisons against Copula-based methods for long-term prediction are conducted (average improvement of 32.3% for xp and 57.8% for yp).

The advantages of this method include (1) exploitation of physical information via Effective Angular Momentum (EAM) functions and by using the concept of ODE fitting, which often corresponds to the laws governing physical phenomena; (2) presence of sensitivity analysis frameworks; and (3) high predictive performance.

How to cite: Kiani Shahvandi, M., Schartner, M., and Soja, B.: Differential Learning: A method for polar motion time series prediction, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1101, https://doi.org/10.5194/egusphere-egu22-1101, 2022.
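The linear special case of the second-order ODE fitting described above can be illustrated with a toy damped oscillator: derivatives are estimated by finite differences and the ODE coefficients recovered by least squares (the talk replaces this linear map with LSTM and Elman networks). The series and parameters below are invented, not actual polar motion data.

```python
import numpy as np

# Toy series obeying x'' + 2*zeta*omega*x' + omega^2 * x = 0 (damped oscillator).
dt = 0.01
t = np.arange(0, 20, dt)
omega, zeta = 2.0, 0.05
x = np.exp(-zeta * omega * t) * np.cos(omega * np.sqrt(1 - zeta**2) * t)

# Finite-difference estimates of the first and second derivatives.
x1 = np.gradient(x, dt)
x2 = np.gradient(x1, dt)

# Linear ODE fit: x'' = [x, x'] @ theta; theta should recover
# [-omega^2, -2*zeta*omega] = [-4.0, -0.2].
A = np.column_stack([x, x1])
theta, *_ = np.linalg.lstsq(A, x2, rcond=None)
```

The nonlinear version replaces the matrix `[x, x1]` map with a recurrent network, while the sensitivity analysis mentioned in the abstract corresponds to inspecting the fitted coefficients in this linear case.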

Machine learning for GNSS applications
08:51–08:57 | EGU22-5003 | ECS | Virtual presentation
Caroline Arnold, Daixin Zhao, Tianqi Xiao, Lichao Mou, and Milad Asgarimehr

The CyGNSS (Cyclone Global Navigation Satellite System) satellite system measures GNSS signals reflected off the Earth’s surface. A global ocean wind speed dataset is derived from it, which fills a gap in Earth observation data and can improve cyclone forecasting. We proposed CyGNSSnet (1), a deep learning model for predicting wind speed from CyGNSS observables, and found a performance improvement of 29% compared to the current operational model. However, the prediction of extreme winds remained challenging: for wind speeds exceeding 12 m/s, the operational model outperformed CyGNSSnet.

Here, we explore methods to improve the performance of CyGNSSnet at high wind speeds. We introduce a hierarchical model that combines specialized CyGNSSnet instances trained in different wind speed regimes with a classifier to select an instance. In addition, we explore strategies to improve the wind speed predictions by emphasizing extreme values in training, and we discuss the potentials and shortcomings of the approaches.

(1) Asgarimehr, M., Arnold, C., Weigel, T., Ruf, C. & Wickert, J. GNSS reflectometry global ocean wind speed using deep learning: Development and assessment of CyGNSSnet. Remote Sensing of Environment 269, 112801 (2022).

How to cite: Arnold, C., Zhao, D., Xiao, T., Mou, L., and Asgarimehr, M.: Deep learning for extreme wind speed prediction with CyGNSSnet, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5003, https://doi.org/10.5194/egusphere-egu22-5003, 2022.

08:57–09:03 | EGU22-1503 | ECS | On-site presentation
Yuanxin Pan, Gregor Möller, Roland Hohensinn, and Benedikt Soja

Multipath is the main unmodeled error source hindering high-precision GNSS (Global Navigation Satellite System) data processing. Classical multipath mitigation methods, such as sidereal filtering (SF) and multipath hemispherical maps (MHM), have certain disadvantages: they are either too complicated to implement or not effective enough for multipath mitigation. In this study, we demonstrate that machine learning (ML) based models, such as random forests, can overcome these drawbacks by spatial interpolation over the sky map and thus mitigate multipath effectively. 30 days of 1 Hz geodetic-grade GPS data as well as 6 days of low-cost data are used to train and test the ML models. Based on a series of test cases, the best number of days for model training and the validity period of the models are discussed in this contribution. For quantification, the multipath reduction rate and kinematic positioning precision are computed using different ML models and compared to those derived from SF and MHM. The statistical results show that the XGBoost ML model achieves higher multipath reduction rates than SF and MHM, especially for pseudorange measurements, which is important for low-cost devices. It reduces the multipath by 48% and 55% for pseudorange and carrier phase measurements, respectively, and outperforms SF (40% and 52%) and MHM (37% and 49%). The positioning precision when using the different multipath models is similar, with differences of less than 1 mm. We conclude that the ML-based multipath mitigation method is effective and easy to use, and can be applied in real-time scenarios.

How to cite: Pan, Y., Möller, G., Hohensinn, R., and Soja, B.: Machine learning based multipath mitigation for high-precision GNSS data processing, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1503, https://doi.org/10.5194/egusphere-egu22-1503, 2022.
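The sky-map interpolation idea can be sketched with a random forest (a stand-in for the XGBoost model used in the talk): azimuth and elevation are the features, and the learned surface is subtracted from held-out observations. The synthetic "multipath" function and noise level below are invented, not real GNSS residuals.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

# Synthetic sky map: multipath error as a smooth function of azimuth and
# elevation plus observation noise (purely illustrative).
n = 5000
az = rng.uniform(0, 360, n)
el = rng.uniform(5, 90, n)
mp = np.sin(np.radians(az)) * np.cos(np.radians(2 * el)) + 0.1 * rng.standard_normal(n)

X = np.column_stack([az, el])
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:4000], mp[:4000])

# Correct held-out observations by subtracting the predicted multipath.
pred = model.predict(X[4000:])
rms_before = np.sqrt(np.mean(mp[4000:] ** 2))
rms_after = np.sqrt(np.mean((mp[4000:] - pred) ** 2))
reduction = 1 - rms_after / rms_before   # fraction of multipath removed
```

Because the model only needs azimuth and elevation at prediction time, the correction can be applied epoch by epoch in a real-time scenario.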

09:03–09:09 | EGU22-1834 | ECS | On-site presentation
Junyang Gou, Christine Rösch, Endrit Shehaj, Kangkang Chen, Mostafa Kiani Shahvandi, Benedikt Soja, and Markus Rothacher

Precise orbit determination is vital for the increasingly vast number of space objects around the Earth. Moreover, accurate orbit prediction of GNSS satellites is essential for many real-time geodetic applications, including real-time navigation. The typical way to obtain accurate orbit predictions is using physics-based orbit propagators. However, the prediction errors accumulate with time because of insufficient modeling of the changing perturbing forces. Motivated by the rapid expansion of computing power and the considerable data volume of satellite orbits available in recent years, we can apply machine learning (ML) and deep learning (DL) algorithms to assess if they can be used to further reduce orbit errors.

In this study, we focus on the orbit prediction of GNSS constellations. We investigate the potential of using different ML and DL algorithms for improving the accuracy of the ultra-rapid products from the IGS. As ground truth we consider the IGS final products; the differences between the ultra-rapid and final products are computed and serve as targets for the ML/DL methods. In this context, we combine the advantages of physics-based and data-driven ML/DL methods. Since the major errors of GNSS orbits are expected to be caused by deficiencies of solar radiation pressure models, we consider different related parameters as additional features to implicitly model the solar impact, such as the C0,0 terms of global ionosphere maps. In order to accurately model the effect of solar radiation pressure on the radial, along-track and cross-track components of the satellite orbit system, the geometric relation between the Sun, the satellite and the Earth is also considered. Furthermore, the performances of different ML/DL algorithms are compared and discussed. Due to the temporal characteristics of the problem, certain sequential modeling algorithms, such as Long Short-Term Memory and Gated Recurrent Unit networks, show superiority. Our approach shows promising results, with average improvements of over 40% in 3D RMS within the 24-hour prediction interval of the ultra-rapid products.

How to cite: Gou, J., Rösch, C., Shehaj, E., Chen, K., Kiani Shahvandi, M., Soja, B., and Rothacher, M.: Improving the Accuracy of GNSS Orbit Predictions using Machine Learning Approaches, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-1834, https://doi.org/10.5194/egusphere-egu22-1834, 2022.

09:09–09:15 | EGU22-12032 | ECS | On-site presentation
Giuseppe Costantino, Sophie Giffard-Roisin, Mauro Dalla Mura, David Marsan, Mathilde Radiguet, and Anne Socquet

Detecting small Slow Slip Events (SSEs) is still an open challenge. The difficulty in revealing low-magnitude events is related to their detection in the geodetic data, which must be improved either by employing more powerful equipment or by developing novel methods for the systematic discovery of small events; this can be crucial for the precise characterization of the slip spectrum. An improved ability to detect small SSEs and the associated seismic response can play a decisive role in understanding the mechanics of active faults, notably in subduction zones where tremors cannot serve as a proxy for the slow slip or where Episodic Tremor and Slip (ETS) is not regularly observed, making it necessary to provide new observations and methods to detect potential bursts of slow slip.

Here we explore three deep learning based strategies applied to GNSS data to characterize earthquakes and SSEs. Unlike seismic data, geodetic observations are crucial for dealing with SSEs, since they contain the required spatiotemporal information. Yet, since the low number of available labelled events (earthquakes or SSEs) producing significant displacement at GNSS stations does not allow adequate training of deep learning models, we adopt synthetic geodetic data (Okada, 1985), obtained by generating events with uniformly distributed parameters. Thus, the model is not biased towards the most numerous parameters, with a possibly stronger predictive power. The approach inspired by van den Ende and Ampuero (2020) was used for the characterization (i.e., estimation of epicentral location and magnitude); it associates geodetic time series with the location information of the GNSS stations. Rearranging the geodetic displacement from GNSS time series into images, however, allows Convolutional Neural Networks (CNNs) to better account for the spatial consistency of the data, leading to more precise results. Furthermore, Transformers have also been tested with image time series of ground deformation. To assess the reliability of the tested methods, a magnitude threshold on the synthetic test set has been estimated, which depends on the depth and the hypocenter location of the event, showing a trade-off between the signal-to-noise ratio (SNR) and the relative position of the test events with respect to the GNSS network, revealing physical consistency. The results are also spatially consistent, as the location and magnitude errors tend to increase as the actual epicenters move offshore, with the location error showing a strong inverse proportionality to the magnitude.
The use of time series of deformation with Transformer networks leads to the best results and may allow us to better handle the noise complexity and to account for a spatio-temporal analysis of the ground deformation linked to SSE triggering. Nevertheless, the image-based model outperforms the other two on real data, showing that the synthetic data still do not fully match the real data, which opens up several perspectives. A more complex synthetic noise can be produced by allowing for synthetic data gaps and outliers (e.g., common modes), or machine learning based denoising strategies can be envisioned to pre-process the data to improve the SNR.

How to cite: Costantino, G., Giffard-Roisin, S., Dalla Mura, M., Marsan, D., Radiguet, M., and Socquet, A.: Towards the characterization of Slow Slip deformation by means of deep learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-12032, https://doi.org/10.5194/egusphere-egu22-12032, 2022.

09:15–09:21 | EGU22-9105 | ECS | On-site presentation
Pia Ruttner, Roland Hohensinn, Stefano D'Aronco, Jan Dirk Wegner, and Benedikt Soja

Global Navigation Satellite System (GNSS) long-term residual height time series exhibit signals related to environmental influences. These can partly be explained through environmental surface loads, which are described with physical models. In this work, a model is computed to connect the GNSS residuals with raw meteorological parameters. A Temporal Convolutional Network (TCN) is trained on 206 GNSS stations in central Europe and applied to 68 test stations in the same area. The resulting Root Mean Square (RMS) error reduction is on average 0.8% lower for the TCN-modeled time series compared to using physical models for the reduction. In a further experiment, the TCN is trained on the GNSS time series after reducing those by the surface loading models. The aim is a further increase of the RMS reduction, which is achieved with 2.7% on average, resulting in an overall mean reduction of 28.6%. The results suggest that, with meteorological features as input data, TCN-modeled reductions are able to compete with reductions derived from physical models. Trained on the residuals reduced by environmental loading models, the TCN is able to slightly increase the overall reduction of variations in the GNSS station position time series.

How to cite: Ruttner, P., Hohensinn, R., D'Aronco, S., Wegner, J. D., and Soja, B.: Modeling of Residual GNSS Station Motions through Meteorological Data in a Machine Learning Approach, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9105, https://doi.org/10.5194/egusphere-egu22-9105, 2022.
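The core building block of a TCN is a causal, dilated 1-D convolution: the output at time t depends only on past samples, and dilation widens the receptive field cheaply. A minimal NumPy sketch of one such layer; the function name and the impulse-response demo are hypothetical, not the authors' network.

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    # Causal: output at time t uses only x[t], x[t-d], x[t-2d], ...
    k = len(w)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])   # left-pad with zeros
    return sum(w[i] * xp[pad - i * dilation : pad - i * dilation + len(x)]
               for i in range(k))

# The impulse response shows the receptive-field structure of one layer.
x = np.zeros(8)
x[0] = 1.0
y = causal_dilated_conv(x, np.array([1.0, 0.5]), dilation=2)
# taps appear at lags 0 and 2: y = [1, 0, 0.5, 0, 0, 0, 0, 0]
```

A full TCN stacks such layers with exponentially increasing dilation (1, 2, 4, ...), so a few layers can relate a station's residual to weeks of past meteorological input.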

09:21–09:27 | EGU22-405 | ECS | Virtual presentation
Yuke Xie, James Foster, Michela Ravanelli, and Mattia Crespi

Tsunami detection and forecasting require observations from open-ocean sensors. It is well known that tsunamis can generate internal gravity waves that propagate through the ionosphere from the earthquake center along with the tsunami wave. These disturbances can be detected by Global Navigation Satellite System (GNSS) receivers. The VARION (Variometric Approach for Real-Time Ionosphere Observation) algorithm has been successfully applied to detecting traveling ionospheric disturbances (TIDs) in several real-time scenarios, and it has also been successfully demonstrated that this algorithm is suitable for moving systems such as ship-based GNSS receivers. We present analyses of GNSS data collected from ships and examine the potential of a ship-based GNSS network for the ionospheric detection of tsunamis.

In this project, we focused on the detection of tsunami signals from the TIDs using deep learning methods. Benefiting from the large amount of data from widely distributed GNSS permanent stations, we developed a prototype convolutional neural network for tsunami detection, achieving highly accurate prediction scores on the validation and test data. We used the observations coming from our 10-ship pilot network real-time GNSS system from the Pacific ocean to detect the TIDs related to the 2015 Illapel, Chile earthquake and tsunami. Using our algorithm in a post-processing mode we found that our ships successfully detected the ionospheric tsunami signal even though there was no detectable sea-surface height perturbation for the ship. Comparing the performance using our deep learning method with other anomaly detection approaches in a real-time scenario, we found that our approach works very efficiently with the pre-trained model. The results of our study, although preliminary, are very encouraging and we conclude that ships can be cost-effective real-time tsunami early-warning sensors. Given that there are thousands of existing ships in the Pacific Ocean, this is a promising opportunity to improve hazard mitigation.

How to cite: Xie, Y., Foster, J., Ravanelli, M., and Crespi, M.: Ship-based GNSS ionospheric observations for the detection of tsunamis with deep learning, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-405, https://doi.org/10.5194/egusphere-egu22-405, 2022.

Machine learning for atmospheric modeling and forecasting
09:27–09:33 | EGU22-5408 | ECS | On-site presentation
Randa Natras, Benedikt Soja, Michael Schmidt, Marie Dominique, and Ayşe Türkmen

Space weather can cause strong sudden disturbances in the Earth’s ionosphere that can degrade the performance and reliability of Global Navigation Satellite System (GNSS) operations. To minimize such degradations, ionospheric effects need to be corrected precisely and in a timely manner by providing information on the spatially and temporally variable Total Electron Content (TEC). To obtain such corrections and early warning information on space weather events, we need to model the nonlinear space weather processes with a focus on their impact on the ionosphere. Machine Learning (ML) models can learn nonlinear relationships from data to address complex phenomena such as space weather. To interpret ML model results, it is crucial to know their quality and reliability. Quantifying the uncertainty of the ML results is an important step toward developing a “trustworthy” model, providing reliable results, and improving the model’s explainability.

This study presents a novel ML model to forecast the vertical TEC (VTEC) utilizing state-of-the-art supervised learning techniques and robustly assessing the uncertainty of the achieved results. The data are systematically analyzed, selected and pre-processed for optimal model learning, especially during space weather events. Results from our previous study (Natras and Schmidt, 2021) were improved in terms of data, ensemble modeling, and uncertainty quantification. The input data are expanded with additional parameters of the solar wind and the interplanetary magnetic field from OmniWeb and spectral irradiance measurements from the solar instrument LYRA onboard the spacecraft PROBA2 (Dominique et al., 2013). In addition, new input features have been derived, such as daily differences, time derivatives and moving averages. We applied ensemble modeling to combine diverse ML models based on different learning algorithms with different training data sets. The ensemble model enhances the performance of the base learners and quantifies the uncertainty of the results. This approach shows potential for forecasting VTEC in different ionospheric regions during quiet and storm periods, while providing the uncertainties of the forecasting results.

Keywords: Machine Learning, Space Weather, Ionosphere, Vertical Total Electron Content (VTEC), Forecasting, Uncertainty Quantification

 

References:

Dominique, M., Hochedez, JF., Schmutz, W. et al. (2013): The LYRA Instrument Onboard PROBA2: Description and In-Flight Performance. Sol Phys 286, 21-42 https://doi.org/10.1007/s11207-013-0252-5

Natras, R., Schmidt, M. (2021): Ionospheric VTEC Forecasting using Machine Learning, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-8907, https://doi.org/10.5194/egusphere-egu21-8907

 

How to cite: Natras, R., Soja, B., Schmidt, M., Dominique, M., and Türkmen, A.: Machine Learning Approach for Forecasting Space Weather Effects in the Ionosphere with Uncertainty Quantification, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5408, https://doi.org/10.5194/egusphere-egu22-5408, 2022.
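The ensemble idea — combining diverse base learners and reading the spread of their predictions as an uncertainty measure — can be sketched with scikit-learn. The target, features and member models below are invented stand-ins for the VTEC setup, not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)

# Toy VTEC-like target driven by two proxy features (all values invented).
n = 2000
X = rng.uniform(-1, 1, (n, 2))
y = 10 + 5 * X[:, 0] + 3 * np.sin(3 * X[:, 1]) + 0.5 * rng.standard_normal(n)
Xtr, ytr = X[:1500], y[:1500]
Xte, yte = X[1500:], y[1500:]

# Ensemble of diverse base learners: the disagreement between members
# provides a simple per-sample uncertainty measure for the combined forecast.
members = [
    Ridge(alpha=1.0),
    RandomForestRegressor(n_estimators=50, random_state=0),
    GradientBoostingRegressor(random_state=0),
]
preds = np.stack([m.fit(Xtr, ytr).predict(Xte) for m in members])

ensemble_mean = preds.mean(axis=0)   # combined forecast
ensemble_std = preds.std(axis=0)     # uncertainty proxy from member spread
rmse = np.sqrt(np.mean((ensemble_mean - yte) ** 2))
```

Diversity matters here: members trained with different algorithms (and, as in the abstract, different training sets) disagree most exactly where the forecast is least reliable.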

09:33–09:39 | EGU22-7331 | ECS | Presentation form not yet defined
Maria Kaselimi, Vassilis Gikas, Nikolaos Doulamis, Anastasios Doulamis, and Demitris Delikaraoglou

Precise modeling of the ionospheric Total Electron Content (TEC) is critical for reliable and accurate GNSS applications. TEC is the integral of the location-dependent electron density along the signal path and is a crucial parameter that is often used to describe ionospheric variability, as it is strongly affected by solar activity. TEC is highly dependent on local time (temporal variability), latitude and longitude (spatial variability), and solar and geomagnetic conditions. The propagation of signals from GNSS (Global Navigation Satellite System) satellites through the ionosphere is strongly influenced by temporal changes and by regular or irregular ionospheric variations. Here, we propose a deep learning architecture for the prediction of the vertical total electron content (VTEC) of the ionosphere based on GNSS data.

Until recently, the data used in many deep learning tasks were mostly represented in Euclidean space. However, geodesy studies data that have an underlying structure that is non-Euclidean. Geospatial data are large and complex, as in the case of GNSS network data, and their non-Euclidean nature has imposed significant challenges on existing machine learning algorithms. The task of VTEC prediction is challenging mainly due to the complex spatiotemporal dependencies and an inherent difficulty in temporal forecasting. Spatial-temporal graph neural networks (STGNNs) aim to learn hidden patterns from spatial-temporal graphs. The key idea of STGNNs is to consider spatial and temporal dependency at the same time. Spatial dependency: assuming a network of permanent stations of the International GNSS Service (IGS), each station represents a node of the graph, and the Euclidean distances between stations are used to formulate the set of edges of the graph. Thus, we achieve information exchange between nodes and their neighbors. Temporal dependency: the graph operates in a dynamic environment. Thus, we leverage recurrent neural networks (RNNs) to model the temporal dependency. As a result, time series of VTEC data can be predicted to future epochs. Solar and geomagnetic indices are formulated as node attributes and also exhibit temporal variability.

Topics to be discussed in the study include the design of the graph neural network structure, training methods exploiting steepest descent algorithms, data analysis, as well as preliminary testing results of the VTEC predictions as compared with state-of-the-art graph architectures.

How to cite: Kaselimi, M., Gikas, V., Doulamis, N., Doulamis, A., and Delikaraoglou, D.: Spatio-temporal Graph Neural Networks for Ionospheric TEC Prediction Using Global Navigation Satellite System Observables, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-7331, https://doi.org/10.5194/egusphere-egu22-7331, 2022.
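The graph construction step — stations as nodes, distance-based edge weights — can be sketched in a few lines. The coordinates below are hypothetical placeholders for IGS stations, and the Gaussian kernel and symmetric normalization are common conventions for spatio-temporal GNNs, not necessarily the exact choices of the talk.

```python
import numpy as np

# Hypothetical station coordinates (lat, lon in degrees); in the study these
# would be IGS stations carrying VTEC series and solar/geomagnetic indices
# as node attributes.
stations = np.array([[48.2, 16.4], [52.5, 13.4], [41.9, 12.5], [59.9, 10.7]])

# Weighted adjacency via a Gaussian kernel on pairwise distance -- a common
# way to define graph edges from station geometry.
d = np.linalg.norm(stations[:, None, :] - stations[None, :, :], axis=-1)
sigma = d[d > 0].mean()
A = np.exp(-((d / sigma) ** 2))
np.fill_diagonal(A, 0.0)

# Symmetrically normalized adjacency with self-loops, as used in graph
# convolutions: A_norm = D^{-1/2} (A + I) D^{-1/2}
A_hat = A + np.eye(len(A))
deg = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(deg, deg))
```

A graph convolution then mixes each node's VTEC features with its neighbors' via `A_norm @ node_features`, while an RNN handles the temporal axis.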

09:39–09:45 | EGU22-2702 | ECS | Presentation form not yet defined
Qinzheng Li, Johannes Böhm, Linguo Yuan, and Robert Weber

Tropospheric delays have been a major error source for space geodetic techniques, and the performance of their modeling is significantly limited by the high spatiotemporal variability of the moisture in the lower atmosphere. In this study, tropospheric zenith wet delay (ZWD) modeling was realized based on machine learning (the random forest approach, RF) using 10 years (2010-2019) of radiosonde measurements at 586 globally distributed stations. Subsequently, the ZWD modeling accuracy was validated based on the sounding profiles across the globe for the year 2020. We find that the ZWD modeling accuracy is significantly improved by taking into account meteorological parameters in the functional formulation, especially surface water vapor pressure. When surface meteorological data are available, the RF-based ZWD models with meteorological parameterization can achieve an overall accuracy of 2.9 cm and a bias close to zero across the globe, which clearly outperforms current empirical models, such as GPT3, or other models based on surface meteorological measurements. From the analysis of the spatial characteristics of the ZWD accuracy, it can be concluded that the RF-based ZWD models especially mitigate the systematic biases in regions with monsoon and tropical rainforest climate types.

How to cite: Li, Q., Böhm, J., Yuan, L., and Weber, R.: Development of a global model for zenith wet delays based on the random forest approach, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2702, https://doi.org/10.5194/egusphere-egu22-2702, 2022.

09:45–09:51 | EGU22-4531 | ECS | On-site presentation
Laura Crocetti, Benedikt Soja, Grzegorz Kłopotek, Mudathir Awadaljeed, Markus Rothacher, Linda See, Rudi Weinacker, Tobias Sturn, Ian McCallum, and Vicente Navarro

Radio signals transmitted by Global Navigation Satellite System (GNSS) satellites propagate through the atmosphere before being received on Earth. Thereby, the signal is delayed, and tropospheric parameters can be estimated. The good global coverage of GNSS receivers, combined with the high temporal resolution and high accuracy, makes GNSS a suitable tool for studies of the atmosphere.

Atmospheric delays are separated into a zenith hydrostatic delay (ZHD) and a non-hydrostatic, or zenith wet delay (ZWD). The hydrostatic part has the larger contribution (causing a delay of roughly 2.4 meters in the zenith direction) but can be modeled with sufficient accuracy using analytical methods. The ZWD has a smaller contribution (causing a delay between 0 and 40 centimeters) and depends mainly on the water vapour content of the atmosphere. However, due to the variable nature of water vapour, the ZWD is difficult to model and is therefore typically estimated. Its quantification is essential since it drives weather systems and climate change to a great extent. For many applications, such as weather forecasting or positioning using low-cost GNSS receivers such as smartphones, global real-time monitoring or even predictions of ZWD would be required and beneficial.

In the last decade, machine learning (ML) algorithms have gained a lot of interest and are successfully utilized in many different fields. Thereby, ML algorithms have proven to be able to efficiently process and combine large amounts of data and solve problems of various kinds.

This motivated us to investigate the feasibility of ML algorithms for the prediction of tropospheric parameters, in particular ZWD, with the help of meteorological data such as the water vapour content. The work aims to develop a global model capable of predicting ZWD in space and time. To this end, different ML algorithms are used to train a model based on meteorological features. The performance of the utilized algorithms is evaluated based on commonly used performance metrics, such as the Root Mean Squared Error (RMSE) and R².

Preliminary investigations are carried out utilizing 3000 GNSS stations distributed over Europe. The performance of various ML methods, such as linear regression methods, Random Forest, (Extreme) Gradient Boosting, and Multilayer Perceptron, is compared. Furthermore, different feature combinations as well as training sample sizes are investigated. It is revealed that linear methods are not able to properly reflect the observations. Instead, our Random Forest approach provides, so far, the highest model accuracy, with an RMSE of 1.7 centimeters and an R² value of 0.88.

How to cite: Crocetti, L., Soja, B., Kłopotek, G., Awadaljeed, M., Rothacher, M., See, L., Weinacker, R., Sturn, T., McCallum, I., and Navarro, V.: Machine learning and meteorological data for spatio-temporal prediction of tropospheric parameters, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4531, https://doi.org/10.5194/egusphere-egu22-4531, 2022.
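The linear-versus-nonlinear comparison in the abstract can be sketched with scikit-learn: fit both model families on the same features and compare RMSE and R² on held-out data. The feature-target relation below is invented (real features would be meteorological quantities such as water vapour content), so the numbers illustrate the effect, not the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(4)

# Toy nonlinear relation between meteorological proxies and a ZWD-like target.
n = 3000
X = rng.uniform(0, 1, (n, 3))
y = 0.2 * X[:, 0] * X[:, 2] + 0.1 * np.sin(6 * X[:, 1]) + 0.01 * rng.standard_normal(n)
Xtr, ytr = X[:2000], y[:2000]
Xte, yte = X[2000:], y[2000:]

results = {}
for name, model in [("linear", LinearRegression()),
                    ("random forest", RandomForestRegressor(n_estimators=100,
                                                            random_state=0))]:
    pred = model.fit(Xtr, ytr).predict(Xte)
    results[name] = {"rmse": np.sqrt(mean_squared_error(yte, pred)),
                     "r2": r2_score(yte, pred)}
# The interaction and sine terms are captured by the forest but not the
# linear model, mirroring the abstract's finding.
```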

09:51–10:00