EGU25-8526, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-8526
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 28 Apr, 14:25–14:35 (CEST)
Room 0.49/50
Homogenization of GNSS integrated water vapour time series using statistical machine learning
Emilie Lebarbier1, Ninh Nguyen2, and Olivier Bock2
  • 1Université Paris Nanterre, MODAL'X, Nanterre, France (lebarbie@parisnanterre.fr)
  • 2Université Paris Cité, Institut de physique du globe de Paris, CNRS, IGN, F-75005 Paris, France / Univ Gustave Eiffel, ENSG, IGN, F-77455 Marne-la-Vallée, France

We present a novel approach to homogenize daily GNSS water vapour time series using statistical methods and machine learning techniques. The procedure involves three main steps:

  • Segmentation. The aim is to detect the number and positions of change-points in a time series of Integrated Water Vapour (IWV) differences (GNSS minus reference). The series is modelled as a piecewise-constant mean (one value per segment) plus a fourth-order Fourier series and white noise with a monthly varying variance. The model parameters are estimated by a penalized maximum likelihood algorithm that implements a Dynamic Programming search in an iterative scheme [1].
  • Attribution. The aim is to predict, for each change-point, in which of the GNSS (G) and reference (E) series the jump in mean occurred. Information from nearby stations is introduced as additional G' and E' series, which are combined with G and E into six series of differences. A Feasible Generalized Least Squares regression is used to estimate the size of the jumps in the six series, and a Random Forest classifier is used to predict which of the four base series caused the jump. The classifier is trained and validated beforehand on a large data set using a resampling strategy [2].
  • Correction. The raw G and E series are corrected for the corresponding shifts in mean that were detected and attributed to G and/or E.
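To illustrate why the six series of differences used in the attribution step carry information about the origin of a jump, the following minimal Python sketch builds the six pairwise differences of G, E, G' and E' and lists which of them would shift if a single base series jumped. The names `Gp` and `Ep` (for G' and E'), the ordering of the pairs, and the helper functions are assumptions made for this example only. The naive mean difference used here is a simple stand-in for the Feasible Generalized Least Squares estimate, and no classifier is involved.

```python
import numpy as np

# The six pairwise differences of the four base series.
# Names and ordering are illustrative assumptions, not taken from the abstract.
PAIRS = [("G", "E"), ("G", "Gp"), ("G", "Ep"),
         ("E", "Ep"), ("Gp", "Ep"), ("Gp", "E")]


def six_differences(series):
    """Return the six difference series, keyed by a label such as 'G-E'."""
    return {f"{a}-{b}": series[a] - series[b] for a, b in PAIRS}


def jump_signature(source):
    """Sign of the shift seen in each difference if only `source` jumps by +1."""
    return [int(a == source) - int(b == source) for a, b in PAIRS]


def mean_shift(diff, cp):
    """Naive jump estimate: mean after the change-point minus mean before.

    This is a stand-in for a regression-based estimate, not the FGLS method.
    """
    return float(diff[cp:].mean() - diff[:cp].mean())


# Synthetic, noise-free example: a +0.5 jump in the reference series E.
n, cp = 100, 60
base = {name: np.zeros(n) for name in ("G", "E", "Gp", "Ep")}
base["E"][cp:] += 0.5

shifts = {name: mean_shift(d, cp) for name, d in six_differences(base).items()}
print(shifts)
print(jump_signature("E"))
```

In this toy case, nonzero shifts appear only in the three differences that involve E, with the signs given by `jump_signature("E")`. With real, noisy data the pattern is less clear-cut, which is presumably where a trained classifier becomes useful.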

The paper will present recent improvements of the attribution method, namely: i) the optimization of detection skill scores, both for training the classifier and for applying it; ii) the optimization of the sample size for the resampling; iii) a refined nearby-aggregation method based on inverse distance weighting. The method is applied to a new, enhanced data set based on more than 6000 globally distributed GNSS stations. The impact of homogenization on IWV trends over the period 1994–2022 is presented.
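As a generic illustration of inverse distance weighting, the sketch below combines one value per nearby station into a single weighted estimate, giving closer stations more weight. The function names, the `power` exponent, and the choice of quantity being aggregated (here a hypothetical per-station jump estimate) are assumptions; the abstract does not specify how the refined aggregation is defined.

```python
import numpy as np


def idw_weights(distances_km, power=1.0):
    """Normalized inverse-distance weights: w_i proportional to 1 / d_i**power."""
    d = np.asarray(distances_km, dtype=float)
    if np.any(d <= 0):
        raise ValueError("distances must be strictly positive")
    w = 1.0 / d**power
    return w / w.sum()


def idw_aggregate(values, distances_km, power=1.0):
    """Weighted average of per-station values using inverse-distance weights."""
    w = idw_weights(distances_km, power)
    return float(np.dot(w, np.asarray(values, dtype=float)))


# Hypothetical example: three nearby stations at 10, 20 and 40 km.
weights = idw_weights([10.0, 20.0, 40.0])
estimate = idw_aggregate([0.6, 0.4, 0.1], [10.0, 20.0, 40.0])
print(weights, estimate)
```

With `power=1.0`, a station at 10 km receives twice the weight of one at 20 km. A larger exponent would concentrate the weight further on the closest stations.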

[1] Quarello et al., 2022, https://doi.org/10.3390/rs14143379

[2] Nguyen et al., 2024, https://doi.org/10.1002/joc.8441

How to cite: Lebarbier, E., Nguyen, N., and Bock, O.: Homogenization of GNSS integrated water vapour time series using statistical machine learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8526, https://doi.org/10.5194/egusphere-egu25-8526, 2025.