EGU25-8526, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-8526
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Homogenization of GNSS integrated water vapour time series using statistical machine learning
Emilie Lebarbier1, Ninh Nguyen2, and Olivier Bock2
  • 1Université Paris Nanterre, MODAL'X, Nanterre, France (lebarbie@parisnanterre.fr)
  • 2Université Paris Cité, Institut de physique du globe de Paris, CNRS, IGN, F-75005 Paris, France / Univ Gustave Eiffel, ENSG, IGN, F-77455 Marne-la-Vallée, France

We present a novel approach to homogenize daily GNSS water vapour time series using statistical methods and machine learning techniques. The procedure involves three main steps:

  • Segmentation. The aim is to detect the number and positions of change-points in a time series of Integrated Water Vapour (IWV) differences (GNSS minus reference). The series is modelled as a constant mean per segment, plus a fourth-order Fourier series and white noise with monthly varying variance. The model parameters are estimated by a penalized maximum likelihood algorithm that uses a Dynamic Programming search within an iterative scheme [1].
  • Attribution. The aim is to predict, for each change-point, whether the jump in mean occurred in the GNSS (G) or the reference (E) series. Information from nearby stations is introduced as additional G' and E' series, which are combined with G and E into six series of differences. A Feasible Generalized Least Squares regression is used to estimate the size of the jumps in the six series, and a Random Forest classifier is used to predict which of the four base series caused the jump. The classifier is trained and validated beforehand on a large data set using a resampling strategy [2].
  • Correction. The raw G and E series are corrected for the corresponding shifts in mean that were detected and attributed to G and/or E.
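
The signal model of the segmentation step, and the removal of a shift in mean, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the change-point is assumed to be known rather than found by the penalized maximum likelihood and Dynamic Programming search, ordinary least squares stands in for the likelihood-based estimation, and the jump is removed from the difference series itself rather than from G and/or E after attribution. All function names, the simulated values, and the crude 30.4375-day month index are hypothetical choices made for illustration.

```python
import numpy as np

def fourier_design(t_days, order=4, period=365.25):
    """Cos/sin columns for harmonics 1..order of an annual cycle."""
    cols = []
    for k in range(1, order + 1):
        w = 2.0 * np.pi * k * t_days / period
        cols.append(np.cos(w))
        cols.append(np.sin(w))
    return np.column_stack(cols)

def segment_design(n, change_points):
    """One indicator column per segment; each change-point is the
    first index of a new segment."""
    bounds = [0, *change_points, n]
    S = np.zeros((n, len(bounds) - 1))
    for j, (a, b) in enumerate(zip(bounds[:-1], bounds[1:])):
        S[a:b, j] = 1.0
    return S

def simulate(n=2000, change_points=(800,), means=(0.0, 1.5), seed=0):
    """Simulate segment means + fourth-order Fourier series + white noise
    whose standard deviation depends on the (approximate) month."""
    rng = np.random.default_rng(seed)
    t = np.arange(n, dtype=float)
    month = (t // 30.4375).astype(int) % 12          # illustrative month index
    sigma = 0.5 + 0.3 * np.cos(2.0 * np.pi * month / 12.0)
    coef = rng.normal(0.0, 0.3, size=8)              # arbitrary seasonal signal
    y = (segment_design(n, change_points) @ np.array(means)
         + fourier_design(t) @ coef
         + rng.normal(0.0, sigma))
    return t, y

def fit_known_change_points(t, y, change_points, order=4):
    """Least-squares estimate of segment means and Fourier coefficients,
    assuming the change-points are already known (a simplification)."""
    S = segment_design(len(y), change_points)
    F = fourier_design(t, order)
    beta, *_ = np.linalg.lstsq(np.hstack([S, F]), y, rcond=None)
    return beta[: S.shape[1]], beta[S.shape[1]:]

# Estimate the jump at the assumed change-point, then remove it.
t, y = simulate()
mu, _ = fit_known_change_points(t, y, [800])
jump = mu[1] - mu[0]
y_corrected = y.copy()
y_corrected[800:] -= jump
```

Fitting the segment means and the seasonal terms jointly matters here: estimating the jump from raw segment averages would let an incomplete annual cycle leak into the estimated shift.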

The paper will present recent improvements of the attribution method, namely: i) the optimization of detection skill scores, both for training the classifier and for its application; ii) the optimization of the sample size for the resampling; iii) a refined nearby-aggregation method based on inverse distance weighting. The method is applied to a new, enhanced data set based on more than 6000 globally distributed GNSS stations. The impact of homogenization on IWV trends over the period 1994–2022 is presented.
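
As a generic illustration of inverse distance weighting, the following Python sketch combines per-station values with weights proportional to an inverse power of the distance to the main station. The abstract does not specify which quantities are aggregated or which exponent is used, so the function names, the `power` parameter, and the example numbers are assumptions and not the authors' method.

```python
import numpy as np

def idw_weights(distances_km, power=1.0):
    """Normalized weights proportional to distance ** (-power)."""
    d = np.asarray(distances_km, dtype=float)
    if np.any(d <= 0.0):
        raise ValueError("distances must be strictly positive")
    w = d ** (-power)
    return w / w.sum()

def idw_aggregate(values, distances_km, power=1.0):
    """Weighted average of per-station values (first axis = station)."""
    w = idw_weights(distances_km, power)
    return w @ np.asarray(values, dtype=float)

# Hypothetical example: three nearby stations at 10, 20 and 40 km.
weights = idw_weights([10.0, 20.0, 40.0])
combined = idw_aggregate([1.0, 2.0, 4.0], [10.0, 20.0, 40.0])
```

A larger `power` concentrates the weight on the closest stations, while `power=0` reduces to a plain average.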

[1] Quarello et al., 2022, https://doi.org/10.3390/rs14143379

[2] Nguyen et al., 2024, https://doi.org/10.1002/joc.8441

How to cite: Lebarbier, E., Nguyen, N., and Bock, O.: Homogenization of GNSS integrated water vapour time series using statistical machine learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8526, https://doi.org/10.5194/egusphere-egu25-8526, 2025.