- 1Mines Paris - PSL, Centre de Géosciences, France (guillaume.metayer10@gmail.com)
- 2Mines Paris - PSL, Centre de Mathématiques Appliquées, Sophia Antipolis, France
- 3Mines Paris - PSL, Institut des Transformations Numériques, Paris, France
- 4Sorbonne Université, Milieux Environnementaux, Transferts et Interactions dans les hydrosystèmes et les Sols, METIS, Paris, France
Long-term hydrological time series are essential for planning effective water-resource management strategies that balance competing water and energy uses and preserve ecosystem functioning. In particular, long-term large-scale surface water temperature (SWT) time series are crucial for enhancing understanding of climate change impacts and for quantifying uncertainties in the occurrence of critical periods affecting water and energy uses, as well as ecosystem balance. However, these datasets inevitably contain missing observations, and long-term data series with large spatial coverage remain scarce. Modeling approaches provide valuable tools for estimating surface water temperature dynamics when observations are missing. Owing to their low data requirements and fast computation times, statistically based approaches are well suited to large spatial scales, where physically based approaches often become impractical to apply. Among statistically based methods, recurrent neural networks, such as Long Short-Term Memory (LSTM) models, have recently shown considerable potential for time series imputation (Cao et al., 2018 https://doi.org/10.48550/arXiv.1805.10572; Che et al., 2018 https://doi.org/10.1038/s41598-018-24271-9) and for simulating hydrological variables, including SWT (e.g. Saadi et al., 2025 https://doi.org/10.5194/egusphere-2025-3393). The aim of the present work was to develop and assess an approach for reconstructing long-term SWT time series at the scale of a large river basin using an LSTM model. The study was conducted at the scale of the Seine River Basin, including nearly 80 monitoring stations providing daily SWT observations, and relied on continuous meteorological data from 1958 to 2025 derived from the SAFRAN system (Vidal et al., 2010 https://doi.org/10.1002/joc.2003). The developed model was designed to simulate a one-year daily SWT sequence, considering both dynamic and static inputs.
Dynamic inputs include one-year sequences of meteorological data and the daily SWT time series to be reconstructed, as well as masks used to identify missing values in the SWT input (Quian et al., 2024 arXiv:2405.17508v1). Static inputs include features characterizing the monitoring stations, such as hydrological (mean and low-flow discharges), geographical and meteorological features. The model architecture is composed of two sequential modules: (i) a bidirectional LSTM that encodes basin-scale temporal dynamics from dynamic inputs, and (ii) a multilayer perceptron that combines the LSTM’s final hidden states with a learned embedding representing the target monitoring station to generate the full annual SWT sequence. This approach enables the reconstruction of daily SWT across the basin over multiple decades and handles a wide range of missing-data situations, from sporadic gaps to entirely missing time series, by leveraging covariates and influential drivers, primarily meteorological factors.
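As an illustration of the dynamic inputs described above, the following minimal sketch assembles a one-year input array from meteorological drivers, the gappy SWT series, and a missing-value mask. It is not the authors' implementation: the function name `build_dynamic_inputs`, the mask convention (1 = observed, 0 = missing), the zero placeholder for missing SWT values, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def build_dynamic_inputs(meteo, swt):
    """Stack meteorological drivers, gap-filled SWT, and an observation mask.

    meteo : array of shape (n_days, n_meteo), assumed to have no gaps
    swt   : array of shape (n_days,), NaN where the observation is missing
    Returns an array of shape (n_days, n_meteo + 2).
    """
    meteo = np.asarray(meteo, dtype=float)
    swt = np.asarray(swt, dtype=float)

    # Assumed convention: 1.0 where SWT is observed, 0.0 where it is missing.
    mask = (~np.isnan(swt)).astype(float)

    # Missing SWT values get a placeholder (0.0 here, an assumption);
    # the mask channel tells the model which values are placeholders.
    swt_filled = np.where(np.isnan(swt), 0.0, swt)

    return np.column_stack([meteo, swt_filled, mask])


# Synthetic one-year example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n_days = 365
meteo = rng.normal(size=(n_days, 3))            # e.g. three daily drivers
day = np.arange(n_days)
swt = 12.0 + 8.0 * np.sin(2 * np.pi * (day - 110) / 365)
swt[100:130] = np.nan                           # a 30-day gap

x = build_dynamic_inputs(meteo, swt)
print(x.shape)
```

An entirely missing series is the limiting case of the same layout: the SWT channel holds only placeholders and the mask channel is all zeros, so the model has to rely on the meteorological channels and the static station features.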
How to cite: Metayer, G., Rivière, A., Corral, D., Roy, V., and Thomas, W.: Reconstructing multi-decadal daily river water temperature in the Seine River Basin (France) with a bidirectional LSTM and basin-location embeddings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20702, https://doi.org/10.5194/egusphere-egu26-20702, 2026.