Use of Long-Short Term Memory network (LSTM) in the reconstruction of missing water level data in the Seine River.
- Univ Rouen Normandie, UNICAEN, CNRS, M2C UMR 6143, F-76000 Rouen, France
Missing data is one of the first major problems encountered in many databases, and it arises for a variety of reasons. The gaps usually have to be filled, a task that becomes harder as the missing periods grow longer. Several machine-learning-based approaches have been introduced to deal with this problem.
The purpose of this paper is to discuss the effectiveness of a new methodology applied prior to the LSTM deep learning algorithm to fill in missing data in the hourly surface water level time series of several stations along the Seine River in Normandy, France. In our study, due to a lack of data, we faced a challenging situation in which only the water level data from the station itself, which contain many gaps, were used as input and output variables to fill that station's own record in a self-learning approach. This contrasts with common work on imputing missing data, where several features are available and can be exploited in a multivariate, spatiotemporal way, e.g., by using the same variable from other stations or by exploiting other physical variables and meteorological data. The reconstruction accuracy of the proposed method depends on both the size of the available and missing data and the parameters of the networks. Therefore, we performed sensitivity analyses on both the properties of the networks and the structuring of the input and output data to determine the most appropriate strategy. During this analysis, a data preprocessing method was developed and added prior to the LSTM model. This method emerged from a sequence of scenarios, each an updated version of the previous one. Along the way, limitations were identified and overcome. Finally, the last model version was able to impute gaps of up to one year of hourly data with high accuracy (one-year RMSE = 0.14 m), regardless of the location or the size of the missing part in the series.
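The abstract does not detail the preprocessing method, so the following is only a generic sketch of the single-station setting it describes, not the authors' procedure. It assumes a simple sliding-window structuring of the input and output data, in which any window touching a gap is discarded, and it shows how an RMSE score such as the one reported can be computed. The function names `make_windows` and `rmse`, the window lengths, and the synthetic series are all hypothetical.

```python
import numpy as np


def make_windows(series, n_in, n_out):
    """Build (input, target) pairs from one 1-D series.

    Each sample uses n_in consecutive values as input and the next n_out
    values as target. Windows containing any NaN (a gap) are skipped.
    """
    series = np.asarray(series, dtype=float)
    span = n_in + n_out
    inputs, targets = [], []
    for start in range(len(series) - span + 1):
        window = series[start:start + span]
        if np.isnan(window).any():
            continue  # this window overlaps a missing period
        inputs.append(window[:n_in])
        targets.append(window[n_in:])
    X = np.array(inputs).reshape(-1, n_in)
    y = np.array(targets).reshape(-1, n_out)
    return X, y


def rmse(observed, predicted):
    """Root-mean-square error between two equally sized arrays."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


# Synthetic hourly series (made-up values, not Seine River data)
# with an artificial 20-hour gap.
hours = np.arange(200)
level = 2.0 + 0.5 * np.sin(2 * np.pi * hours / 12.4)
level[80:100] = np.nan

X, y = make_windows(level, n_in=24, n_out=1)
```

The arrays `X` and `y` hold only gap-free samples and could then be reshaped to `(samples, timesteps, 1)` for a recurrent model. Varying `n_in` and `n_out` is one simple way to explore the sensitivity to input and output structuring that the abstract mentions.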
How to cite: Janbain, I., Deloffre, J., Jardani, A., Vu, M. T., and Massei, N.: Use of Long-Short Term Memory network (LSTM) in the reconstruction of missing water level data in the Seine River, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-4970, https://doi.org/10.5194/egusphere-egu23-4970, 2023.