- 1Izmir Institute of Technology, International Water Resources, Türkiye (elnazbayatkhajeh@gmail.com) (elnazbayat@iyte.edu.tr)
- 2Department of Water Engineering, University of Tabriz, Tabriz, Iran (s.samadian@tabrizu.ac.ir)
- 3Water Sciences and Hydroinformatics Research Center, Khazar University, Mahsati str. 41, AZ 1096, Baku, Azerbaijan (ssamadianfard@khazar.org)
- 4Department of Environmental Engineering, Izmir Institute of Technology, Izmir, Türkiye (saeedsamadianfard@iyte.edu.tr) (orhangunduz@iyte.edu.tr)
Accurate estimation of groundwater levels (GWLs) is important for the sustainable management of water resources, especially in arid and semi-arid regions such as the Tabriz plain aquifer in northwestern Iran, where demand for freshwater is increasing and climate variability places additional stress on the aquifer. This study introduces a novel framework that combines time-series modeling, deep learning, and clustering to improve GWL estimation in heterogeneous aquifers.
To account for spatial heterogeneity, all monitoring wells within the aquifer were grouped into five clusters using the k-means method. Clustering was based on two criteria: (i) characteristics of groundwater behavior and (ii) characteristics of the hydro-environmental variables associated with each cluster. A separate GWL model was then developed for each cluster, which reduces the spatial variability each model must represent and increases predictive capability. The base model is a Long Short-Term Memory (LSTM) network, which is combined with a Double Moving Average (DMA) technique to improve performance. The resulting DMA-LSTM hybrid model couples temporal smoothing with deep learning.
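As an illustration of the temporal smoothing step, the following is a minimal sketch of a double moving average, read here as a trailing moving average applied twice. The abstract does not state the window length or how the DMA output is coupled to the LSTM, so the function names, the window of 3, and the sample values are all assumptions made for illustration only.

```python
def moving_average(series, window):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    out = [running / window]
    for i in range(window, len(series)):
        # Slide the window: add the newest value, drop the oldest one.
        running += series[i] - series[i - window]
        out.append(running / window)
    return out


def double_moving_average(series, window):
    """Apply the same trailing moving average twice (assumed DMA form)."""
    return moving_average(moving_average(series, window), window)


# Hypothetical monthly GWL values, not data from the study.
gwl = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
smoothed = double_moving_average(gwl, 3)
```

Each pass shortens the series by `window - 1` values, so the smoothed series has to be realigned with the other inputs before it is passed to a model.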
Model inputs include precipitation, temperature, the normalized difference vegetation index (NDVI), groundwater extraction, and lagged GWL data for the period 2011-2024. Combining climatic, vegetation, anthropogenic, and lagged groundwater data enables the model to represent natural aquifer recharge processes and the impacts of human activities simultaneously. Monthly GWLs were used as the target output for all five clusters.
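The input structure described above can be sketched as a supervised-learning table in which each monthly sample pairs the driver variables with the preceding GWL values. The number of lags, the choice to use drivers from the same month as the target, and all names and values below are assumptions; the abstract does not specify them.

```python
def make_lagged_samples(target, drivers, n_lags):
    """Build (X, y) pairs from aligned monthly series.

    target  : list of monthly GWL values
    drivers : dict mapping a variable name to a list aligned with target
    n_lags  : number of previous GWL values used as features (assumed)
    """
    rows, labels = [], []
    for t in range(n_lags, len(target)):
        # Driver values for month t, in dict insertion order.
        features = [series[t] for series in drivers.values()]
        # Previous n_lags GWL values, oldest first.
        features += target[t - n_lags:t]
        rows.append(features)
        labels.append(target[t])
    return rows, labels


# Hypothetical values for illustration only.
gwl = [5.0, 5.2, 5.1, 5.4, 5.3]
drivers = {
    "precipitation": [30.0, 12.0, 0.0, 4.0, 20.0],
    "temperature": [8.0, 14.0, 21.0, 25.0, 18.0],
}
X, y = make_lagged_samples(gwl, drivers, n_lags=2)
```

The first `n_lags` months are dropped because they have no complete lag history.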
Model evaluation across all clusters indicates that both the LSTM and DMA-LSTM models predict GWLs with high accuracy, with coefficient of determination (R²) values greater than 0.97 for Clusters 1 and 2. Combining DMA with LSTM lowered the Root Mean Square Error (RMSE) in all clusters, for example from 0.0396 to 0.0303 in Cluster 1. The clusters with higher temporal variability (Clusters 3 and 5) showed RMSE reductions greater than 30% (from 0.1249 to 0.0853 in Cluster 3 and from 0.0988 to 0.0659 in Cluster 5), indicating the advantage of combining DMA with deep learning for GWL prediction in more variable clusters.
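The reported reductions can be checked with a short calculation. The sketch below uses the standard definitions of RMSE and of R² as one minus the ratio of residual to total sum of squares; the abstract does not say which R² definition was used, so that choice is an assumption, as are the function names.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_reduction(before, after):
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


# RMSE values (LSTM, DMA-LSTM) as reported in the abstract.
reported = {
    "Cluster 1": (0.0396, 0.0303),
    "Cluster 3": (0.1249, 0.0853),
    "Cluster 5": (0.0988, 0.0659),
}
reductions = {name: relative_reduction(a, b) for name, (a, b) in reported.items()}
```

With these values, the reduction is roughly 23.5% for Cluster 1, 31.7% for Cluster 3, and 33.3% for Cluster 5, consistent with the statement that the two more variable clusters exceed 30%.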
Accordingly, the results show that combining DMA with LSTM improves the predictive performance of the LSTM while preserving the capabilities of deep learning models. The developed model is a robust and efficient tool for groundwater monitoring and management, and it can be applied to regions experiencing similar climatic, hydrological, and anthropogenic pressures.
Keywords: Groundwater level prediction, Hybrid deep learning, LSTM, DMA, K-means clustering
How to cite: Bayat Khajeh, E., Samadianfard, S., and Gündüz, O.: Improving Groundwater Level Prediction Using a Cluster-Based Hybrid LSTM Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9218, https://doi.org/10.5194/egusphere-egu26-9218, 2026.