HS3.4 | Advanced geostatistics, clustering and classification for water, earth and environmental sciences
Co-sponsored by IAHS-ICSH
Convener: Svenja Fischer (ECS) | Co-conveners: Nilay Dogulu, Vanessa A. Godoy (ECS), Jaime Gómez-Hernández, Gerard Heuvelink, Alessandra Menafoglio, Georgia Papacharalampous (ECS)
Orals | Wed, 26 Apr, 08:30–10:15 (CEST) | Room 2.15
Posters on site | Attendance Wed, 26 Apr, 10:45–12:30 (CEST) | Hall A
Posters virtual | Attendance Wed, 26 Apr, 10:45–12:30 (CEST) | vHall HS
The ever-increasing amount of data available in the water, earth and environmental sciences requires new approaches and more advanced methods that can quantify and measure the relationships in these data sets but also their uncertainty. Remote sensing, improved and cheaper measurement technology and global databases have steadily improved our information on processes, but require an understanding of the interplay of these data and their dependence.
Clustering and classification algorithms are increasingly and extensively applied in hydrology as the need for pattern recognition and data mining tasks persists with higher availability of large multivariate datasets. While both approaches share the goal of dividing data into convenient groups, classification approaches pre-define such groups (i.e. supervised learning) whereas clustering approaches group data with similar properties without preconceived notions about which groups are expected to be in the data (i.e. unsupervised learning).
Geostatistical methods are commonly applied in the water, earth and environmental sciences to quantify spatial variation, produce interpolated maps with quantified uncertainty and optimize spatial sampling designs. Space-time geostatistics explores the dynamic aspects of environmental processes and characterises the dynamic variation in terms of correlations. Geostatistics can also be combined with machine learning and mechanistic models to improve the modelling of real-world processes and patterns. Such methods are used to model soil properties, produce climate model outputs, simulate hydrological processes, and to better understand and predict uncertainties overall.
Topics covered in this session are:
1) How can clustering/classification approaches increase our understanding and improve our prediction of hydrological processes?
2) To what extent should clustering/classification algorithm settings be fine-tuned for hydrological applications?
3) How can geostatistical approaches be used for the characterization of uncertainties and error propagation?
4) How can spatial and temporal aspects be combined in geostatistics and how do they improve our understanding of hydrological processes?
5) What is the benefit of integrating machine-learning approaches to geostatistics?

Orals: Wed, 26 Apr | Room 2.15

Chairpersons: Georgia Papacharalampous, Svenja Fischer, Gerard Heuvelink
Part I: Geostatistics
On-site presentation
Shiran Levy, Lea Friedli, Grégoire Mariéthoz, and Niklas Linde

We seek to develop a methodology enabling fast geostatistical simulations honoring both geophysical data and a complex prior model. In particular, we consider a multiple-point statistics (MPS) framework in which a training image (TI) describes the available prior knowledge. Accurate posterior sampling is then possible by using a so-called extended Metropolis algorithm in which proposals are drawn from the prior using sequential geostatistical resampling. Such a Markov chain Monte Carlo (MCMC) algorithm will eventually locate and sample proportionally to the posterior distribution; however, it is often exceedingly slow and typically demands millions of MCMC iterations before the posterior is sampled sufficiently. We are developing a methodology in which the MPS simulation is built up iteratively pixel-by-pixel starting from an empty grid. At each pixel, multiple proposals are generated using an MPS algorithm and the proposals are accepted proportionally to the likelihood considering conditioning data in terms of linear averages (for instance geophysical data). The likelihood function is generally intractable as it depends on the pixels that have not yet been sampled. We approximate the likelihood function using a Gaussian model in which the posterior mean and covariance are updated sequentially as the simulation builds up. The posterior statistics are approximated by running the algorithm multiple times (sequentially or in parallel). Considering crosshole first-arrival ground-penetrating radar data, we assess the accuracy of our methodology both for multi-Gaussian priors for which analytical posteriors are available and for more complex training images against the extended Metropolis method. Our approach is inherently approximate due to the use of a finite training image, a finite number of candidates for each pixel and the need to approximate intractable likelihood functions. Nevertheless, preliminary results are promising as this method allows directly obtaining a reasonable estimation at a reduced computational cost compared to MCMC.

How to cite: Levy, S., Friedli, L., Mariéthoz, G., and Linde, N.: Sequential multiple-point statistics simulations conditioned on arithmetic averages, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16088, https://doi.org/10.5194/egusphere-egu23-16088, 2023.

On-site presentation
Tomy-Minh Truong, Alberto Guadagnini, and Irina Engelhardt

Characterization of the spatial distribution of geomaterials and of the associated attributes is a key step in setting up a hydrogeological model. Geological information is often used as a basis for this purpose. One of the most common sources of geological information is borehole data. However, geological and hydraulic information is often available at different scales. In most cases hydraulic parameters are only measured at point locations, e.g. based on pumping tests, which cannot be directly transferred into 3D large-scale parameter fields. Moreover, in some regions even geological information is scarce. In such situations, information about aquifer facies and material groups needs to be interpolated and then serves as a basis to derive key hydraulic parameters, such as hydraulic conductivity, or transport parameters, such as porosity, diversity or reactive surfaces. Sedimentary descriptions are usually obtained when drilling a borehole. Classification of sediments rests on a well-defined procedure and provides a preliminary assessment of the particle size distributions of the samples analyzed. Based on sedimentary descriptions of the boreholes we construct synthetic particle size distribution curves. These particle size distribution curves can be used to calculate major local attributes of the system (e.g., hydraulic and some specific transport parameters). Based on these types of readily available information, this study aims at developing a procedure to assist construction of a high-resolution geological model suitable to be transferred into a flow and transport model that is then used for water resources management issues. We therefore aim to estimate storage and transmissivity with high reliability by accounting for the material composition in the interpolated space. We rely on a compositional data analysis framework and represent particle size fractions associated with a given location as a compositional vector. These vectors are then projected onto a computational grid through compositional kriging to characterize the spatial heterogeneity of the system. We compare these results against an approach that is based on clustering the ensuing information to obtain distinct geomaterial classes and then assess their spatial distribution through indicator kriging. After the 3D field of grain size distribution curves is generated, the curves are converted into hydraulic parameters. Although the process of clustering and using material classes is inevitably associated with a loss of information, the procedure of forming a representative particle size distribution around the compositional clusters attempts to keep this loss to a minimum. The benefit of interpolating the compositional data instead of directly interpolating inferred parameters is that the particle size distribution curves contain a large set of information, from hydraulic to transport and reactive parameters, which would be lost using hydraulic conductivity exclusively, while the use of material classes increases the efficiency of the calibration of the groundwater model.
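As an illustration of why particle size fractions are treated as compositional vectors, the sketch below applies a centred log-ratio (clr) transform, a standard device in compositional data analysis. It is not the authors' implementation and may differ from the compositional kriging they use; the sand/silt/clay fractions and all function names are hypothetical, and all fractions are assumed to be strictly positive. The point is only that averaging (the simplest stand-in for interpolation weights) in clr space and back-transforming always returns fractions that are positive and sum to one.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(composition):
    """Centred log-ratio transform of a strictly positive composition."""
    logs = [math.log(p) for p in closure(composition)]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def clr_inverse(coords):
    """Back-transform clr coordinates to a composition summing to 1."""
    return closure([math.exp(c) for c in coords])


# Hypothetical sand / silt / clay fractions at two boreholes.
borehole_a = [0.60, 0.30, 0.10]
borehole_b = [0.20, 0.50, 0.30]

# Equal-weight average in clr space, standing in for interpolation weights.
midpoint_clr = [(a + b) / 2 for a, b in zip(clr(borehole_a), clr(borehole_b))]
midpoint = clr_inverse(midpoint_clr)

restored_a = clr_inverse(clr(borehole_a))
```

Here `midpoint` is a valid composition by construction, which is not guaranteed when each fraction is interpolated separately with weights that may be negative.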

How to cite: Truong, T.-M., Guadagnini, A., and Engelhardt, I.: Characterization of spatial heterogeneity of geomaterials in large scale groundwater bodies through a compositional data approach, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7160, https://doi.org/10.5194/egusphere-egu23-7160, 2023.

Virtual presentation
Monica Palma, Sabrina Maggio, Claudia Cappello, Antonella Congedi, and Sandra De Iaco

Groundwater over-exploitation and environmental pollution, together with rising temperatures and other climate changes, can cause a large imbalance in the soil physicochemical properties, with a negative impact on economic, social and human health conditions. Therefore, monitoring and assessing the evolution in space and time of groundwater qualitative parameters as well as quantitative status are crucial aspects of sustainable water management.

Multivariate geostatistics provides dedicated tools for analyzing multivariate spatio-temporal data which are characterized by heterogeneous patterns in space-time, such as hydrogeological data. In the literature, few analyses (Jang et al., 2012; Yazdanpanah, 2016; Mastrocicco et al., 2021) have been developed on the main groundwater qualitative indicators through the use of spatio-temporal multivariate geostatistical methodologies.

This paper proposes a spatio-temporal multivariate analysis for some benchmark indicators describing the qualitative and quantitative status of an unconfined aquifer in Italy. By applying the fitting procedure proposed in De Iaco et al. (2019) and recalled in Cappello et al. (2022), a spatio-temporal multivariate correlation model is developed for forecasting purposes. Then, on the basis of a comparison between predicted values of the variables under study and values recorded for the same variables a decade before, hazard maps of groundwater degradation are produced through a non-parametric approach, identifying those vulnerable areas where the aquifer system could be contaminated. The empirical findings will help policy makers to pursue effective actions aimed at safeguarding groundwater resources.



- Cappello, C., De Iaco, S., Palma, M., 2022. Computational advances for spatio-temporal multivariate environmental models. Comput. Stat. 37, 651–670. https://doi.org/10.1007/s00180-021-01132-0

- De Iaco, S., Palma, M., Posa, D., 2019. Choosing suitable linear coregionalization models for spatio-temporal data. Stoch. Environ. Res. and Risk Assess. 33, 1419–1434.

- Jang, C.S., Chen, S.K., Kuo, Y.M., 2012. Establishing an irrigation management plan of sustainable groundwater based on spatial variability of water quality and quantity. Journal of Hydrology, 414–415, 201–210.

- Mastrocicco, M., Gervasio, M.P., Busico, G., Colombani, N., 2021. Natural and anthropogenic factors driving groundwater resources salinization for agriculture use in the Campania plains (Southern Italy). Science of the Total Environment, 758, 144033.

- Yazdanpanah, N. 2016. Spatiotemporal mapping of groundwater quality for irrigation using geostatistical analysis combined with a linear regression method. Model. Earth Syst. Environ., 2, 1-18.

How to cite: Palma, M., Maggio, S., Cappello, C., Congedi, A., and De Iaco, S.: Assessing the spatio-temporal changes of groundwater parameters: a multivariate geostatistical approach, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14743, https://doi.org/10.5194/egusphere-egu23-14743, 2023.

On-site presentation
Nunzio Romano, Annamaria Castrignanò, Carolina Allocca, and Paolo Nasta

Effective techniques for spatial data analysis are required to address the growing need to develop and improve management plans at regional or continental scales. The main purpose of this research is to evaluate the impact of change of spatial support on digital soil mapping. The study is based on the availability of more than 3,300 soil samples collected from the uppermost horizons of farmlands in Campania, an administrative region of southern Italy of about 13,700 km². Each soil sample was subjected to laboratory tests to determine the following point-referenced primary soil properties: sand and clay contents, oven-dry soil bulk density, soil organic matter, pH, calcium carbonate, and rock fragments. These seven soil properties were mapped over the entire region by using as covariates the following terrain attributes, obtained from the digital terrain model (DTM; 75 m/pixel): elevation, slope, plan curvature, profile curvature, and flow accumulation. Two composite indicators of soil quality were then determined: 1) the soil organic carbon stock (SOCS) in Campania, and 2) the recharge transit time in the alluvial plain of the Sele River where information about the mean annual depth to groundwater is available.

In this study, it was crucial to evaluate the epistemic uncertainty associated with the change of support when fusing the point-referenced soil measurements with the block-based terrain attributes. A key issue of our analysis is the modification of the anamorphosis model based on a block rather than a point. Accordingly, our results show how the estimates change when a properly-corrected block Gaussian anamorphosis model is employed instead of the point Gaussian one.

By way of conclusion, when the main interest of an investigation is to obtain a map of average soil attributes, the change of support might have little influence on the final estimates, especially when working with nearly symmetrical distributions of soil properties. On the other hand, if one should infer the uncertainty of a variable, as in soil vulnerability mapping, then the change of support matters and is an issue to be adequately accounted for in the spatial analysis of environmental data.
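The contrast drawn above between mean estimates and uncertainty-related quantities can be checked on synthetic numbers. The sketch below is a toy experiment, not the block Gaussian anamorphosis used in the study: it assumes spatially independent, lognormally distributed point values, blocks of 16 points, and an arbitrary threshold of 5.0, all of which are made up for illustration.

```python
import random
import statistics

random.seed(0)

# Hypothetical skewed point-support values (e.g. a soil property), 4000 points.
points = [random.lognormvariate(0.0, 1.0) for _ in range(4000)]

# Block support: average consecutive groups of 16 points (250 blocks).
block_size = 16
blocks = [
    statistics.fmean(points[i:i + block_size])
    for i in range(0, len(points), block_size)
]


def exceedance(values, threshold):
    """Fraction of values above a threshold."""
    return sum(v > threshold for v in values) / len(values)


point_mean = statistics.fmean(points)
block_mean = statistics.fmean(blocks)
point_var = statistics.pvariance(points)
block_var = statistics.pvariance(blocks)
point_exceed = exceedance(points, 5.0)
block_exceed = exceedance(blocks, 5.0)
```

The two means agree up to rounding, whereas the block-support variance and the fraction of values above the threshold are both smaller than their point-support counterparts. This is why a point-support distribution cannot be reused unchanged for block-support risk statements.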

How to cite: Romano, N., Castrignanò, A., Allocca, C., and Nasta, P.: How important is the change-of-support problem when digital soil mapping involves multi-source spatial data fusion? A real-world application of multivariate geostatistics to the regional scale of Campania (Italy)., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7928, https://doi.org/10.5194/egusphere-egu23-7928, 2023.

On-site presentation
Stylianos Hadjipetrou and Phaedon Kyriakidis

Wind assessment studies call for accurate and consistent datasets to evaluate the wind resource potential in the long term. Satellite-derived wind speed estimates have been widely employed in wind energy applications [1–3] due to their high spatial resolution. Synthetic Aperture Radar (SAR) sensors, in particular, provide image snapshots of wind fields on a (sub-)kilometer scale, although at irregular temporal intervals. Moreover, the scenes acquired are often tilted due to the satellite’s orbit. The resulting wind speed image time-series is, therefore, both spatially and temporally incomplete.

This study attempts to reconstruct Sentinel-1 A&B OCN Level-2 wind speed image time-series by employing a data-driven framework and using reanalysis as auxiliary data. More precisely, the methodology resembles what is generally called analog forecasting in climate studies, where past climate conditions are used to predict the current weather state [4]. Although the analog method has long been used for empirical-statistical downscaling of Global Circulation Models (GCMs) [5,6], few studies address the problem of gap-filling recorded observations/estimates [7,8]. In the same context, Empirical Orthogonal Functions (EOF) are used in this work to classify (decompose) the data sets into classes of similar weather states and use this classification to reconstruct the missing information based on the co-registered climate variables. Once physically consistent patterns (analogs) are identified in the historical image record, synthetic wind speed images are generated to fill the data gaps.

The method is benchmarked in the offshore area around Cyprus against the probabilistic framework of Multiple-Point Statistics (MPS). Image cross-validation, in combination with statistical metrics, is used to evaluate the method’s performance. Results show that the proposed methodology can furnish a reliable framework for wind speed spatiotemporal variability reconstruction in an offshore wind resource assessment context. An illustration of the method in terms of wind power density estimation is provided over an annual period.
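The analog idea can be sketched as a nearest-neighbour search in predictor space. The example below is a simplified illustration and not the authors' workflow: it omits the EOF decomposition, uses two hypothetical reanalysis predictors (10 m wind components u and v in m/s), and fills gaps in a 2×2 "image" with values from the closest historical day. All dates and numbers are invented.

```python
import math


def nearest_analog(target_predictors, library):
    """Return the library key whose predictors are closest (Euclidean) to the target."""
    return min(
        library,
        key=lambda day: math.dist(library[day]["predictors"], target_predictors),
    )


def fill_gaps(gappy_image, analog_image):
    """Keep observed pixels; take missing pixels (None) from the analog image."""
    return [
        [obs if obs is not None else fill for obs, fill in zip(obs_row, fill_row)]
        for obs_row, fill_row in zip(gappy_image, analog_image)
    ]


# Hypothetical historical record: reanalysis (u, v) and a complete wind speed image.
library = {
    "2021-01-05": {"predictors": [3.0, -1.0], "image": [[3.1, 3.3], [2.9, 3.0]]},
    "2021-02-11": {"predictors": [-6.0, 2.5], "image": [[6.4, 6.7], [6.2, 6.6]]},
    "2021-03-20": {"predictors": [7.5, 4.0], "image": [[8.3, 8.6], [8.4, 8.8]]},
}

# Target day: reanalysis is available, the satellite image has gaps.
target_predictors = [7.0, 3.5]
gappy_image = [[8.1, None], [None, 8.5]]

analog_day = nearest_analog(target_predictors, library)
filled = fill_gaps(gappy_image, library[analog_day]["image"])
# analog_day == "2021-03-20"; filled == [[8.1, 8.6], [8.4, 8.5]]
```

In practice the distance would be computed on many co-registered variables (or on leading EOF coefficients), and several analogs might be combined rather than one.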


[1] Nielsen, M.; Astrup, P.; Hasager, C.B.; Barthelmie, R.; Pryor, S. Satellite Information for Wind Energy Applications. 2004, 1479.
[2] Medina-Lopez, E.; McMillan, D.; Lazic, J.; Hart, E.; Zen, S.; Angeloudis, A.; Bannon, E.; Browell, J.; Dorling, S.; Dorrell, R.M.; et al. Satellite Data for the Offshore Renewable Energy Sector: Synergies and Innovation Opportunities. Remote Sens. Environ. 2021, 264, 112588, doi:10.1016/j.rse.2021.112588.
[3] Edwards, M.R.; Holloway, T.; Pierce, R.B.; Blank, L.; Broddle, M.; Choi, E.; Duncan, B.N.; Esparza, Á.; Falchetta, G.; Fritz, M.; et al. Satellite Data Applications for Sustainable Energy Transitions. Front. Sustain. 2022, 3, 64, doi:10.3389/frsus.2022.910924.
[4] Dutton, J. What Is Analog Forecasting? - World Climate Service. Available online: https://www.worldclimateservice.com/2021/09/02/what-is-analog-forecasting/ (accessed on 9 January 2023).
[5] Bettolli, M.L. Analog Models for Empirical-Statistical Downscaling. Oxford Res. Encycl. Clim. Sci. 2021, doi:10.1093/acrefore/9780190228620.013.738.
[6] Zorita, E.; Storch, H. von. The Analog Method as a Simple Statistical Downscaling Technique: Comparison with More Complicated Methods. Journal of Climate, Volume 12, Issue 8 (1999). Available online: https://journals.ametsoc.org/view/journals/clim/12/8/1520-0442_1999_012_2474_tamaas_2.0.co_2.xml (accessed on 9 January 2023).
[7] Hoeltgebaum, L.E.B.; Dias, N.L.; Costa, M.A. An Analog Period Method for Gap-Filling of Latent Heat Flux Measurements. Hydrol. Process. 2021, 35, doi:10.1002/hyp.14105.
[8] Henn, B.; Raleigh, M.S.; Fisher, A.; Lundquist, J.D. A Comparison of Methods for Filling Gaps in Hourly Near-Surface Air Temperature Data. J. Hydrometeorol. 2013, 14, 929–945, doi:10.1175/JHM-D-12-027.1.

How to cite: Hadjipetrou, S. and Kyriakidis, P.: A data-driven framework for the reconstruction of satellite-derived wind speed image time-series, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5541, https://doi.org/10.5194/egusphere-egu23-5541, 2023.

Part II: Clustering and Classification
On-site presentation
Claudia Teutschbein, Andrijana Todorovic, and Thomas Grabs

The increasing availability of large data sets has fuelled the application of clustering approaches for discovering and interpreting spatio-temporal patterns in hydroclimatic data. Clustering can be particularly powerful for grouping catchments that span across various climate zones or hydrologic regimes into homogeneous clusters of similar hydrological or climatic behavior.

Here, we provide a practical example of how clustering can facilitate comprehensive analyses of streamflow drought characteristics across 50 Swedish catchments spanning three climate zones and ranging from snow-melt driven streamflow regimes in the North to rainfall-driven regimes in the South. To this end, the k-means clustering was applied to generate homogeneous clusters of catchments based on their similarity in streamflow anomalies (detected by using the standardized streamflow index) over the past 60 years. Five geographically distinct regions emerged from the clustering, linking the streamflow anomalies to the hydroclimatic conditions (following the north-south and elevation gradients), and to landscape characteristics, which strongly affect streamflow-generating processes at the catchment scale. Each cluster also featured – in line with its geographical location – a distinct hydrological regime.
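For readers unfamiliar with k-means, a minimal sketch follows. It is not the study's code: each catchment is reduced to two hypothetical features (mean standardized streamflow anomaly in winter and in summer), the six catchments and their values are invented, and the initial centroids are simply the first and last catchment so that the run is deterministic.

```python
import math


def kmeans(points, centroids, iterations=50):
    """Plain k-means (Lloyd's algorithm) with Euclidean distance."""
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    labels = [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]
    return labels, centroids


# Hypothetical catchments: [winter anomaly, summer anomaly].
catchments = {
    "N1": [0.8, -0.2], "N2": [0.9, -0.1], "N3": [0.7, -0.3],
    "S1": [-0.6, 0.5], "S2": [-0.7, 0.4], "S3": [-0.5, 0.6],
}
points = list(catchments.values())
labels, centroids = kmeans(points, [points[0], points[-1]])
# labels == [0, 0, 0, 1, 1, 1]
```

With real data the number of clusters and the initialisation would need to be chosen with care, for example by repeating the run from several random starts.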

Facilitated by the clustering, a clear north-south gradient emerged for many of the analysed drought statistics, including, e.g., drought duration, annual number of drought days and number of drought days in spring and summer, as well as standardized deficit volumes. Similarly, trends and changes in streamflow anomalies over the past 60 years also varied across clusters, with clusters in northern Sweden exhibiting wetting trends and clusters in southern Sweden drying trends.

This case study serves as an illustration of how clustering can be a valuable tool for improving our understanding and potential prediction of hydrological processes. Clustering enabled us to identify drought-prone areas and illuminated various drought behaviors, prevailing drought typologies, and seasonal differences that can be linked to the underlying streamflow regimes. 

How to cite: Teutschbein, C., Todorovic, A., and Grabs, T.: Clustering as a tool for identifying drought-prone regions: A Swedish example, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15612, https://doi.org/10.5194/egusphere-egu23-15612, 2023.

On-site presentation
Christian Narvaez-Montoya, Jürgen Mahlknecht, Juan Antonio Torres-Martínez, Abraham Mora, and Guillaume Bertrand

In coastal zones, groundwater overexploitation reduces freshwater outflow to the sea and causes seawater to migrate toward fresh groundwater resources, increasing salinity in groundwater reservoirs. This seawater intrusion is among the world's leading causes of groundwater pollution, as salty water can affect safe drinking water consumption, food production, and ecosystem services. To explore this and other contamination sources, cluster analysis has been used for decades to aid in water resource pattern recognition in coastal aquifers around the world.

This work shows how cluster analysis has been applied for seawater intrusion pattern recognition in coastal zones around the world between 2000 and 2022 through a systematic review based on the PRISMA statement. After the search and selection stages, a bibliometric analysis of the 81 identified studies was carried out. Furthermore, the review discusses the number of samples, number of variables, redundant variables, sample density per area, sample density per variable, principal clustering features, limitations for source differentiation, assembly between methods, software, and pre-processing strategies.

The identified methods were hierarchical clustering analysis (HCA), K-means clustering, Fuzzy C-means, and self-organizing maps (SOM). While 56 studies applied Q-mode for grouping water samples with similar characteristics, 17 applied R-mode for grouping variables, and 8 applied both modes. The preferred method was HCA with Ward's linkage and Euclidean distance, but many studies did not specify the linkage or the distance criteria. Of those studies that applied Q-mode, 77% associated at least one cluster with the influence of seawater intrusion. On the other hand, this work shows that 58% of the reviewed studies did not report raw data, which presents issues for validation, replication, and socialization of the results.
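Because many reviewed studies left the linkage and distance unspecified, it may help to spell out what "HCA with Ward's linkage and Euclidean distance" means in Q-mode. The sketch below is a naive, dependency-free illustration (cubic cost, suitable only for small examples) and is not taken from any reviewed study. The five water samples and their two hypothetical variables (log10 chloride and log10 electrical conductivity) are invented, and in practice variables would be standardized first.

```python
def ward_clusters(samples, n_clusters):
    """Agglomerative clustering that always merges the pair of clusters with the
    smallest Ward cost, i.e. the smallest increase in within-cluster sum of squares:
    n_a * n_b / (n_a + n_b) * squared Euclidean distance between centroids."""
    clusters = [([i], list(s)) for i, s in enumerate(samples)]  # (members, centroid)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (members_a, cen_a), (members_b, cen_b) = clusters[a], clusters[b]
                sq_dist = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
                n_a, n_b = len(members_a), len(members_b)
                cost = n_a * n_b / (n_a + n_b) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (members_a, cen_a), (members_b, cen_b) = clusters[a], clusters[b]
        n_a, n_b = len(members_a), len(members_b)
        merged = [(n_a * x + n_b * y) / (n_a + n_b) for x, y in zip(cen_a, cen_b)]
        clusters[a] = (members_a + members_b, merged)
        del clusters[b]
    return [sorted(members) for members, _ in clusters]


# Hypothetical samples: [log10 Cl (mg/L), log10 EC (µS/cm)].
samples = [[1.2, 2.6], [1.3, 2.7], [1.1, 2.5], [3.9, 4.4], [4.1, 4.5]]
groups = ward_clusters(samples, n_clusters=2)
# groups == [[0, 1, 2], [3, 4]]
```

Reporting the linkage rule, the distance measure and any standardization alongside results of this kind is what allows a cluster analysis to be replicated.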

How to cite: Narvaez-Montoya, C., Mahlknecht, J., Torres-Martínez, J. A., Mora, A., and Bertrand, G.: Coastal groundwater pattern recognition supported by cluster analysis, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2876, https://doi.org/10.5194/egusphere-egu23-2876, 2023.

On-site presentation
Usman T Khan and Everett Snieder

Improvements in large dataset availability and computing power have led to an increase in large-sample hydrological (LSH) studies. While these studies bring a breadth of new knowledge, they also introduce new challenges. One such challenge is the optimisation of model hyperparameters, which can be prohibitively computationally expensive on a large scale. Machine learning (ML)-based flow forecasting models have been steadily rising in popularity due to their high accuracy and ease of development. While traditional physics-based models have hyperparameters rooted in hydrological concepts (e.g., the number of hydraulic response units is determined based on spatial heterogeneity), ML-based models do not typically use a physical basis for selecting hyperparameters (e.g., neural network topology). Instead, ML model hyperparameters are typically determined using heuristic or exhaustive search methods. Clustering has been previously applied to watersheds for identifying homogeneous regions for flood frequency analyses. In these cases, unsupervised clustering, based on static watershed characteristics and flow statistics, is used to identify homogeneous regions on which to conduct frequency analyses. We propose an application of clustering to optimise ML model hyperparameters on a large scale. The objective of this study is to determine whether grid-search optimisations are transferable to similar catchments, identified through unsupervised clustering. Our study is conducted using a subset of Canadian catchments (n>500) from the HYSETS database. For each catchment, an LSTM is trained to forecast flow at a daily resolution using hydrometeorological input features (flow, precipitation, temperature, SWE). Grid-search hyperparameter optimisation is conducted on model architecture (number of hidden states and layers), learning rate, dropout rate, and input sequence length.
We evaluate the effectiveness of cluster-based hyperparameter optimisation based on a comparison against a non-optimised baseline, for an increasing number of clusters. The impacts of this work have the potential to improve the effectiveness of ML-based flow forecasting models in cases where exhaustive hyperparameter searches are not possible. The results will also allow us to make recommendations for typical hyperparameter values based on watershed characteristics.
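The bookkeeping behind cluster-based hyperparameter transfer can be sketched as follows. This is a schematic illustration only: `toy_score` is a hypothetical placeholder for training an LSTM and returning a validation skill, the grid is much smaller than the one described above, and the cluster memberships and the `ideal_hidden` attribute are invented. The sketch shows that the grid search is run once per cluster on a representative catchment and its best configuration is reused for the other members.

```python
from itertools import product

# Hypothetical, reduced search grid.
GRID = {"hidden_size": [32, 64, 128], "dropout": [0.0, 0.2]}


def grid_search(score_fn):
    """Evaluate every grid configuration; return the best one and the number of runs."""
    configs = [dict(zip(GRID, values)) for values in product(*GRID.values())]
    return max(configs, key=score_fn), len(configs)


def toy_score(catchment, config):
    """Placeholder for 'train a model and return validation skill' (higher is better)."""
    return -abs(config["hidden_size"] - catchment["ideal_hidden"]) - config["dropout"]


# Hypothetical clusters of similar catchments.
clusters = {
    "snowmelt": [{"name": "A", "ideal_hidden": 120}, {"name": "B", "ideal_hidden": 110}],
    "rainfall": [{"name": "C", "ideal_hidden": 40}, {"name": "D", "ideal_hidden": 30}],
}

assigned = {}
trainings = 0
for members in clusters.values():
    representative = members[0]
    best, n_runs = grid_search(lambda cfg: toy_score(representative, cfg))
    trainings += n_runs
    for catchment in members:  # transfer the best configuration within the cluster
        assigned[catchment["name"]] = best

n_catchments = sum(len(m) for m in clusters.values())
full_search_trainings = n_catchments * n_runs
# trainings == 12, full_search_trainings == 24
```

Whether the transferred configuration performs close to a per-catchment optimum is exactly the empirical question the study addresses; the sketch only makes the saving in training runs explicit.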

How to cite: Khan, U. T. and Snieder, E.: Cluster-based hyperparameter optimisation for LSTM-based flow forecasting in Canadian catchments, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9114, https://doi.org/10.5194/egusphere-egu23-9114, 2023.

On-site presentation
Sergio Lopez Dubon, Alessandro Sgarabotto, and Stefano Lanzoni

Meandering planforms are commonly observed in fluvial systems. A meander consists of a series of two alternate bends connected at the points of inflexion by relatively short, almost straight crossings. The presence of single-thread meandering rivers exhibiting a continuous sequence of such curves is widespread in alluvial floodplains. The study of river meanders has thus fascinated the scientific community, which, for a long time, has tried not only to classify them but also to quantify the complexity of meandering planforms and model their morphodynamic evolution.

The idea of classifying meandering rivers has a long history. It has produced a series of non-dimensional parameters to identify a meander (i.e., half-meander amplitude, asymmetry index, half-meander sinuosity). Nevertheless, two main problems arise from the existing methodologies: they are either too complicated, in an attempt to encompass as many shapes as possible, or they lack physical insight into hydraulic and sedimentological parameters.

We propose a data-driven approach to address this classification issue, mixing physics-based information and machine-learning algorithms. In our approach, we consider the spatial distribution of meander curvatures and analyse it using different continuous wavelet transforms, obtaining the energy spectrum for each meander. This physics-based information is first processed as an unsupervised visual classification problem using a neural-network autoencoder combined with clustering algorithms. The output of this first-step analysis consists of two pre-trained algorithms that can classify the energy spectrum of pictures of planform curvatures and, therefore, the meander planform shape.

The algorithms will be trained with a series of dimensionless, synthetically generated meanders and subsequently tested with both natural and simulated meanders. The final aim is to identify automatically which type of meanders characterise a given river reach at a certain time. This methodology also has the potential to be extended to spatiotemporal distributions of channel-axis curvature, thus unravelling key aspects of meandering dynamics, as well as identifying similarities between reaches of different rivers or between observed and synthetically generated river planforms.

How to cite: Lopez Dubon, S., Sgarabotto, A., and Lanzoni, S.: A data-driven classification of meander bends based on their energy spectrum., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1049, https://doi.org/10.5194/egusphere-egu23-1049, 2023.

On-site presentation
Elizabeth Cooper, Rich Ellis, Eleanor Blyth, and Simon Dadson

Land surface models such as JULES (Joint UK Land Environment Simulator) are usually run on a rectilinear grid, yielding gridded outputs for variables such as soil moisture and evapotranspiration. JULES also models surface and subsurface water fluxes, and these can be used as inputs to a river routing model to predict river flows. Here we investigate the effect of clustering groups of grid cells into ‘Land Response Units’ (LRUs) in JULES, using a hierarchical multivariate clustering technique to group underlying grid cells together based on characteristics including soil type, elevation and land cover. Using LRUs rather than grid cells has the potential to reduce computational expense as well as providing an alternative to tiling approaches for capturing sub-grid heterogeneity. Here, LRUs are used exclusively in the land surface part of modelling, i.e., separate from river routing.

We investigate the effect of the LRU approach on JULES soil moisture in part of the Thames catchment in the UK, and compare LRU and gridded soil moisture predictions with measurements from the UKCEH COSMOS-UK soil moisture observation network. We find that use of LRUs leads to good soil moisture prediction while reducing computational expense compared to a gridded approach, but that this is strongly dependent on the characteristics used to create the LRUs. We also consider how the LRU approach impacts predicted river flows, and compare routed JULES outputs with observed river flow from a number of NRFA gauges in the catchment. We show that less computationally expensive LRU JULES outputs give similar river flow results to standard 1 km gridded JULES outputs when routed at 1 km resolution, and that the LRU approach can outperform gridded river flow predictions when routed at higher resolution.

How to cite: Cooper, E., Ellis, R., Blyth, E., and Dadson, S.: Clustering grid cells in a land surface model, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13537, https://doi.org/10.5194/egusphere-egu23-13537, 2023.

Posters on site: Wed, 26 Apr, 10:45–12:30 | Hall A

Chairpersons: Svenja Fischer, Nilay Dogulu
Anne-Karin Cooke, Sandra Willkommen, and Stefan Broda

The world largely relies on groundwater extraction for drinking water supply, which is also the case in Germany. In the EU, the Water Framework Directive regulates the standards for a good chemical state of water bodies. Thresholds are often exceeded due to fertilizers and pesticides. Methods to assess groundwater vulnerability to contamination by chemical compounds are mainly index-based GIS-overlay tools. Other approaches are process-based leaching models and statistical approaches. Commonly used index methods remain conceptual in nature and lack validation with monitoring data. Process-oriented approaches tend to focus on the soil layer. Statistical approaches remain underexplored. In the project FARM (Groundwater vulnerability assessment during authorisation procedure of pesticides), we aim to improve groundwater protection by developing a data-based vulnerability index that exploits the existing extensive databases and covers all relevant environmental conditions and agricultural inputs.

The federal groundwater quality monitoring infrastructure and the water suppliers delivered data from about 26,000 sites, which are sampled for about 500 different pesticides. For pesticide monitoring data in Germany, such an exhaustive national database is unprecedented. Given this vast data set, this project aims to apply a fully data-driven approach to identify previously unknown, relevant factors and their interactions that drive groundwater vulnerability to pesticides. We aim to investigate the complex interactions between subsurface (soil, hydrogeology) and surface (meteorology, land use, crop sequences, agricultural practices) site characteristics and the physical-chemical properties (mobility, persistence) of pesticides. The potential of this data set will be exploited by the development, testing and validation of a supervised machine-learning (ML) model. After an initial feature selection procedure, a Bayesian convolutional neural net will be trained on groundwater quality data and the extensive catalogue of features mentioned above. This set-up takes into account the uncertainty introduced by the large percentage of left-censored data (concentrations below the limit of quantification of the analytical method). High interpretability of the ML model is essential: identified factors need to be comprehensible and actionable for decision-makers. We are dealing with a highly heterogeneous, asymmetric monitoring data set, and strong biases in many variables are expected. This project thus pioneers the assessment of the potential and suitability, as well as the limitations and pitfalls, of training neural nets on the status quo of groundwater quality monitoring in Germany. A second major outcome of the project is a set of specific recommendations on adjustments of the national monitoring (spectra of sampled substances, sampling frequency and timings, addition or reduction of monitoring wells in specific areas).

How to cite: Cooke, A.-K., Willkommen, S., and Broda, S.: First steps towards a data-driven groundwater vulnerability index for pesticides in Germany using probabilistic neural networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5569, https://doi.org/10.5194/egusphere-egu23-5569, 2023.

Thiago Victor Medeiros do Nascimento, Maria Teresa Condesso de Melo, and Rodrigo Proença de Oliveira

A new clustering strategy was developed and tested using Self-Organizing Maps (SOM), an unsupervised Artificial Neural Network (ANN) type, for identifying zones with similar contamination characteristics within an aquifer. The Gabros de Beja aquifer system (GBAS), located in the Alentejo region, Portugal, was selected as a case study due to its vulnerability to diffuse pollution from intensive agriculture. The proposed methodology consists of: (a) selection of the most representative groundwater contaminants in the aquifer area (i.e., nitrates, sulfates and chlorides); (b) determination of the Natural Background Level (NBL) of the selected groundwater compounds; (c) computation of the ratio between the median concentrations of the groundwater compounds being analyzed and their respective NBL concentration; and finally, (d) application of the SOM clustering technique to group homogenous contaminated areas within the aquifer. The NBL illustrates what thresholds are likely signs of anthropogenic effect by indicating how high or low a parameter's value would be expected under natural geogenic conditions and therefore was used as a first normalization of the dataset. For this methodology, the NBL was computed as the 90th percentile concentration of the selected compounds in piezometers within the study area that presented a median nitrate concentration smaller than 10 mg/L. Nitrate, sulfate and chloride concentration medians from 45 piezometers were used. The results show that the SOM network classified the piezometers into six classes (CL1 to CL6). The least contaminated clusters were CL1 (8) and CL4 (17), with all three compounds presenting median concentrations around 50 mg/L, which for nitrate is the threshold for drinking water limits. CL5 (5) reached median nitrate concentrations above 100 mg/L, while chlorides and sulfates remained below 50 mg/L. 
CL2 (6) showed an increase in chloride concentration to 100 mg/L, with the other two compounds' concentrations below 65 mg/L. CL3 (3) presented the highest salinization levels reaching chloride concentrations above 180 mg/L, with sulfates around 80 mg/L and nitrates around 50 mg/L. Finally, CL6 (6) presented median levels of the three compounds above 80 mg/L. The most contaminated groups (CL3, CL5 and CL6) were present in sedimentary and weathered metamorphic lithologies, which present high hydraulic conductivities, coinciding either with urban or agricultural areas associated with large-scale irrigation schemes, reinforcing the anthropogenic source of the contaminants. Hence, this study presented a clustering framework that, by reducing the dimensionality of the original dataset, helps to establish a priority list of polluted areas with different degrees of contamination, which is indeed essential for implementing monitoring and management measures for attenuating groundwater pollution.  
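
The normalization steps (b) and (c) described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the data layout, the function names and the sample values are hypothetical, and pooling all samples from the background piezometers before taking the 90th percentile is an assumption, since the abstract does not state how the percentile is aggregated.

```python
from statistics import median


def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation between ranks."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def natural_background_level(samples, compound, nitrate_cutoff=10.0, q=90.0):
    """NBL of `compound` from piezometers whose median nitrate is below the cutoff.

    Assumption: samples of all qualifying piezometers are pooled before the
    percentile is taken.
    """
    pooled = []
    for series in samples.values():
        if median(series["nitrate"]) < nitrate_cutoff:
            pooled.extend(series[compound])
    if not pooled:
        raise ValueError("no piezometer qualifies as background")
    return percentile(pooled, q)


def nbl_ratios(samples, compounds=("nitrate", "sulfate", "chloride")):
    """Median concentration divided by the NBL, per piezometer and compound."""
    nbl = {c: natural_background_level(samples, c) for c in compounds}
    return {
        pid: {c: median(series[c]) / nbl[c] for c in compounds}
        for pid, series in samples.items()
    }


# Hypothetical concentrations in mg/L (illustration only, not the study data).
samples = {
    "P1": {"nitrate": [4.0, 6.0, 8.0], "sulfate": [20.0, 30.0, 40.0],
           "chloride": [10.0, 20.0, 30.0]},
    "P2": {"nitrate": [60.0, 80.0, 100.0], "sulfate": [50.0, 60.0, 70.0],
           "chloride": [90.0, 100.0, 110.0]},
}
ratios = nbl_ratios(samples)
```

The resulting ratio vectors (one per piezometer) are what a clustering step such as a SOM would then receive as input; a ratio well above 1 indicates a concentration above the natural background.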

How to cite: Medeiros do Nascimento, T. V., Condesso de Melo, M. T., and Proença de Oliveira, R.: Framework for clustering groundwater quality using Self-Organizing Maps to improve aquifer monitoring and management: a case study of the Gabros de Beja aquifer system, Portugal, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-564, https://doi.org/10.5194/egusphere-egu23-564, 2023.

Emmanouil Varouchakis, Evgenia Diamantopoulou, and Andreas Pavlides

Geostatistical methods are increasingly used in earth sciences and engineering to improve space and time predictions. During mining activities, it is essential to monitor contaminant concentrations in soil and groundwater and to estimate their spatial distribution in the area, in order to guide environmental monitoring and reclamation once mining operations have finished. In this work, we present the geostatistical analysis of the groundwater content of certain pollutants (Cd and Mn) in a group of adjacent mines. Sixty-two monitoring locations were available. The challenge in this work is that the monitoring stations are clustered within the borders of the adjacent mines. This work aims to map the spatial distribution of Cd and Mn concentrations in groundwater over the entire mining area. The correlation between Cd and Mn was investigated during the preliminary analysis of the data and found to be significant. The logarithm of the data values was used and, after removing a linear trend, the variogram parameters were estimated by fitting a spherical model. To create the required contaminant concentration maps, we employed the Ordinary Kriging (OK) method and back-transformed the estimates. Cross-validation shows promising results (ρ = 92% for Cd and ρ = 88% for Mn, RMSE = 5.1 ppm for Cd and RMSE = 18.2 ppm for Mn), while the uncertainty was found to lie within acceptable bounds.
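
For readers unfamiliar with the spherical model mentioned above, the following sketch evaluates its standard textbook form. The parameter names and values are illustrative assumptions; the abstract does not report the fitted nugget, sill or range.

```python
def spherical_variogram(h, nugget, partial_sill, correlation_range):
    """Spherical variogram model evaluated at lag distance h.

    gamma(0) = 0; for 0 < h < range the curve rises as 1.5*r - 0.5*r**3
    with r = h / range; beyond the range it stays at nugget + partial_sill.
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    if h >= correlation_range:
        return nugget + partial_sill
    r = h / correlation_range
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


# Hypothetical parameters for log-transformed, detrended concentrations.
lags = [0.0, 250.0, 500.0, 1000.0]
gamma = [spherical_variogram(h, nugget=0.1, partial_sill=1.0,
                             correlation_range=500.0) for h in lags]
```

In ordinary kriging, these model values populate the system of equations that yields the interpolation weights and the kriging variance at each map location.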

How to cite: Varouchakis, E., Diamantopoulou, E., and Pavlides, A.: Geostatistical analysis of groundwater data in a mining area, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11472, https://doi.org/10.5194/egusphere-egu23-11472, 2023.

Aidyn Tileugabylov and Nasser Madani

The East Kazakhstan region is one of the most industrialised regions in the country, producing considerable amounts of copper, zinc, and gold through mining activities. Development of the metallurgical and mining industries has been increasing the pollution of surface waters by toxic chemical elements, particularly Cu, Mn, and Zn. We assessed the extent of these metal contaminations in surface waters in this region by multivariate geostatistical analyses of the concentrations of the above-mentioned heavy metals over a five-year period (2017-2022). The dataset consists of element concentrations and sampling locations and was provided by the Republican State Enterprise “Kazhydromet”. Principal Component Analysis (PCA) coupled with Simple Cokriging has been used to characterise the local distribution of the heavy metals in surface waters in the East Kazakhstan region. The first component of the PCA is used for further analysis since it accounts for 74% of the total variation in the data. Then, Simple Cokriging over the four continuous variables (PC1, Cu, Mn, and Zn) has been carried out to map their spatial distribution. Furthermore, the estimation map of PC1 is categorised and linked to the estimated maps of Cu, Mn, and Zn, so that areas of high, medium and low concentration of the above-mentioned heavy metals are recognised over the entire region. The results are then interpreted by superimposing the river network on the estimation maps of the elements.
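
The share of total variation carried by the first principal component can be computed as the leading eigenvalue of the correlation matrix divided by its trace. The sketch below does this with plain power iteration; it is an illustration under assumptions (PCA on standardized variables, hypothetical concentrations, hypothetical function names) and not the authors' workflow.

```python
import math
from statistics import mean, stdev


def correlation_matrix(columns):
    """Correlation matrix of equally long data columns (one list per variable)."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), stdev(col)
        z.append([(v - m) / s for v in col])
    p = len(z)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector by power iteration.

    Assumes a symmetric positive semi-definite matrix whose leading
    eigenvector is not orthogonal to the uniform start vector.
    """
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, mv)), v


def explained_share(matrix):
    """Fraction of total variance explained by the first principal component."""
    eigenvalue, _ = leading_eigenpair(matrix)
    return eigenvalue / sum(matrix[i][i] for i in range(len(matrix)))


# Hypothetical concentrations (illustration only, not the Kazhydromet data).
cu = [0.8, 1.2, 2.5, 3.1, 4.0]
mn = [1.0, 1.1, 2.0, 3.5, 3.9]
zn = [5.0, 4.0, 6.5, 7.0, 9.5]
share_pc1 = explained_share(correlation_matrix([cu, mn, zn]))
```

The eigenvector returned alongside the eigenvalue holds the loadings; projecting each standardized sample onto it gives the PC1 score that would be mapped together with Cu, Mn and Zn.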

According to the estimation maps, concentrations of Cu, Mn, and Zn vary with the season and show a distinct pattern. The surface waters in the region are most contaminated in spring and winter due to snowfall and subsequent melting, whereas they are least contaminated during summer and autumn. Moreover, it was observed that Cu is the most mobile of the three toxic elements. A significant amount of Cu is periodically discharged to the surface waters in spring, when snowmelt is enhanced in Ust-Kamenogorsk city, and is transported to downstream regions. Therefore, higher concentrations of Cu near Semey city are observed during summer. The same effect has not been observed for Mn and Zn, which indicates that their overall mobility is lower.

Generally, observed Cu and Mn concentrations exceed the Maximum Allowable Concentration (MAC) by a factor of 5 in the vicinity of Ust-Kamenogorsk and Ridder cities, while Zn exceeds its MAC by a factor of 10 in the same region. One possible source of such high concentrations of heavy metals in this region is linked to the mining operations, especially to the Tailings Storage Facilities (TSF) that have been driving surface water contamination since the 1960s. After every relevant TSF had been superimposed on the estimated maps, the relationship between high metal concentrations and TSFs was investigated, leading to the conclusion that both active and closed Tailings Storage Facilities are the primary sources of surface water contamination in the study region. This shows that the state of the study region in terms of surface water pollution is deplorable and calls for urgent remediation.

Keywords: Multivariate Geostatistics, Water Contamination, Tailings Storage Facilities.

How to cite: Tileugabylov, A. and Madani, N.: Evaluation of Water Contamination in the East Kazakhstan Mining Area Using Multivariate Geostatistics, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-1737, https://doi.org/10.5194/egusphere-egu23-1737, 2023.

Edwige Vannier and Richard Dusséaux

Determining soil spatial variability is a key issue in soil science, both for soil preparation in precision agriculture and because of its influence on overland flow and erosion. Soil Surface Roughness (SSR) represents the undulation of the surface at small scale, due to the presence of small elevations and depressions. It results from tillage operations and changes over time due to weathering. SSR can be related to clod-size distribution. Much research has therefore been conducted on monitoring the size and number of clods using photogrammetric methods. Nowadays, it is possible to acquire high-resolution Digital Elevation Models (DEMs). This study seeks to model the evolution of clod size under rainfall impact with modelling and data processing tools.

A seedbed-like soil surface was made in the laboratory by filling a tray with loose silt loam soil and placing pre-sieved clods on top. It was eroded by controlled laboratory rainfalls. A millimeter-resolution DEM of the surface was recorded at each stage with a laser scanner. Wavelet-based clod segmentation led to measurement of the volume of individual clods. Clod subsets were formed according to clod size. The normalized mean volume decrease was modelled by an exponential function.

Larger clods showed swelling (volume increase) and erosion (volume decrease) with cumulated rainfall. Both phenomena are size dependent. The amplitude and slope parameters of the exponential decrease of clod volume could be modelled as a function of the mean volume of the clod subset at the initial stage. Results obtained with this surface strengthen those previously obtained with less data and basic segmentation. A power law is confirmed for the amplitude parameter and a sigmoid function is highlighted for the slope parameter.

Modelling and data processing tools are efficient for differentiating and estimating the dynamics of clods depending on their size. Usually, clod size distribution is addressed with statistics of clod diameters obtained by real or numerical sieving. Working on 2.5D DEMs also gives access to the vertical dimension of clods, which is included in their volume. This technique complements the usual roughness description and is promising for use in precision agriculture or numerical surface generation.

How to cite: Vannier, E. and Dusséaux, R.: Modeling clod evolution under rainfall according to clod size., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2761, https://doi.org/10.5194/egusphere-egu23-2761, 2023.

Daniel Camilo Roman Quintero, Pasquale Marino, Giovanni Francesco Santonastaso, and Roberto Greco

The assessment of the response of slopes to precipitation is important for various applications, from water resources management to hazard assessment for extreme rainfall events. It is well known that the underground conditions prior to the initiation of rainfall events control the hydrological processes that occur in slopes, affecting the water exchange through their boundaries. The present study aims at identifying hydrological variables to be monitored and modelled that are suitable for improving the prediction of slope response to precipitation, for the case of a slope covered with loose pyroclastic coarse-grained soil overlying a karstic bedrock, typical of the southern Apennines (Italy). Field monitoring has been carried out for three years at the slope, including stream level recordings, meteorological recordings, and soil water content and suction measurements, which allowed setting up a physically based hydrological model of the slope, coupling the unsaturated flow in the soil cover with a perched aquifer developing in the fractured bedrock. To enlarge the field dataset, a synthetic dataset has been generated by linking a previously calibrated stochastic rainfall generator to the hydrological model. In this way, a synthetic dataset of 1000 years has been obtained, containing information on rainfall, aquifer water level and soil volumetric water content at different depths. Machine Learning techniques have been used to unravel the relationships linking the studied variables, which are typically non-linear. The Random Forest technique has been used to assess the importance of each variable for the slope response, and the k-means clustering technique has been used to explore the geometrical disposition of the data, so as to identify seasonally recurrent conditions controlling the slope response.
The results indicate that the slope response, in terms of the fraction of rainwater remaining stored in the soil cover at the end of each rainfall event, can be predicted from the underground conditions prior to rainfall initiation by weighing the role, on the one hand, of the soil moisture excess above field capacity, which controls the ease with which water flows in and out of the soil cover and, on the other hand, of the perched aquifer water level, which gives evidence of the activation of effective slope drainage.

How to cite: Roman Quintero, D. C., Marino, P., Santonastaso, G. F., and Greco, R.: Clustering and Random Forest Analysis for the Identification of Hydrological Controls of Slope Response to Rainfall, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11154, https://doi.org/10.5194/egusphere-egu23-11154, 2023.

Nima Shokri, Mehdi Mahdavi Ara, Sobhan Ansari, and Mohammad Sharifi

Land subsidence, the lowering of Earth’s land surface, poses destructive threats to buildings and infrastructure and increases vulnerability to floods. The tendency to overexploit groundwater resources due to ever-increasing demand for water in urban areas is known as one of the main drivers of land subsidence, especially in regions with compressible sediments or formations susceptible to changes in groundwater pressure. Land subsidence has been observed in many countries around the globe, including but not limited to the USA, Mexico, Spain, Italy, Saudi Arabia, Iran, India, Vietnam, and China.

Artificial intelligence (AI) and machine learning algorithms have proved to be of great value for assessing and predicting a variety of hydrological and environmental dynamics and trends (Hassani et al., 2021; Mahdaviara et al., 2022). Leveraging this opportunity, we develop a new framework, assisted by AI approaches, to quantify and predict how various environmental and climatic parameters influence the occurrence and extent of land subsidence. We show the general applicability of the proposed framework through the case of land subsidence observed in Iran, a semi-arid to arid country which strongly relies on limited groundwater resources for a wide range of activities. The country hosts some of the fastest-sinking cities in the world. As a case study, we focused on the land subsidence observed in the Varamin plain, located in central Iran, with an average annual precipitation of 420 millimeters and subsidence of 210 millimeters per year over the last 20 years. A combination of field and satellite data over the last two decades was prepared for the training of the models. In the next step, the training matrix was passed to the AI algorithms in order to develop models relating the land subsidence rate to a variety of environmental and climatic factors. Our preliminary analysis suggests that groundwater withdrawal and precipitation rate are among the most important parameters affecting the rate of subsidence. The modelling tools will be used to detect potential hotspots of land subsidence under different water management and climate change scenarios in other places. This will be helpful for preventing future damage and devising the necessary action plans to mitigate the situation under different conditions.



Hassani, A., Azapagic, A., Shokri, N. (2021). Global Predictions of Primary Soil Salinization Under Changing Climate in the 21st Century, Nat. Commun., 12, 6663. doi.org/10.1038/s41467-021-26907-3.

Mahdaviara, M., Sharifi, M., Bakhshian, B., Shokri, N. (2022). Prediction of Spontaneous Imbibition in Porous Media Using Deep and Ensemble Learning Techniques, Fuel, 329, 125349.

How to cite: Shokri, N., Mahdavi Ara, M., Ansari, S., and Sharifi, M.: Toward prediction of land subsidence assisted by artificial intelligence approaches, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-5025, https://doi.org/10.5194/egusphere-egu23-5025, 2023.

Mirko Mälicke, Alberto Guadagnini, and Erwin Zehe

We provide an extension of a well-established geostatistical software package to allow for effective and interactive assessment of environmental scenarios in a geostatistical context. The extension comprises a pre-built interface and a freely accessible demo application.

The heart of the approach relies on replacing a sample variogram with its uncertainty bound. Doing so enables one to fully and consistently embed various sources of uncertainty stemming from the available datasets and the methodological approaches employed for their interpretation. The software provides: i) a statistical estimation of uncertainty bounds from residual point-pair distributions; ii) a statistical robustness test for uncertainty bounds; and iii) a Monte Carlo simulation tool to propagate a variety of aleatory uncertainties.
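
To make the Monte Carlo idea in iii) concrete, the following sketch perturbs the observations with Gaussian noise and records, per lag class, an empirical interval of the resulting sample variogram values. It is a generic stdlib-only illustration and does not use or reproduce the SciKit-GStat API; the function names, the noise model and the percentile levels are all assumptions.

```python
import math
import random


def empirical_variogram(coords, values, bin_edges):
    """Matheron estimator: half the mean squared difference per lag class."""
    n_bins = len(bin_edges) - 1
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            h = math.dist(coords[i], coords[j])
            for b in range(n_bins):
                if bin_edges[b] < h <= bin_edges[b + 1]:
                    sums[b] += (values[i] - values[j]) ** 2
                    counts[b] += 1
                    break
    return [s / (2 * c) if c else float("nan") for s, c in zip(sums, counts)]


def monte_carlo_bounds(coords, values, bin_edges, obs_std,
                       n_sim=200, lower=0.05, upper=0.95, seed=0):
    """Per-lag (lower, upper) bounds from noisy replicates of the observations.

    Assumes independent Gaussian observation errors with standard deviation
    obs_std and that every lag class contains at least one point pair.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        noisy = [v + rng.gauss(0.0, obs_std) for v in values]
        sims.append(empirical_variogram(coords, noisy, bin_edges))
    bounds = []
    for b in range(len(bin_edges) - 1):
        column = sorted(sim[b] for sim in sims)
        bounds.append((column[int(lower * (n_sim - 1))],
                       column[int(upper * (n_sim - 1))]))
    return bounds


# Hypothetical transect: four points, two lag classes.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
bin_edges = [0.0, 1.5, 3.5]
bounds = monte_carlo_bounds(coords, values, bin_edges, obs_std=0.5)
```

A candidate variogram model whose curve leaves these intervals at one or more lag classes could then be rejected, which is the kind of decision the uncertainty bound is meant to support.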

We illustrate the capabilities of our approach and software through the analysis of two different datasets. We focus on manual variogram estimation to comprehensively illustrate how insights on uncertainty can be used to reject candidate variogram models or model parameter sets.

How to cite: Mälicke, M., Guadagnini, A., and Zehe, E.: SciKit-GStat Uncertainty: A software extension to cope with uncertain geostatistical estimates, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-6683, https://doi.org/10.5194/egusphere-egu23-6683, 2023.

Martijn van Leer, Willem Jan Zaadnoordijk, Alraune Zech, Jasper Griffioen, and Marc Bierkens

Aquitards are common hydrogeological features in the subsurface and their properties are important for, e.g., water resource management, subsidence, contaminant transport and aquifer thermal energy storage. Typically, pumping tests are used to parameterize the hydraulic conductivity of aquitards. However, with analytical interpretation of pumping tests it is difficult to take spatial variability and uncertainty into account. Alternatively, core-scale measurements of hydraulic conductivity are used in geostatistical upscaling methods, for which their correlation lengths are needed. However, this information is extremely difficult to obtain. In this study we investigate whether a pumping test can be used to obtain the correlation lengths needed for geostatistical upscaling and to account for the uncertainty about heterogeneous aquitard conductivity. We generated random realizations from core-scale data with varying correlation lengths and inserted these into a groundwater flow model which simulates the outcome of an actual pumping test. We selected the realizations which yielded a better fit to the pumping test data than the traditional pumping test result assuming homogeneous layers. Ranges of horizontal and vertical correlation lengths that fit the pumping test well are found. However, considerable uncertainty regarding the correlation lengths remains, which should be considered when parameterizing a regional groundwater flow model.

How to cite: van Leer, M., Zaadnoordijk, W. J., Zech, A., Griffioen, J., and Bierkens, M.: Estimating correlation lengths of aquitard hydraulic conductivity by inverse geostatistical modelling of a pumping test, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8113, https://doi.org/10.5194/egusphere-egu23-8113, 2023.

Posters virtual: Wed, 26 Apr, 10:45–12:30 | vHall HS

Chairpersons: Svenja Fischer, Jaime Gómez-Hernández
Sudan Pokharel, Tirthankar Roy, and David Admiraal

Accurate and timely prediction of peak flow in streams is essential for transportation safety, as these estimates can help transportation authorities implement precautionary measures (e.g., road closures, diversion, emergency routes, transportation planning, flood impact assessment, etc.) well ahead of time to mitigate the impacts of flooding on transportation. In practice, flow quantiles are often estimated from catchment and climate attributes using simple methods such as linear regression, which overlook the more complex nature of the relationships between variables, potentially leading to errors and uncertainties in the estimates that can trickle down to engineering design. Here, we will discuss findings from our ongoing work on accurate estimation of peak flow using machine learning algorithms. The methodology involves a two-step process. First, k-means clustering is implemented to identify regions that are similar in mean annual runoff. Second, Random Forest is implemented to map a wide range of climate and catchment features to flow quantiles in each cluster. To assess the effectiveness of this approach in increasing transportation resilience, we will show how the peak flow estimates from this new approach compare with the estimates from the existing approach followed by the Nebraska Department of Transportation and explore the potential of these new estimates to be used operationally for flood-related decision-making in the context of transportation infrastructure.
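
The first step of the two-step process can be illustrated with a one-dimensional k-means (Lloyd's algorithm) on mean annual runoff. The sketch below is a stdlib-only illustration with hypothetical runoff values and a deterministic initialisation chosen for reproducibility; it is not the authors' code, and the second step (a Random Forest per cluster) is deliberately left out.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm for scalar data; returns (labels, centres).

    Initial centres are spread evenly over the sorted data (an assumption made
    here for reproducibility; random restarts are the more common choice).
    """
    xs = sorted(values)
    centres = [float(xs[int((i + 0.5) * len(xs) / k)]) for i in range(k)]

    def assign(current):
        # Each value goes to the nearest centre.
        return [min(range(k), key=lambda c: abs(v - current[c])) for v in values]

    for _ in range(iterations):
        labels = assign(centres)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centre if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centres[c])
        if updated == centres:
            break
        centres = updated
    return assign(centres), centres


# Hypothetical mean annual runoff per catchment (mm/yr).
runoff = [100.0, 110.0, 120.0, 500.0, 520.0, 900.0, 950.0]
labels, centres = kmeans_1d(runoff, k=3)
```

Each cluster label would then select the subset of catchments on which a separate regression model is trained to map catchment and climate features to flow quantiles.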

How to cite: Pokharel, S., Roy, T., and Admiraal, D.: Machine learning-based peak flow estimation for improved flood resilience of transportation infrastructure, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10053, https://doi.org/10.5194/egusphere-egu23-10053, 2023.

Gaspar Salas-Ruelas, Hugo Enrique Júnez-Ferreira, Gamaliel Moreno Chavez, Julián González Trinidad, Carlos Francisco Bautista Capetillo, and Graciela Herrera Zamarrón

The hydraulic head is an important variable for determining how water behaves in the subsoil; however, its spatial characterization is complicated by the variability it presents within an aquifer. Measuring hydraulic head in piezometers or observation wells involves costs, so in some cases little data is available. To obtain reliable configurations of the spatial distribution of hydraulic head in an aquifer, interpolation methods that require few measurements have been used. Ordinary kriging is one of the most widely used spatial interpolation algorithms in geostatistics; it employs a theoretical variogram (circular, exponential, Gaussian, etc.). The variogram is a function whose parameters (nugget, sill and range) must be optimized because the accuracy of the estimation depends on them. As far as the reviewed literature shows, theoretical variograms have been fitted by means of genetic algorithms using bi-objective functions that take into account only the error in the variogram fit and the difference between the measured values and the values estimated by ordinary kriging. In this paper we propose fitting with a new multiobjective function that simultaneously considers the variogram fit, the accuracy of the interpolation result and the estimation error variances. This nonlinear optimization problem contains three secondary objectives. The first is to obtain the best fit between the experimental variogram and the theoretical variogram function. The second is to minimize the difference between the measured values and the ordinary kriging estimates (measured with the mean square error). The third is that the error variances of the estimation are well represented by the selected model (measured with the standardized mean square error).
The proposed procedure was tested with data measured in the El Palmar aquifer, located in the northern part of the state of Zacatecas, Mexico. The performance of this procedure was evaluated for different weights assigned to each of the secondary objectives. In the models where only the variogram fit is considered, the mean squared error and the standardized mean squared error turned out to be very large. It was also observed that when the estimation error variance is not taken into account in the objective function, the standardized mean squared error ranges from 20.94 to 56.41. When the estimation error variance is incorporated in the objective function (even with a small weight), the estimation errors are very close to the minimum obtained and the variances are very reliable (with the standardized mean square error between 0.65 and 1.35).
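
The two cross-validation statistics named above, and one possible way of combining them with the variogram misfit, can be sketched as follows. The definitions of the mean square error and the standardized mean square error are the usual ones; the weighted-sum form of the objective, including the penalty on the distance of the standardized value from 1, is an assumption made for illustration, since the abstract does not give the exact formulation.

```python
def mean_square_error(observed, estimated):
    """Mean of squared cross-validation errors."""
    return sum((z - e) ** 2 for z, e in zip(observed, estimated)) / len(observed)


def standardized_mean_square_error(observed, estimated, kriging_variances):
    """Squared errors scaled by the kriging variance; close to 1 when the
    model reproduces the estimation error variances well."""
    terms = [(z - e) ** 2 / var
             for z, e, var in zip(observed, estimated, kriging_variances)]
    return sum(terms) / len(terms)


def weighted_objective(variogram_misfit, mse, smse, weights):
    """Hypothetical scalarisation of the three secondary objectives.

    weights = (w_variogram, w_mse, w_smse). The third term penalises the
    distance of the standardized mean square error from its ideal value 1.
    """
    w_var, w_mse, w_smse = weights
    return w_var * variogram_misfit + w_mse * mse + w_smse * abs(smse - 1.0)


# Hypothetical leave-one-out results (heads in m, variances in m^2).
observed = [10.0, 12.0, 14.0]
estimated = [11.0, 12.0, 13.0]
variances = [1.0, 4.0, 1.0]
mse = mean_square_error(observed, estimated)
smse = standardized_mean_square_error(observed, estimated, variances)
score = weighted_objective(0.5, mse, smse, weights=(1.0, 1.0, 1.0))
```

A genetic algorithm would evaluate such a score for each candidate set of nugget, sill and range; setting the third weight to zero corresponds to the case in which the estimation error variance is ignored.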

How to cite: Salas-Ruelas, G., Júnez-Ferreira, H. E., Moreno Chavez, G., González Trinidad, J., Bautista Capetillo, C. F., and Herrera Zamarrón, G.: Fitting variogram models based on estimation errors and variances using genetic algorithms, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10925, https://doi.org/10.5194/egusphere-egu23-10925, 2023.

Pooria Ebrahimi and Fabio Matano

A clear understanding of the groundwater system plays a leading role in the effective management of water resources and sustainable development. There is, therefore, a need to consider all available datasets, collate other supplementary data and update the present hydrogeological databases as an important source of information for decision-makers. Springs can be considered as hydraulic features for characterizing basin-scale groundwater flow when there are no wells in a study area or when access to them is limited. In this study, the springs in some river basins in the Molise region (southern Italy) are investigated. Between 1 and 1556 m a.s.l. (622 m a.s.l. on average), a total of 2681 springs (1620 perennial and 1061 non-perennial springs) were identified based on the Istituto Geografico Militare topographic maps at a 1:25000 scale. In springs, the hydraulic head is almost equal to the elevation head (h≈z). Given that groundwater flows from high to low hydraulic head (h), it can be concluded that the groundwater body generally flows from the mountainous area in the south and southwest towards the coast of the Adriatic Sea in the north and northeast.

For further investigation, 1237 springs in the Fortore and Saccione river basins were considered and the following factors (as indicators of the areas with a high probability of groundwater spring presence) were obtained from the digital elevation model for spring orifices: altitude, slope degree, slope aspect, curvature, plan curvature, profile curvature, topographic wetness index (TWI), stream transport index (STI) and stream power index (SPI). Following log-transformation of altitude, slope degree, slope aspect, SPI, STI and TWI to obtain more symmetric statistical distributions, the springs were categorized into three groups through the Grouping Analysis in ArcMap 10.8: Group 1 with 101 springs; Group 2 with 1003 springs; and Group 3 with 132 springs. The springs in Group 2, Group 3 and Group 1 occur at high (639 m), medium (475 m) and low (150 m) altitudes, respectively. The slope of spring orifices in Group 2 and Group 3 is similar, and steeper than that of Group 1. The SPI and STI increase from Group 1 to Group 3, while the TWI and slope aspect are not significantly different between the spring groups. The R-squared values show that altitude and slope are the most important variables for discriminating the groups. A literature study shows a greater probability of groundwater spring occurrence in areas at a higher altitude and with a steeper slope, but this should be confirmed in our study area after applying some modeling techniques and considering more complex relationships.
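
Two of the terrain indices listed above have widely used closed-form definitions that can be evaluated per spring orifice. The sketch below uses the common textbook forms, TWI = ln(a / tan β) and SPI = a · tan β, with a the specific catchment area and β the local slope; whether the study used exactly these variants is an assumption, and the input values are hypothetical.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_deg):
    """TWI = ln(a / tan(beta)); requires a strictly positive slope."""
    if slope_deg <= 0:
        raise ValueError("slope must be positive to avoid division by zero")
    return math.log(specific_catchment_area / math.tan(math.radians(slope_deg)))


def stream_power_index(specific_catchment_area, slope_deg):
    """SPI = a * tan(beta)."""
    return specific_catchment_area * math.tan(math.radians(slope_deg))


def log_transform(values):
    """Natural-log transform used to reduce skewness; values must be positive."""
    return [math.log(v) for v in values]


# Hypothetical spring orifices: specific catchment area (m) and slope (degrees).
springs = [(100.0, 45.0), (250.0, 12.0), (40.0, 30.0)]
twi = [topographic_wetness_index(a, s) for a, s in springs]
spi = [stream_power_index(a, s) for a, s in springs]
log_spi = log_transform(spi)
```

Variables transformed in this way would then enter the grouping analysis together with altitude and the remaining factors.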

This study presents a general overview of groundwater hydrogeology in some river basins in Molise region. It is noteworthy that the project is still ongoing and the database will be updated with a wider range of variables (e.g., hydrogeological complexes, distance to tectonic elements, spring discharge and spring water temperature when available) to obtain a comprehensive spring database and empower researchers supporting decision-makers for groundwater management.

How to cite: Ebrahimi, P. and Matano, F.: A regional spring database for groundwater management: The preliminary results of a case study in Molise region (south Italy) and the future perspectives, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14797, https://doi.org/10.5194/egusphere-egu23-14797, 2023.

Claudia Cappello, Sandra De Iaco, Monica Palma, Christoph Muehlmann, and Klaus Nordhausen

In environmental sciences, it is common to collect and analyze spatio-temporal multivariate data concerning several variables which are measured in time over a spatial domain. The spatio-temporal data are usually sparse in space, due to the high cost of the equipment, and dense in time, since the required variables are regularly sampled in time.

In the literature, different methods have been proposed for the analysis of such spatio-temporal data, which exhibit correlation in space and time as well as between variables. Among them, it is worth recalling the generalization of the Blind Source Separation technique for multivariate space-time random fields (stBSS) and the space-time linear coregionalization model (ST-LCM). These methods simplify the spatio-temporal multivariate analysis: through a linear transformation of the original observations, only the independent components which exhibit a spatio-temporal correlation (fewer in number than the observed variables) are retained and modelled.

In this paper, a multivariate study regarding seven environmental variables (evapotranspiration level, minimum and maximum temperature, minimum and maximum humidity, wind speed and precipitation) measured between 2000 and 2022 in the Veneto region (Italy) will be presented. Both the stBSS and the joint diagonalization of the empirical covariance matrix approach will be used to identify the hidden components, and properly chosen spatio-temporal models will be fitted to the latent components. Note that for the first approach a BSS model for the multivariate random field will be assumed, whereas for the second one a space-time linear coregionalization model (ST-LCM) for the independent components will be fitted to the matrix-valued covariance function estimated for the seven relevant environmental variables.

Finally, the fitted models will be used to predict evapotranspiration levels, and a comparison of the values obtained with the two different techniques will be provided.

How to cite: Cappello, C., De Iaco, S., Palma, M., Muehlmann, C., and Nordhausen, K.: Space-time multivariate techniques: a comparative analysis on environmental data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-16194, https://doi.org/10.5194/egusphere-egu23-16194, 2023.