- 1 Sapienza University of Rome, DICEA, Rome, Italy (saba.gachpaz@uniroma1.it); University of Genova, DICCA, Genova, Italy (saba.gachpaz@edu.unige.it)
- 2 University of Genova, DICCA, Genova, Italy (Giorgio.Boni@unige.it)
- 3 University of Genova, DITEN, Genova, Italy (Gabriele.Moser@unige.it)
- 4 University of Genova, DICCA, Genova, Italy (Bianca.Federici@unige.it)
Soil moisture (SM) is the amount of water contained in soil and is a key variable at the Earth's surface that controls many processes, such as erosion, evapotranspiration and infiltration. In deeper layers and the root zone, it controls vegetation health and surface cover conditions. Traditional methods for SM monitoring, such as field-based measurements, are accurate but provide only point-based results. Given the heterogeneous nature of SM across time and space, remote sensing and machine learning (ML) techniques have emerged as valuable tools. These approaches can efficiently handle large datasets and provide estimates at regular intervals, offering an alternative for SM estimation over large regions.
In this study, we evaluated the potential of multi-spectral optical remote sensing for SM estimation by examining the dependency between Sentinel-2 images and SM measurements from two datasets: the REMEDHUS network (Spain) and the SMOSMANIA network (France). The REMEDHUS network is located in an agricultural region, while the SMOSMANIA network spans a 400 km transect from the Mediterranean Sea to the Atlantic Ocean. Both networks provide hourly SM measurements at a depth of -5 cm, with additional measurements at depths of -10 cm, -20 cm and -30 cm for the SMOSMANIA network. Harmonized Sentinel-2 (MSI) data, accessed through Google Earth Engine, were used to predict SM. Surface reflectance from 12 spectral bands, along with the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI) and the Enhanced Vegetation Index (EVI), were used as features in regression models, and the SM value recorded closest to the time of the satellite overpass was taken as the target variable. To ensure consistency in the analysis, the Sentinel-2 image collection was filtered by location (the coordinates of each sensor), study period (2017-2022) and cloud cover (maximum acceptable cloud cover = 10%). All spectral bands were then resampled to a uniform spatial resolution of 10 meters.
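As an illustration of the index features described above, the following minimal sketch computes NDVI, NDWI and EVI from Sentinel-2 reflectance at a single pixel. It is not the processing chain used in the study. The abstract does not state which NDWI formulation was applied, so the green/NIR form is assumed here, and EVI uses its commonly cited coefficients. The function names and the sample digital numbers are hypothetical and serve only to make the example runnable.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def spectral_indices(blue, green, red, nir):
    """Compute NDVI, NDWI and EVI from reflectance values in the range 0-1."""
    ndvi = normalized_difference(nir, red)
    # Assumption: green/NIR form of NDWI; the abstract does not specify it.
    ndwi = normalized_difference(green, nir)
    evi_denom = nir + 6.0 * red - 7.5 * blue + 1.0
    evi = 2.5 * (nir - red) / evi_denom if evi_denom != 0 else float("nan")
    return {"NDVI": ndvi, "NDWI": ndwi, "EVI": evi}


# Sentinel-2 surface reflectance is stored as integers scaled by 10000.
SCALE = 10000.0

# Hypothetical pixel values (B2 = blue, B3 = green, B4 = red, B8 = NIR).
sample_dn = {"B2": 500, "B3": 800, "B4": 600, "B8": 3000}
refl = {band: dn / SCALE for band, dn in sample_dn.items()}

indices = spectral_indices(refl["B2"], refl["B3"], refl["B4"], refl["B8"])
print(indices)
```

In a feature table, these three values would be appended to the 12 band reflectances of the pixel containing each sensor.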
Three ML algorithms were applied to model the relationship between predictor variables and SM: Random Forest Regression (RF), Support Vector Regression (SVR), and Gradient Boosting Regression (GBR). Model performance was assessed using Root Mean Square Error (RMSE). For the REMEDHUS network, RF achieved the best performance with an RMSE of 0.08319 m³/m³. In the SMOSMANIA network, all three algorithms performed best at a depth of -20 cm, with SVR achieving the lowest RMSE (0.0591 m³/m³). Additionally, the weighted vertical average from the SVR model yielded the lowest overall RMSE of 0.0551 m³/m³.
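The evaluation metric and the vertical averaging step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the depth weights used for the weighted vertical average, so equal weights are assumed, and all observed and predicted values below are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def weighted_vertical_average(values_by_depth, weights_by_depth):
    """Weighted mean of per-depth SM values; both dicts are keyed by depth in cm."""
    total_weight = sum(weights_by_depth[d] for d in values_by_depth)
    weighted_sum = sum(values_by_depth[d] * weights_by_depth[d] for d in values_by_depth)
    return weighted_sum / total_weight


# Hypothetical observed and predicted SM values in m³/m³.
observed = [0.10, 0.20, 0.30]
predicted = [0.12, 0.18, 0.33]
error = rmse(observed, predicted)

# Hypothetical SM profile at the four SMOSMANIA depths.
# Assumption: equal weights, since the actual weights are not stated.
profile = {5: 0.20, 10: 0.22, 20: 0.25, 30: 0.27}
weights = {5: 1.0, 10: 1.0, 20: 1.0, 30: 1.0}
profile_mean = weighted_vertical_average(profile, weights)

print(error, profile_mean)
```

Weights proportional to the thickness of the soil layer that each probe represents would be another reasonable choice; only the `weights` dictionary would change.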
Comparisons between measured and predicted SM values for each testing sensor confirm the influence of land-use type on model performance. Another consideration is the models' ability to predict SM within specific moisture content ranges. Although the SMOSMANIA and REMEDHUS networks exhibit markedly different measured SM values and cover different land-use types, the models for both networks perform best within the range of 0.1 to 0.4 m³/m³, with RMSE = 0.034 for the REMEDHUS network and RMSE = 0.04 for the SMOSMANIA network.
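A range-restricted error of this kind can be computed by keeping only the samples whose measured SM falls inside the interval of interest. The sketch below assumes that the filter is applied to the measured (not the predicted) values, which the abstract does not specify. The function name and the sample values are hypothetical.

```python
import math


def rmse_in_range(observed, predicted, low=0.1, high=0.4):
    """RMSE over samples whose observed SM lies in [low, high]; None if no sample qualifies."""
    pairs = [(o, p) for o, p in zip(observed, predicted) if low <= o <= high]
    if not pairs:
        return None
    return math.sqrt(sum((o - p) ** 2 for o, p in pairs) / len(pairs))


# Hypothetical measured and predicted SM values in m³/m³.
observed = [0.05, 0.15, 0.25, 0.45]
predicted = [0.12, 0.17, 0.22, 0.35]

inside = rmse_in_range(observed, predicted)                      # 0.1 to 0.4 m³/m³ only
overall = rmse_in_range(observed, predicted, low=0.0, high=1.0)  # all samples

print(inside, overall)
```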
How to cite: Gachpaz, S., Boni, G., Moser, G., and Federici, B.: Machine Learning-Based Soil Moisture Estimation Using Sentinel-2 MSI Data: Case Studies from the REMEDHUS (Spain) and SMOSMANIA (France) Networks, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18781, https://doi.org/10.5194/egusphere-egu25-18781, 2025.