EGU26-17203, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-17203
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Wednesday, 06 May, 11:45–11:55 (CEST)
 
Room 2.24
Spatial Non-Stationarity in Harmful Algal Bloom Drivers for the benthic dinoflagellate Gambierdiscus spp. in the Balearic Islands, Revealed Through Interpretable Machine Learning and Hierarchical Modelling
Diana Yaritza Dorado Guerra, Sandra Gimeno Monforte, Carles Alcaraz Cazorla, and Jorge Diogène Fadini
  • Institute of Research and Technology in Food and Agriculture, Marine and Continental Waters Programme, Spain (dianador87@hotmail.com)

Harmful algal blooms (HABs) are massive proliferations of microalgae in aquatic ecosystems that can harm those ecosystems or society. Predicting HABs in spatially complex coastal environments requires understanding the environmental drivers that shape microalgal population dynamics. A key question is whether the relationships between drivers and abundance are spatially invariant or site-specific. Machine learning models often achieve high training performance but fail when extrapolating to unseen locations because they overfit to site-specific patterns. We developed a methodological framework integrating hierarchical modelling, spatially explicit machine learning, and interpretable AI techniques to quantify spatial heterogeneity in the environmental drivers of HABs.

Gambierdiscus is a genus of benthic marine dinoflagellates found in coastal areas; its species produce potent marine toxins that are transferred mainly to fish. We analysed 348 observations of Gambierdiscus spp. abundance across 32 sites in the Balearic Islands (2021–2024), integrating field abundance data with satellite-derived oceanographic variables (temperature, nutrients, hydrodynamics) from the Copernicus Marine Service. Seven modelling approaches were compared: Generalized Additive Mixed Models (GAMM), Generalized Additive Models (GAM), Geographically Weighted Regression (GWR), Random Forest (RF), Geographic Random Forest (GRF), XGBoost, and Geographic XGBoost. A three-phase feature selection procedure (temporal lag optimization, collinearity removal via the variance inflation factor (VIF), and LASSO regularization) reduced 61 candidate predictors to 12 ecologically interpretable variables optimized for spatial modelling.
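
As an illustration of the collinearity-removal phase only, the following is a minimal sketch of iterative VIF pruning, not the authors' implementation. It uses the standard identity that the VIF of each predictor is the corresponding diagonal element of the inverse correlation matrix. The threshold of 5, the predictor names, and the toy values are all assumptions made for the example; the abstract does not report them. In practice a library routine (for example `variance_inflation_factor` in statsmodels) would normally replace the hand-written matrix inversion.

```python
def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment with the identity matrix: [M | I] -> [I | M^-1].
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """VIF per predictor: the diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


def drop_collinear(columns, threshold=5.0):
    """Repeatedly drop the predictor with the largest VIF above the threshold."""
    kept = dict(columns)
    while len(kept) > 1:
        scores = vifs(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del kept[worst]
    return list(kept)


# Hypothetical predictors: two nearly identical temperature series and one
# unrelated hydrodynamic variable (toy numbers, not data from the study).
predictors = {
    "sst": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "sst_lag1": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],
    "current_speed": [2.0, -1.0, 0.0, 3.0, -2.0, 1.0],
}
kept = drop_collinear(predictors)
```

Here one of the two near-duplicate temperature series is removed, while the unrelated variable is retained. Exactly collinear predictors would make the correlation matrix singular, so a production version would need to guard against that case.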

Model validation employed Leave-One-Out Cross-Validation (LOO-CV) to assess true spatial generalization rather than interpolation. Machine learning models achieved high training performance (R² = 0.75–0.85) but collapsed under spatial extrapolation (R²_LOO = 0.30–0.40). In contrast, GAMM demonstrated superior spatial transferability (R²_LOO = 0.47), attributable to its explicit separation of fixed environmental relationships from hierarchical site-specific random effects. SHAP (SHapley Additive exPlanations) analysis of island-stratified Random Forest models quantified spatial non-stationarity: the importance of temperature varied 13-fold across islands (SHAP: 0.05–0.64), whereas that of phosphate varied only 2.6-fold (SHAP: 0.10–0.26). Partial dependence plots confirmed that drivers operate through fundamentally different mechanisms across the archipelago.
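
The gap between training and held-out performance described above can be illustrated with a small sketch. It assumes that the cross-validation holds out whole sites (the abstract says only "Leave-One-Out", so this reading is an assumption), and it substitutes a one-predictor least-squares line for the models actually compared. Site labels and numbers are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def in_sample_r2(records):
    """R^2 when the model is fitted and scored on the same observations."""
    xs = [x for _, x, _ in records]
    ys = [y for _, _, y in records]
    slope, intercept = fit_line(xs, ys)
    return r_squared(ys, [slope * x + intercept for x in xs])


def leave_one_site_out_r2(records):
    """R^2 pooled over predictions made for sites excluded from fitting."""
    sites = sorted({site for site, _, _ in records})
    y_true, y_pred = [], []
    for held_out in sites:
        train = [(x, y) for site, x, y in records if site != held_out]
        test = [(x, y) for site, x, y in records if site == held_out]
        slope, intercept = fit_line([x for x, _ in train],
                                    [y for _, y in train])
        for x, y in test:
            y_true.append(y)
            y_pred.append(slope * x + intercept)
    return r_squared(y_true, y_pred)


# Toy records (site, driver, response). In `shared`, every site follows the
# same relationship; in `mixed`, site "C" responds in the opposite direction.
shared = [("A", 0, 1), ("A", 1, 3), ("A", 2, 5),
          ("B", 3, 7), ("B", 4, 9), ("B", 5, 11),
          ("C", 1, 3), ("C", 2, 5), ("C", 3, 7)]
mixed = shared[:6] + [("C", 1, 3.0), ("C", 2, 1.0), ("C", 3, -1.0)]

r2_shared = leave_one_site_out_r2(shared)
r2_mixed_train = in_sample_r2(mixed)
r2_mixed_loso = leave_one_site_out_r2(mixed)
```

When all sites share one relationship, the held-out score matches the in-sample score; when one site behaves differently, the held-out score drops below the in-sample score, which is the diagnostic pattern the abstract relies on.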

Significant spatial clustering (Moran's I = 0.346, p < 0.001), with persistent hotspots and coldspots, corroborated the non-stationarity. Phosphate emerged as the only universal driver, while temperature, substrate, and hydrodynamics exhibited location-dependent effects. Our findings demonstrate that interpretable ML combined with spatial cross-validation effectively diagnoses when environmental relationships transfer between locations and when they require location-specific calibration, providing a generalizable framework for spatial prediction in heterogeneous ocean systems.
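
For readers unfamiliar with the statistic, Moran's I can be computed as I = (n / W) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where W is the sum of all weights w_ij. The sketch below assumes binary distance-band weights (w_ij = 1 when two sites lie within `max_distance` of each other); the abstract does not state which weighting scheme was used, and the coordinates and values are invented. The significance test (for example by permutation) is omitted.

```python
import math


def morans_i(coords, values, max_distance):
    """Global Moran's I with binary distance-band weights (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0       # sum of w_ij * dev_i * dev_j over neighbouring pairs
    weight_sum = 0.0  # W, the total of all weights
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_distance:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    return (n / weight_sum) * cross / sum(d * d for d in dev)


# Six hypothetical sites along a transect, one distance unit apart.
coords = [(float(k), 0.0) for k in range(6)]
clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]    # similar values sit together
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # neighbours always differ

i_clustered = morans_i(coords, clustered, max_distance=1.0)
i_alternating = morans_i(coords, alternating, max_distance=1.0)
```

Positive values indicate that neighbouring sites tend to have similar abundances (clustering), while negative values indicate that neighbours tend to differ.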

How to cite: Dorado Guerra, D. Y., Gimeno Monforte, S., Alcaraz Cazorla, C., and Diogène Fadini, J.: Spatial Non-Stationarity in Harmful Algal Bloom Drivers for the benthic dinoflagellate Gambierdiscus spp. in the Balearic Islands, Revealed Through Interpretable Machine Learning and Hierarchical Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17203, https://doi.org/10.5194/egusphere-egu26-17203, 2026.