EGU24-3783, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-3783
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Generalizability evaluation of heterogeneous ensembles models for benthic macroinvertebrate index predictions

Taeseung Park, Jihoon Shin, and YoonKyung Cha
  • University of Seoul, Environmental Engineering, Korea, Republic of (alexl001@uos.ac.kr)

Predictive models that leverage the relationship between environmental variables and river health are valuable tools for predicting river health at unmonitored sites. Such models should be generalizable to unseen data. However, predictions derived from machine learning (ML) models can vary widely even with minor changes in the training dataset. This potentially unstable behavior reduces an ML model's generalizability to unseen data and likely limits its applicability as a decision-support tool. Heterogeneous ensemble models are recognized to achieve greater generalizability than single models owing to their structural diversity. In this study, various ML models are employed to understand the relationship between environmental factors and benthic macroinvertebrate health. To obtain a model with better generalizability, the study uses the bias–variance decomposition to compare the generalizability of heterogeneous ensembles with that of homogeneous ensembles and single models. The models classified the benthic macroinvertebrate index (BMI) into five grades (very good to very poor). The input variables comprised diverse environmental factors, including water quality, hydrology, meteorological conditions, land cover, and stream properties. The data were collected at 2,915 monitoring sites in the four major river watersheds in South Korea during the 2016–2021 period. The results indicated that both heterogeneous and homogeneous ensembles generalized better than single models. Moreover, heterogeneous ensembles tended to show higher generalizability than homogeneous ensembles, although the differences were marginal. Weighted soft voting was the most generalizable of the heterogeneous ensembles, with a loss of 0.49. Weighted soft voting also delivered acceptable classification performance on the test set, with an accuracy of 0.52.
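As an illustration of the weighted soft voting mentioned above, the following minimal sketch averages the class probabilities of several base models using per-model weights and selects the grade with the highest combined probability. The function name, the three hypothetical base models, their probabilities, and the weights are all assumptions made for this example; the abstract does not state how the weights were chosen or which base models were combined.

```python
def weighted_soft_vote(probas, weights):
    """Combine per-model class probabilities for one sample.

    probas  : list with one entry per model; each entry is a list of
              class probabilities (same class order for every model).
    weights : one non-negative weight per model (hypothetical values here).
    Returns (combined_probabilities, index_of_predicted_class).
    """
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    # Weighted average of the probabilities, class by class.
    combined = [
        sum(w * p[k] for p, w in zip(probas, weights)) / total
        for k in range(n_classes)
    ]
    # The predicted class is the one with the largest combined probability.
    predicted = max(range(n_classes), key=combined.__getitem__)
    return combined, predicted


# Hypothetical example: three base models, five BMI grades
# (index 0 = very good ... index 4 = very poor).
probas = [
    [0.10, 0.50, 0.20, 0.10, 0.10],
    [0.05, 0.30, 0.40, 0.15, 0.10],
    [0.10, 0.20, 0.50, 0.10, 0.10],
]
weights = [0.5, 0.3, 0.2]  # illustrative weights, not from the study
combined, grade = weighted_soft_vote(probas, weights)
```

In this example the second and third models favor grade index 2, yet the ensemble selects grade index 1 because the most heavily weighted model strongly prefers it. This is the behavior that distinguishes weighted soft voting from a simple majority vote.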
The identified contributions of the environmental factors to BMI predictions and the directions of their effects agreed with established knowledge, confirming the reliability of the predictions. These results demonstrate the usefulness of heterogeneous ensemble models for increasing the generalizability of ML model predictions. Furthermore, although homogeneous ensembles were slightly less generalizable than the voting-based ensembles, their generalizability was comparable to that of the heterogeneous ensembles.
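The generalizability comparison rests on a bias–variance decomposition of the classification loss. The sketch below shows one common formulation for 0-1 loss, in which a model is retrained on several resampled training sets and its predictions on a fixed test set are compared. It is an assumption that this formulation resembles the one used in the study; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def bias_variance_01(predictions, y_true):
    """Estimate average 0-1 loss, bias, and variance.

    predictions : list of rounds; each round holds the labels predicted
                  for every test sample by a model trained on one
                  resampled training set.
    y_true      : true labels of the test samples.
    Returns (avg_loss, avg_bias, avg_variance).
    """
    n_rounds = len(predictions)
    n_samples = len(y_true)
    loss = bias = variance = 0.0
    for i, truth in enumerate(y_true):
        votes = [run[i] for run in predictions]
        # Main prediction: the most frequent label across rounds
        # (ties resolve to the label seen first).
        main = Counter(votes).most_common(1)[0][0]
        # Expected loss: share of rounds that misclassify this sample.
        loss += sum(v != truth for v in votes) / n_rounds
        # Bias: 1 if the main prediction itself is wrong, else 0.
        bias += float(main != truth)
        # Variance: share of rounds that deviate from the main prediction.
        variance += sum(v != main for v in votes) / n_rounds
    return loss / n_samples, bias / n_samples, variance / n_samples


# Toy example: four training rounds, three test samples, grades 0-4.
y_true = [0, 2, 4]
predictions = [
    [0, 2, 3],
    [0, 1, 3],
    [1, 2, 3],
    [0, 2, 4],
]
avg_loss, avg_bias, avg_var = bias_variance_01(predictions, y_true)
```

Under 0-1 loss the expected loss is generally not the plain sum of bias and variance, so the three quantities are best reported side by side. A lower variance at similar bias indicates predictions that are more stable under changes in the training data.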

How to cite: Park, T., Shin, J., and Cha, Y.: Generalizability evaluation of heterogeneous ensembles models for benthic macroinvertebrate index predictions, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-3783, https://doi.org/10.5194/egusphere-egu24-3783, 2024.