A stacking ensemble machine learning framework with terminal bias correction for flood prediction

Xinyu Chang; Jun Guo; Tianlong Jia; Hui Qin; Yi Liu

doi:https://doi.org/10.5194/egusphere-egu26-8326


EGU26-8326, updated on 14 Mar 2026


EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.


Xinyu Chang¹,²,³, Jun Guo¹,², Tianlong Jia³, Hui Qin¹,², and Yi Liu¹,²


¹School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, China
²Hubei Key Laboratory of Digital River Basin Science and Technology, Huazhong University of Science and Technology, Wuhan, China
³Institute of Water and Environment, Karlsruhe Institute of Technology, Karlsruhe, Germany

The accuracy and robustness of flood forecasting have long been constrained by structural model uncertainty and by runoff generation mechanisms. Single hydrological models or machine learning approaches offer only limited performance gains and struggle to simulate high-flow and low-flow processes in a balanced way. To address this, this study proposes, for the first time, a stacking ensemble machine learning framework (TBC-SEML) that integrates multi-model state awareness, terminal bias correction, and interpretability analysis. The framework uses classical hydrological models (GR4J, HYMOD, SIMHYD) to generate multi-model state datasets and adopts a comprehensive evaluation metric (NPCEM) as the optimization objective to improve the capture of high-flow processes. The study further proposes a terminal bias correction based on an Auto-Regressive with Extra Inputs model and Weighted Least Squares (ARX-WLS), in which a flow attenuation coefficient β suppresses the excessive dominance of flood peaks in weight estimation. Building on this, eight base learners are integrated: Random Forest (RF), ExtraTrees, XGBoost, LightGBM, CatBoost, Multilayer Perceptron (MLP), Support Vector Regression (SVR), and K-Nearest Neighbors (KNN). The hyperparameters of the base learners are tuned with Bayesian optimization, and linear regression serves as the meta-learner. SHAP interpretability analysis is used to quantify the predictive contributions of the base learners and state variables, improving model transparency. This diverse, heterogeneous stacking ensemble strengthens the complementarity among base learners and balances accuracy, stability, and interpretability, offering a new paradigm for intelligent hydrological forecasting that combines high performance with transparent decision support.
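The terminal bias correction step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: a first-order ARX error model e_t = a·e_{t-1} + b·x_t, where e is the forecast error (observed minus simulated) and x is a single hypothetical exogenous input, and weights of the form w_t = Q_t^(-β), so that a larger β reduces the influence of high-flow time steps on the fit. The function names `fit_arx_wls` and `correct_forecast` are illustrative, not the authors' code.

```python
from typing import Sequence, Tuple


def fit_arx_wls(
    errors: Sequence[float],
    exog: Sequence[float],
    flows: Sequence[float],
    beta: float,
) -> Tuple[float, float]:
    """Fit e_t = a*e_{t-1} + b*x_t by weighted least squares.

    Assumed weighting (not specified in the abstract): w_t = flows[t] ** (-beta),
    which down-weights high-flow time steps when beta > 0.
    """
    if not (len(errors) == len(exog) == len(flows)):
        raise ValueError("errors, exog and flows must have the same length")
    if any(q <= 0 for q in flows):
        raise ValueError("flows must be strictly positive")

    # Accumulate the weighted normal equations of the two-parameter model.
    s_ee = s_ex = s_xx = s_ey = s_xy = 0.0
    for t in range(1, len(errors)):
        w = flows[t] ** (-beta)
        e_prev, x, y = errors[t - 1], exog[t], errors[t]
        s_ee += w * e_prev * e_prev
        s_ex += w * e_prev * x
        s_xx += w * x * x
        s_ey += w * e_prev * y
        s_xy += w * x * y

    # Solve the 2x2 system with Cramer's rule.
    det = s_ee * s_xx - s_ex * s_ex
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear; cannot estimate a and b")
    a = (s_ey * s_xx - s_ex * s_xy) / det
    b = (s_ee * s_xy - s_ex * s_ey) / det
    return a, b


def correct_forecast(
    raw_forecast: float, last_error: float, x_next: float, a: float, b: float
) -> float:
    """Add the predicted next-step error to the raw ensemble forecast."""
    return raw_forecast + a * last_error + b * x_next
```

In this sketch the correction would be applied to the output of the meta-learner, which is one reading of "terminal"; with β = 0 the fit reduces to ordinary least squares.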

How to cite: Chang, X., Guo, J., Jia, T., Qin, H., and Liu, Y.: A stacking ensemble machine learning framework with terminal bias correction for flood prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8326, https://doi.org/10.5194/egusphere-egu26-8326, 2026.