Time to Update the Split Sample Approach to Hydrological Model Calibration: A Massive Empirical Study
- Dept. of Civil and Environmental Engineering, University of Waterloo, Waterloo, Canada (hongren.shen@uwaterloo.ca)
Model calibration and validation are critical in hydrological model robustness assessment. Unfortunately, the commonly used split-sample test (SST) framework for data splitting requires modelers to make subjective decisions without clear guidelines.
A massive SST experiment for hydrological modeling is proposed and tested across a large sample of catchments to reveal empirically how data availability and calibration period features (i.e., length and recentness) jointly affect model performance in the post-validation period (e.g., forecasting or prediction), thus providing practical guidance on split-sample design. Unlike most SST studies, which use two sub-periods (i.e., calibration and validation) to build models, this study adds an independent model testing period to the calibration and validation periods. Two lumped conceptual hydrological models (GR4J and HMETS) are calibrated and tested in 463 CAMELS catchments across the United States using 50 different data splitting schemes. These schemes vary the data availability and the length and recentness of the continuous calibration sub-periods (CSPs). A full-period CSP, which skips model validation entirely, is also included in the experiment. The results are synthesized across the large sample of catchments and are comparatively assessed in multiple novel ways, including framing model building decisions as a decision tree problem and treating the model validation process as a formal classification problem that aims to accurately predict model success or failure in the testing period.
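As a rough illustration of the design described above, the following Python sketch enumerates continuous calibration sub-periods of different lengths and recentness within a model building period, assigns the remaining building years to validation, and reserves a later independent testing period. It is not the study's code: the function name `enumerate_csp_schemes`, the `SplitScheme` container, the years, the lengths, and the step size are all hypothetical, and the sketch does not reproduce the 50 schemes actually used.

```python
from dataclasses import dataclass
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (first_year, last_year)


@dataclass(frozen=True)
class SplitScheme:
    calibration: Span
    validation: List[Span]  # empty when calibration covers the whole building period
    testing: Span


def enumerate_csp_schemes(build_start: int, build_end: int, test_end: int,
                          lengths: List[int], step: int) -> List[SplitScheme]:
    """Enumerate continuous calibration sub-periods (hypothetical helper).

    Each scheme places one continuous calibration window of a given length
    inside the building period [build_start, build_end]. Building years not
    used for calibration form the validation span(s). The testing period
    follows the building period and is never used for model building.
    """
    testing = (build_end + 1, test_end)
    schemes = []
    for length in lengths:
        last_start = build_end - length + 1
        # Later start years correspond to more recent calibration data.
        for start in range(build_start, last_start + 1, step):
            end = start + length - 1
            validation = []
            if start > build_start:
                validation.append((build_start, start - 1))
            if end < build_end:
                validation.append((end + 1, build_end))
            schemes.append(SplitScheme((start, end), validation, testing))
    return schemes


# Hypothetical example: 20 building years, 5 testing years.
schemes = enumerate_csp_schemes(1981, 2000, 2005, lengths=[5, 20], step=5)
for s in schemes:
    print(s.calibration, s.validation, s.testing)
```

In this sketch, a window whose length equals the building period yields an empty validation list, which corresponds to the full-period CSP that skips validation.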
Results span different climate and catchment conditions across a 35-year period with available data, making the conclusions generalizable. Strong patterns show that calibrating to older data and then validating models on newer data produces inferior testing-period performance in every analysis conducted and should hence be avoided. Calibrating to the full available data and skipping model validation entirely is the most robust split-sample decision. These findings have significant implications for SST practice in hydrological modeling. In the next phase of this study, results for discontinuous calibration sub-periods (DCSPs) will be evaluated as an alternative SST design choice and contrasted with the CSP results.
How to cite: Shen, H., Tolson, B., and Mai, J.: Time to Update the Split Sample Approach to Hydrological Model Calibration: A Massive Empirical Study, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10846, https://doi.org/10.5194/egusphere-egu22-10846, 2022.