EGU24-13417, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-13417
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Evaluating physics-based representations of hydrological systems through hybrid models and information theory

Manuel Álvarez Chaves (1), Eduardo Acuña Espinoza (2), Uwe Ehret (2), and Anneli Guthke (1)
  • (1) Stuttgart Center for Simulation Science, University of Stuttgart, Stuttgart, Germany
  • (2) Institute of Water Resources and River Basin Management, Karlsruhe Institute of Technology, Karlsruhe, Germany

Hydrological models play a crucial role in understanding and predicting streamflow. Recently, hybrid models, which combine physical principles with data-driven approaches, have emerged as promising tools for gaining insight into system functioning and for achieving predictive skill beyond that of traditional models.

However, the study by Acuña Espinoza et al. (2023) raised the question of whether the flexible data-driven component in a hybrid model might "overwrite" the interpretability of its physics-based counterpart. Using conceptual hydrological models with dynamic parameters tuned by LSTM networks as an example, they showed that even when the physics-based representation of the hydrological system is deliberately chosen to be nonsensical, adding the flexible data-driven component can still yield a well-performing hybrid model. This compensatory behavior highlights the need for a thorough evaluation of physics-based representations in hybrid hydrological models: hybrid models should be inspected carefully to understand why and how they predict so well.

In this work, we provide a method to support this inspection: we objectively assess and quantify the contribution of the data-driven component to the overall hybrid model performance. Using information theory and the UNITE toolbox (https://github.com/manuel-alvarez-chaves/unite_toolbox), we measure the entropy of the (hidden) state-space in which the data-driven component of the hybrid model moves. High entropy in this setting means that the LSTM is doing a lot of "compensatory work", which points to an inadequate representation of the hydrological system in the physics-based component of the hybrid model. By comparing this measure across a set of alternative hybrid models with different physics-based representations, the representations can be ranked by their degree of realism. Such a ranking supports model evaluation and improvement as well as system understanding.
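The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the UNITE toolbox API; it assumes a generic Kozachenko-Leonenko k-nearest-neighbor estimator of differential entropy, and the function name `knn_entropy` and the two synthetic "hidden state" arrays are hypothetical stand-ins for states that would be extracted from trained hybrid models.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln


def knn_entropy(samples, k=3):
    """Kozachenko-Leonenko estimate of differential entropy, in nats.

    samples: array of shape (n_samples, n_dims), e.g. hidden states over time.
    k: number of nearest neighbors (an illustrative default, not a recommendation).
    """
    x = np.asarray(samples, dtype=float)
    n, d = x.shape
    tree = cKDTree(x)
    # Query k + 1 neighbors because each point's nearest neighbor is itself.
    dist, _ = tree.query(x, k=k + 1)
    eps = np.maximum(dist[:, -1], 1e-12)  # guard against duplicate points
    # Log-volume of the d-dimensional unit ball (Euclidean norm).
    log_unit_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(eps))


# Hypothetical stand-ins for the hidden states of two hybrid models:
# model A's data-driven component moves little, model B's moves a lot.
rng = np.random.default_rng(0)
states_a = rng.normal(0.0, 1.0, size=(5000, 2))
states_b = rng.normal(0.0, 3.0, size=(5000, 2))

h_a = knn_entropy(states_a)
h_b = knn_entropy(states_b)

# Rank models from lowest to highest hidden-state entropy.
ranking = sorted([("model A", h_a), ("model B", h_b)], key=lambda item: item[1])
for name, h in ranking:
    print(f"{name}: {h:.3f} nats")
```

Under the interpretation given in the abstract, the model with the lower hidden-state entropy would be the one whose physics-based component needs less compensation. Real LSTM hidden states are typically higher-dimensional, where nearest-neighbor estimators need more samples and more care.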

To illustrate our findings, we present examples from a synthetic case study in which the true model is known. Subsequently, we validate our approach in the context of regional predictions using CAMELS-GB data. This analysis highlights the importance of using diverse representations within hybrid models to ensure the pursuit of "the right answers for the right reasons". Ultimately, our work seeks to contribute to the advancement of hybrid modeling strategies that yield reliable and physically reasonable insights into hydrological systems.

References

  • Acuña Espinoza, E., Loritz, R., Álvarez Chaves, M., Bäuerle, N., & Ehret, U. (2023). To bucket or not to bucket? Analyzing the performance and interpretability of hybrid hydrological models with dynamic parameterization. EGUsphere, 1–22. https://doi.org/10.5194/egusphere-2023-1980

How to cite: Álvarez Chaves, M., Acuña Espinoza, E., Ehret, U., and Guthke, A.: Evaluating physics-based representations of hydrological systems through hybrid models and information theory, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13417, https://doi.org/10.5194/egusphere-egu24-13417, 2024.