EGU25-4768, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-4768
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
PICO | Monday, 28 Apr, 09:03–09:05 (CEST)
 
PICO spot A, PICOA.9
Extending Caravan with additional weather nowcast and forecast products
Guy Shalev1 and Frederik Kratzert2
  • 1Google, Research, Tel Aviv, Israel (guysha@google.com)
  • 2Google, Research, Vienna, Austria (kratzert@google.com)

About two years ago, we started the Caravan community dataset. The idea behind it was twofold:

1) Caravan standardizes the streamflow data from different regional large-sample hydrology datasets (e.g. various CAMELS datasets) and combines them with data from globally available data sources. 

2) All data in Caravan, besides streamflow, is derived from Google Earth Engine, with code that has been made publicly available (https://github.com/kratzert/Caravan/), allowing anyone to extend the dataset to new regions.

Additionally, the dataset structure allows for easy integration of what we call “community extensions”. So far, six community extensions (https://github.com/kratzert/Caravan/discussions/10) have been made available, extending Caravan to a total of 22,494 gauges.

With this submission, we present a new kind of extension to the Caravan project, one that does not add new basins (i.e. streamflow data) but instead adds weather data for all existing basins. More specifically, we add three precipitation nowcast products (CPC, IMERG Early v.0.7, and CHIRPS) and three weather forecast products (ECMWF IFS HRES, GraphCast, and CHIRPS-GEFS). For the ECMWF IFS forecast data, as well as for GraphCast, we include not only precipitation but also several land surface variables.

Since not all of this data is available on Earth Engine, we process it ourselves for all existing Caravan gauges, including those from all extensions. Consistent with the existing data in Caravan, we spatially average all weather data across the catchment area and aggregate it to daily resolution. However, since not all products can easily be shifted to local time (as was done for the original ERA5-Land data in Caravan), we keep all weather products in UTC and therefore also include ERA5-Land in UTC for consistency.
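
The two processing steps described above, spatial averaging over the catchment and aggregation to daily resolution in UTC, can be illustrated with a minimal sketch. This is not the code used to build the extension: the function names, the area-weighted averaging over grid cells, and the choice to sum sub-daily values (reasonable for precipitation, whereas other variables would more likely be averaged) are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def catchment_mean(cell_values, cell_weights):
    """Area-weighted mean of grid-cell values over one catchment.

    cell_weights are assumed to be the catchment area covered by each cell
    (hypothetical input; any consistent unit works).
    """
    total_weight = sum(cell_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(cell_values, cell_weights)) / total_weight


def daily_utc_totals(timestamps, values):
    """Sum sub-daily values into calendar days defined in UTC."""
    totals = defaultdict(float)
    for ts, value in zip(timestamps, values):
        if ts.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
        # Convert to UTC before taking the date, so days are never local days.
        totals[ts.astimezone(timezone.utc).date()] += value
    return dict(sorted(totals.items()))


# Toy example: two grid cells covering 75% and 25% of a catchment,
# with constant hourly precipitation of 1.0 and 3.0 mm over 48 hours.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
timestamps = [start + timedelta(hours=h) for h in range(48)]
weights = [0.75, 0.25]
hourly_mean = [catchment_mean([1.0, 3.0], weights) for _ in timestamps]
daily = daily_utc_totals(timestamps, hourly_mean)
```

In this toy case the catchment mean is 1.5 mm per hour, so each of the two UTC days accumulates 36.0 mm. Keeping the day boundaries in UTC avoids having to know each basin's time zone, which is the trade-off described above.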

To our knowledge, this extension makes Caravan the first large-sample hydrology dataset that includes real weather forecast data. We hope that it will enable and empower hydrological research, particularly research on forecasting problems.

How to cite: Shalev, G. and Kratzert, F.: Extending Caravan with additional weather nowcast and forecast products, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4768, https://doi.org/10.5194/egusphere-egu25-4768, 2025.