EGU26-20436, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-20436
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Friday, 08 May, 16:40–16:50 (CEST) | Room -2.33
Minimising I/O, maximising throughput: earthkit-workflows, a task-graph engine for heterogeneous systems 
Jenny Wong1, Vojtech Tuma2, Harrison Cook1, Corentin Carton de Wiart1, Olivier Iffrig1, James Hawkes1, and Tiago Quintino1
  • 1European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom
  • 2Oxidian, London, United Kingdom

In-memory HPC workflows promise significant performance gains by reducing I/O, but achieving these gains requires precise scheduling of data-dependent task graphs on heterogeneous computing platforms. While existing Python frameworks such as Dask provide abstractions for parallel execution, they are not designed to fully exploit advanced topology-aware scheduling, natively support tightly coupled CPU-GPU task graphs in complex HPC environments, or utilise captured profiling information during scheduling. 

Earthkit-workflows is a Python library with a declarative API for constructing task graphs and the capability to schedule and execute them on local or remote resources. It targets heterogeneous environments, enabling task-based parallelism across CPUs, GPUs, and distributed HPC or cloud systems. Expensive I/O operations and intermediate storage are minimised by exchanging intermediate results through shared memory and high-speed interconnects during task-graph execution. Streaming outputs from tasks, such as stepwise forecasting, have first-class support, so that downstream tasks can start as soon as the first outputs are available. The library also offers an extensible graph-building interface with a plugin mechanism that lets users define custom operations, and it interoperates with the wider earthkit ecosystem.
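The idea of a declaratively described task graph whose intermediate results stay in memory can be illustrated with a minimal sketch. This is not the earthkit-workflows API: the graph layout, the task functions (load, mean, anomaly) and the execute helper are all hypothetical, and only the Python standard library is used.

```python
from graphlib import TopologicalSorter


# Hypothetical tasks; a real workflow would retrieve and process forecast fields.
def load():
    return [1.0, 2.0, 3.0, 4.0]


def mean(values):
    return sum(values) / len(values)


def anomaly(values, avg):
    return [v - avg for v in values]


# Declarative description (assumed layout): task name -> (function, upstream task names).
graph = {
    "load": (load, []),
    "mean": (mean, ["load"]),
    "anomaly": (anomaly, ["load", "mean"]),
}


def execute(graph):
    """Run every task once, in dependency order, keeping all results in memory."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in graph.items()})
    results = {}
    for name in sorter.static_order():
        func, deps = graph[name]
        # Inputs are passed directly from memory; nothing is written to disk.
        results[name] = func(*(results[d] for d in deps))
    return results


results = execute(graph)
print(results["anomaly"])
```

The sketch runs serially in a single process. Scheduling across CPUs, GPUs and remote hosts, as described above, is precisely the part it leaves out.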

The task-graph construction and execution capabilities of earthkit-workflows are being applied in ECMWF’s next generation of data processing frameworks. Individual data processing functions are published as modular, reusable graphs, enriched with profiling measurements, and then combined to form operational workflows. When two operational workflows share a subgraph, for example because both retrieve the same input data, they can be merged automatically for efficient resource utilisation. Checkpointing is also provided for operational robustness.
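One way to picture the merging of workflows that share a subgraph is structural deduplication: if nodes are compared by their operation and inputs, identical subgraphs collapse into one. The sketch below rests on that assumption and does not describe how earthkit-workflows implements merging. The node representation and the operation names ("retrieve:t2m", "ensemble_mean", "quantiles") are hypothetical.

```python
# Hypothetical node representation: (operation name, tuple of input nodes).
# Tuples are hashable and compare by value, so equal subgraphs are equal keys.
def node(op, *inputs):
    return (op, tuple(inputs))


def count_nodes(root):
    """Number of nodes a workflow would run on its own, without any sharing."""
    _, inputs = root
    return 1 + sum(count_nodes(child) for child in inputs)


def merge(*workflows):
    """Collect the unique nodes of several workflows, inputs before consumers."""
    unique = {}

    def visit(current):
        _, inputs = current
        for child in inputs:
            visit(child)
        # A structurally identical node is registered only once.
        unique.setdefault(current, len(unique))

    for root in workflows:
        visit(root)
    return unique


# Two workflows built independently, both retrieving the same input data.
workflow_a = node("ensemble_mean", node("retrieve:t2m"))
workflow_b = node("quantiles", node("retrieve:t2m"))

merged = merge(workflow_a, workflow_b)
separate = count_nodes(workflow_a) + count_nodes(workflow_b)
print(f"{separate} nodes when run separately, {len(merged)} after merging")
```

Here the shared retrieval appears once in the merged graph, so it would be executed once and its result reused by both downstream products.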

Earthkit-workflows is also the core of Forecast-in-a-Box, ECMWF’s offering that combines data-driven weather forecasting models with meteorological product generation. Forecast-in-a-Box is portable across personal workstations, high-powered local devices, and cloud computing, and is aimed at non-technical users. GPU support is particularly critical here, because it enables efficient inference for data-driven weather forecasting models outside HPC environments as well as within them.

How to cite: Wong, J., Tuma, V., Cook, H., Carton de Wiart, C., Iffrig, O., Hawkes, J., and Quintino, T.: Minimising I/O, maximising throughput: earthkit-workflows, a task-graph engine for heterogeneous systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20436, https://doi.org/10.5194/egusphere-egu26-20436, 2026.