- 1 Ghent University - Hydro-Climate Extremes Lab, Ghent, Belgium
- 2 European Space Agency, Frascati, Italy
The Global Land Evaporation Amsterdam Model (GLEAM) estimates daily land evaporation using a wide range of Earth observation forcing datasets. In the GLEAM-HR project, funded by the European Space Agency (ESA), we aim to create a global high-resolution daily evaporation dataset at 1 km for a period of eight years (2016–2023). To produce high-resolution evaporation estimates, all forcing data must be processed at 1 km resolution, which requires substantial computational resources. As the complete high-resolution forcing data no longer fits within the memory capacity of a single HPC node, parallelization tools are necessary. To achieve this parallelization in a seamless way, a workflow orchestration ecosystem is designed that builds on Zarr, Apptainer and Nextflow.
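To give a sense of scale for the memory constraint, the following back-of-the-envelope sketch estimates the uncompressed size of a single forcing variable. The grid definition (a regular 0.01° latitude/longitude grid, roughly 1 km at the equator) and the 4-byte float32 storage are assumptions made for illustration only; the abstract does not specify either.

```python
# Rough memory footprint of one forcing variable on a global ~1 km grid.
# Assumptions (not stated in the abstract): a regular 0.01-degree lat/lon
# grid and 4-byte float32 values, with no compression.

N_LAT = 18_000        # 180 degrees / 0.01 degrees
N_LON = 36_000        # 360 degrees / 0.01 degrees
BYTES_PER_VALUE = 4   # float32


def footprint_gib(n_days: int) -> float:
    """Uncompressed size in GiB of one variable over n_days daily time steps."""
    return N_LAT * N_LON * BYTES_PER_VALUE * n_days / 2**30


one_day = footprint_gib(1)
one_year = footprint_gib(365)
print(f"one day:  {one_day:.2f} GiB")
print(f"one year: {one_year:.0f} GiB")
```

Under these assumptions a single daily field is already a few GiB, and one year of one variable approaches a terabyte, so several variables over several years cannot be held in the memory of one node.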
The Zarr ecosystem makes it straightforward to write to a single dataset in parallel. Nextflow is an orchestration tool that allows dynamic job submissions, where the configuration of a job, such as the spatial domain to be processed, can depend on the outcome of earlier jobs. Apptainer is a containerization tool developed for HPC environments, allowing a “build once, deploy anywhere” approach. Combining these tools yields a workflow orchestration environment that automates these parallel workflows while optimizing the job sizes for a given HPC environment.
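One way to picture how parallel jobs can share a single Zarr dataset is to split the global grid into tiles whose edges coincide with chunk boundaries, so that no two jobs ever write to the same chunk. The sketch below is a minimal illustration, not the project's implementation: the grid shape, the chunk shape, the number of chunks per tile, and the names `Tile` and `chunk_aligned_tiles` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Tile:
    """Half-open index ranges [lat0, lat1) x [lon0, lon1) handled by one job."""
    lat0: int
    lat1: int
    lon0: int
    lon1: int

    @property
    def n_cells(self) -> int:
        return (self.lat1 - self.lat0) * (self.lon1 - self.lon0)


def chunk_aligned_tiles(
    shape: Tuple[int, int],
    chunks: Tuple[int, int],
    chunks_per_tile: Tuple[int, int],
) -> Iterator[Tile]:
    """Yield tiles that each cover a whole number of chunks.

    Tile edges fall on multiples of the chunk size, so two different tiles
    never share a chunk. Tiles on the last row or column are clipped to the
    array shape.
    """
    n_lat, n_lon = shape
    step_lat = chunks[0] * chunks_per_tile[0]
    step_lon = chunks[1] * chunks_per_tile[1]
    for lat0 in range(0, n_lat, step_lat):
        for lon0 in range(0, n_lon, step_lon):
            yield Tile(
                lat0, min(lat0 + step_lat, n_lat),
                lon0, min(lon0 + step_lon, n_lon),
            )


# Hypothetical setup: 0.01-degree global grid, 1000 x 1000 chunks,
# and 3 x 4 chunks per job. Larger tiles mean fewer, heavier jobs.
tiles = list(chunk_aligned_tiles((18_000, 36_000), (1_000, 1_000), (3, 4)))
print(len(tiles), "jobs, largest tile has", max(t.n_cells for t in tiles), "cells")
```

Each job would then compute its block and assign it to the matching region of the shared array, for example `array[t.lat0:t.lat1, t.lon0:t.lon1] = block`. Changing `chunks_per_tile` is one simple knob for adapting the job size to the memory available per node.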
The use of containers allows this workflow to be ported to different hardware without setting up all the environments again, making the designed workflow fully reproducible independent of the computing environment. Combined with Continuous Integration and Continuous Delivery (CI/CD) practices that automate container building and deployment, this cleanly separates code development from workflow execution.
In a first test case, this processing workflow is used to produce global datasets of leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FPAR) and vegetation cover fractions at 1 km resolution. Future work focuses on extending this workflow to the other forcing datasets and to the execution of the entire pipeline.
How to cite: Massant, J., Baez-Villanueva, O., Delbaere, K., Fernandez Prieto, D., and Miralles, D.: Parallel HPC workflow orchestration with Nextflow, supported by CI/CD and containerization tools for global high resolution evaporation modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5728, https://doi.org/10.5194/egusphere-egu26-5728, 2026.