EGU24-11774, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-11774
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

An end-to-end workflow for climate data management and analysis integrating HPC, Big Data and Machine Learning

Alessandro D'Anca (1), Sonia Scardigno (1), Jorge Ejarque (2), Gabriele Accarino (1), Daniele Peano (1), Francesco Immorlano (1), Davide Donno (1), Enrico Scoccimarro (1), Rosa M. Badia (2), and Giovanni Aloisio (1)
  • (1) Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC), Lecce, Italy
  • (2) Barcelona Supercomputing Center (BSC), Barcelona, Spain

Advances in Earth System Models (ESMs), together with the availability of more powerful computing infrastructures and novel solutions for Big Data and Machine Learning (ML), are pushing research in the climate change field forward. In this context, workflows are fundamental tools for automating the complex processes of model simulation, data preparation and analysis. Such tools are becoming more important as the complexity and heterogeneity of the software and computing infrastructures, as well as the data volumes to be handled, grow. However, integrating simulation and data-centric processes into a single workflow can be very challenging because of their different requirements.
This work presents an end-to-end workflow, developed in the context of the eFlows4HPC EuroHPC project, that covers the steps from the numerical ESM simulation run to the analysis of extreme weather events (e.g., heat waves and tropical cyclones). It represents a real case study that requires components from High Performance Computing (HPC), Big Data and ML. In particular, the contribution demonstrates how the eFlows4HPC software stack can simplify the development, deployment, orchestration and execution of complex end-to-end workflows for climate science, as well as improve their portability across different computing infrastructures.

How to cite: D'Anca, A., Scardigno, S., Ejarque, J., Accarino, G., Peano, D., Immorlano, F., Donno, D., Scoccimarro, E., Badia, R. M., and Aloisio, G.: An end-to-end workflow for climate data management and analysis integrating HPC, Big Data and Machine Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11774, https://doi.org/10.5194/egusphere-egu24-11774, 2024.

Supplementary materials

supplementary materials version 1 – uploaded on 16 Apr 2024, no comments