The recent adoption of Open Data policies and investments in cloud-based platforms have attracted a growing number of consumers of ECMWF data. An example of these initiatives is the European Weather Cloud (EWCloud), where users wish to run automated, real-time tasks or workflows close to the latest data produced by the model run in the ECMWF HPC facility, thus avoiding costly data transfers out of the data centre. This trend is likely to strengthen as the volume of weather forecast data grows exponentially. Taking into account resolution upgrades and more complex model physics, the raw forecast data is expected to exceed a petabyte per day within the next few years. From an operational perspective, this convergence of HPC and cloud infrastructures depends on timely synchronisation with the forecast schedule. A mechanism is needed to notify consumers, in a scalable manner, that specific data is available, and to let them automatically trigger their workflows based on this data.
To accomplish this, we are developing a system, named "Aviso"¹, designed to notify users of the availability of real-time forecast data or derived products, and to trigger user-defined workflows automatically. End-users can build their workflows around events, using a When <this>... Do <that> logic directly linked to ECMWF metadata semantics. The system is composed of a server application based on a persistent key-value store, leveraging modern technologies such as etcd, to provide consistency, transactionality, reliability and scalability to the end-users. The client side is a lightweight Python application providing a CLI as well as a Python API for easy integration into the users’ workflows. Finally, the notifications can be exchanged as CloudEvents messages, allowing workflows to span multiple data centres and cloud-based infrastructures.
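The When <this>... Do <that> logic described above can be illustrated with a minimal, self-contained Python sketch. It is not Aviso's actual API: the `Listener` class, the `to_cloudevent` and `dispatch` helpers, the event `source` and `type` strings, and the data location are all hypothetical names chosen for illustration. The metadata keys (`class`, `stream`, `date`, `time`, `step`) are only an assumed example of ECMWF-style metadata, and the envelope carries the attributes that the CloudEvents 1.0 specification marks as required (`specversion`, `id`, `source`, `type`).

```python
import json
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass
class Listener:
    """Hypothetical 'When <this> ... Do <that>' rule (not Aviso's real API)."""
    when: Dict[str, List[str]]      # metadata key -> accepted values
    do: Callable[[dict], None]      # action to run when the filter matches

    def matches(self, metadata: Dict[str, str]) -> bool:
        # Every key named in the filter must be present with an accepted value.
        return all(metadata.get(key) in accepted
                   for key, accepted in self.when.items())


def to_cloudevent(metadata: Dict[str, str], location: str) -> dict:
    """Wrap a data-availability notification in a CloudEvents-style envelope."""
    return {
        "specversion": "1.0",                      # required by CloudEvents 1.0
        "id": str(uuid.uuid4()),                   # required
        "source": "/example/notification-server",  # required; placeholder value
        "type": "org.example.data.available",      # required; placeholder value
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": {"request": metadata, "location": location},
    }


def dispatch(event: dict, listeners: List[Listener]) -> int:
    """Run the action of every listener whose filter matches the event."""
    triggered = 0
    for listener in listeners:
        if listener.matches(event["data"]["request"]):
            listener.do(event)
            triggered += 1
    return triggered


# Usage: trigger a workflow step when selected forecast steps become available.
received: List[dict] = []
listener = Listener(
    when={"stream": ["enfo"], "step": ["0", "6", "12"]},
    do=received.append,  # stand-in for launching a user workflow
)
event = to_cloudevent(
    {"class": "od", "stream": "enfo", "date": "20210906", "time": "00", "step": "12"},
    location="s3://example-bucket/example.grib",  # hypothetical location
)
triggered = dispatch(event, [listener])
serialised = json.dumps(event)  # the envelope is plain JSON and can cross sites
```

In this sketch the filter is evaluated on the client side for simplicity. In a system built on a key-value store, the same metadata could instead be encoded into the keys that clients watch, which is one plausible way to keep matching scalable.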
This presentation will show how to leverage Aviso for scheduling weather-related workflows in the context of two European Horizon 2020 projects, LEXIS and HiDALGO. The LEXIS project focuses on how HPC and cloud systems interact to enable complex workflows, and is demonstrating this concept through three large-scale socio-economic pilots, targeting aeronautics, weather & climate, and catastrophe alert systems. The HiDALGO project focuses on improving data-centric, on-demand computational modelling workflows for accurate policy-making in the domain of Global Challenges, such as human migration, urban air pollution, the COVID-19 pandemic and malicious information in social media. Aviso is also a component of ECMWF's Scalability Programme, and is being introduced in pre-operational status to ensure scalable data availability notification to data consumers.
This work has received funding from the European Union’s H2020 research and innovation programme under grant agreement numbers 825532 and 824115.
Footnotes:
1) "Aviso" means 'notification' in multiple Latin-based languages.
How to cite: Iacopino, C., Hawkes, J., Quintino, T., and Raoult, B.: NWP Data availability notifications for meteorological workflows across HPC and Cloud data centres, EMS Annual Meeting 2021, online, 6–10 Sep 2021, EMS2021-17, https://doi.org/10.5194/ems2021-17, 2021.