A novel data ecosystem for coastal analyses
- 1Deltares, Boussinesqweg 1, 2629 HV Delft, The Netherlands (floris.calkoen@deltares.nl)
- 2Deltares, Boussinesqweg 1, 2629 HV Delft, The Netherlands
- 3Department of Hydraulic Engineering, Faculty of Civil Engineering and Geosciences, Delft University of Technology, P.O. Box 5048, 2600 GA Delft, The Netherlands
The coastal community widely anticipates that, in the coming years, data-driven studies will make essential contributions to long-term coastal adaptation and mitigation strategies at continental scale. This view is also supported by CoCliCo, a Horizon 2020 project, in which coastal data form the fundamental building block for an open web portal that aims to improve decision making on coastal risk management and adaptation. The promise of data is likely triggered by several coastal analyses that showed how the coastal zone can be monitored at unprecedented spatial scales using geospatial cloud platforms. However, we note that when analyses become more complex, i.e., require specific algorithms, pre- and post-processing, or data that are not hosted by the cloud provider, the cloud-native processing workflows often break, which makes analyses at continental scale impractical.
We believe that the next generation of data-driven coastal models that target continental scales can only be built when: 1) processing workflows are scalable; 2) computations are run in proximity to the data; 3) data are available in cloud-optimized formats; and 4) data are described following standardized metadata specifications. In this study, we introduce these practices to the coastal research community by showcasing the advantages of cloud-native workflows in two case studies.
In the first example, we map building footprints in areas prone to coastal flooding and estimate the assets at risk. For this analysis, we chunk a coastal flood-risk map into several tiles and incorporate those into a coastal SpatioTemporal Asset Catalog (STAC). The second example benchmarks instantaneous shoreline mapping using cloud-native workflows against conventional methods. With data-proximate computing, processing time is reduced from the order of hours to seconds per kilometre of shoreline, which means that a highly specialized coastal mapping expedition can be upscaled from regional to global level.
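The tiling and cataloguing step of the first example can be illustrated with a minimal sketch. It is not the code used in this study: it assumes a regular grid over a geographic bounding box, builds plain dictionaries that follow the STAC Item layout (version 1.0.0), and uses hypothetical names throughout (`tile_bbox`, `make_stac_item`, `items_intersecting`, the `flood-tile-` identifiers, the `example.com` URLs and the placeholder date).

```python
from datetime import datetime, timezone


def tile_bbox(bbox, nx, ny):
    """Split (minx, miny, maxx, maxy) into nx * ny equal tiles, row by row."""
    minx, miny, maxx, maxy = bbox
    dx = (maxx - minx) / nx
    dy = (maxy - miny) / ny
    return [
        (minx + i * dx, miny + j * dy, minx + (i + 1) * dx, miny + (j + 1) * dy)
        for j in range(ny)
        for i in range(nx)
    ]


def make_stac_item(item_id, bbox, href, when):
    """Build a minimal STAC Item (as a dict) describing one tile."""
    minx, miny, maxx, maxy = bbox
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": list(bbox),
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"datetime": when.isoformat()},
        "links": [],
        # The asset points at the tile itself; the href is a placeholder.
        "assets": {"data": {"href": href, "roles": ["data"]}},
    }


def items_intersecting(items, query_bbox):
    """Return the items whose bbox overlaps query_bbox (simple bbox test)."""
    qminx, qminy, qmaxx, qmaxy = query_bbox
    return [
        item
        for item in items
        if item["bbox"][0] < qmaxx
        and item["bbox"][2] > qminx
        and item["bbox"][1] < qmaxy
        and item["bbox"][3] > qminy
    ]


if __name__ == "__main__":
    # Hypothetical extent and acquisition date, for illustration only.
    when = datetime(2023, 1, 1, tzinfo=timezone.utc)
    tiles = tile_bbox((0, 50, 8, 54), nx=4, ny=2)
    items = [
        make_stac_item(
            f"flood-tile-{k:03d}", tile, f"https://example.com/flood/tile-{k:03d}.tif", when
        )
        for k, tile in enumerate(tiles)
    ]
    hits = items_intersecting(items, (3.0, 51.0, 3.5, 51.5))
    print([item["id"] for item in hits])
```

Because each tile carries its own bounding box and asset link, a client can select only the tiles that overlap an area of interest and read those, instead of downloading the full map. In practice a library such as pystac would typically be used to build and validate the catalog.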
The analyses mostly rely on "core packages" from the Pangeo project, with some additional support for scalable geospatial data analysis and cloud I/O, although they can essentially be run on a standard Python Planetary Computer instance. We publish our code, including self-explanatory Jupyter notebooks, at https://github.com/floriscalkoen/egu2023.
To conclude, we foresee that in the coming years several coastal data products will be published, some of which may be considered "big data". To incorporate these data products into the next generation of coastal models, the community urgently needs to agree upon protocols for coastal data stewardship. With this study, we not only want to show the advantages of scalable coastal data analysis; above all, we want to encourage the coastal research community to adopt FAIR data management principles and workflows in an era of exponential data growth.
How to cite: Calkoen, F., Baart, F., Kras, E., and Luijendijk, A.: A novel data ecosystem for coastal analyses, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-15964, https://doi.org/10.5194/egusphere-egu23-15964, 2023.