EGU26-7107, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-7107
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Friday, 08 May, 16:15–18:00 (CEST), Display time Friday, 08 May, 14:00–18:00 | Hall X4, X4.94
Automated workflows for ever-growing, analysis-ready datasets at the Barbados Cloud Observatory
Rowan Orlijan-Rhyne, Lukas Kluft, and Tobias Kölling
  • Max-Planck-Institut für Meteorologie, Hamburg, DE (rowan.orlijan-rhyne@mpimet.mpg.de)

The Barbados Cloud Observatory (BCO), operated continuously by the Max Planck Institute for Meteorology, offers an extensive record of clouds in the trade-wind region since its establishment in 2010. Published as public, analysis-ready zarr stores that are processed with automated workflows, the record can be studied at time scales from seconds to years and serves to drive theoretical and model advancements. As an important geoscientific research asset, data from the BCO is trustworthy, reproducible, and versioned, yet also easily accessible.

BCO data processing uses automated Apache Airflow workflows, which append to zarr stores whenever new data arrives. Managing dynamic, growing datasets, as opposed to static (e.g. campaign) datasets, permits many versions, all of which are accurate and can be regenerated automatically. In shepherding the data, we choose our own unique keys, including dataset version numbers, which together make up an intake catalog. We also implement quality control of dataset metadata and encodings with in-house tools.
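The append-on-arrival pattern and versioned catalog keys described above can be illustrated with a minimal sketch. It uses only the Python standard library and deliberately stands in for the real pipeline: it calls neither Airflow nor zarr, and the `GrowingStore` class, the `catalog_key` function, the key layout, and the instrument name are all hypothetical choices made for illustration, not the BCO implementation.

```python
from dataclasses import dataclass, field


@dataclass
class GrowingStore:
    """In-memory stand-in for an append-only, time-indexed store (hypothetical)."""

    times: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def append_new(self, times, values):
        """Append only samples newer than the last stored time; return the count added."""
        last = self.times[-1] if self.times else None
        added = 0
        for t, v in sorted(zip(times, values)):
            if last is None or t > last:
                self.times.append(t)
                self.values.append(v)
                last = t
                added += 1
        return added


def catalog_key(site, instrument, level, version):
    """Build an illustrative catalog key; this layout is an assumption."""
    return f"{site}.{instrument}.{level}.v{version}"


# A plain dict plays the role of the catalog: unique key -> store.
catalog = {}
key = catalog_key("BCO", "example_instrument", "raw", 1)
store = catalog.setdefault(key, GrowingStore())

first = store.append_new([0, 60, 120], [1.0, 2.0, 3.0])
# Re-running a task with overlapping input adds only the unseen sample.
second = store.append_new([60, 120, 180], [2.0, 3.0, 4.0])
```

Skipping samples that are already stored makes a repeated workflow run harmless, which is one simple way a scheduled task could append safely to a growing dataset.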

Because the data is processed on a rolling basis, often at daily intervals, our products can readily be probed for scientific, technical, and other uses. For instance, we develop a JavaScript viewer that lets users quickly visualize data from many instruments. Additionally, by providing raw (i.e. directly from the instrument, as format permits), time-aggregated, commonly gridded, and sitewide 'best estimate' datasets, we offer several levels of processing complexity to serve a range of needs. These usability advantages follow from our technical approach, namely automated workflows and analysis-ready zarr stores.
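As a rough illustration of how a time-aggregated product relates to a raw one, the sketch below averages raw samples into fixed-width time bins. It is an assumption-laden example, not the BCO processing code: the `aggregate_mean` function, the choice of a simple mean, and the 60-second bin width are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_mean(samples, bin_seconds):
    """Average (time_in_seconds, value) samples into fixed-width time bins.

    Returns a list of (bin_start, mean_value) pairs sorted by bin start.
    """
    bins = defaultdict(list)
    for t, value in samples:
        # Integer division assigns each sample to the start of its bin.
        bin_start = (t // bin_seconds) * bin_seconds
        bins[bin_start].append(value)
    return [(start, fmean(vals)) for start, vals in sorted(bins.items())]


# Hypothetical raw samples as (seconds since start, measured value).
raw = [(0, 1.0), (10, 3.0), (59, 2.0), (60, 4.0), (125, 6.0)]
minute_means = aggregate_mean(raw, 60)
```

Keeping the raw samples alongside the aggregated result mirrors the idea of offering several processing levels, so users can choose the one that fits their needs.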

How to cite: Orlijan-Rhyne, R., Kluft, L., and Kölling, T.: Automated workflows for ever-growing, analysis-ready datasets at the Barbados Cloud Observatory, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7107, https://doi.org/10.5194/egusphere-egu26-7107, 2026.