- Max-Planck-Institut für Meteorologie, Hamburg, DE (rowan.orlijan-rhyne@mpimet.mpg.de)
The Barbados Cloud Observatory (BCO), operated continuously by the Max Planck Institute for Meteorology since its founding in 2010, offers an extensive record of clouds in the trade-wind region. Published as public, analysis-ready Zarr stores produced by automated workflows, the record can be studied at time scales from seconds to years and supports advances in theory and modeling. As an important geoscientific research asset, data from the BCO are trustworthy, reproducible, and versioned, yet also easily accessible.
BCO data processing uses automated Apache Airflow workflows that append to Zarr stores whenever new data arrive. Managing dynamic, growing datasets, as opposed to static (e.g. campaign) datasets, permits many versions, all of which are accurate and can be regenerated automatically. In curating the data, we define our own unique keys, including dataset version numbers, which together make up an Intake catalog. We also run quality control on dataset metadata and encodings with in-house tools.
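As an illustration of the metadata quality control mentioned above, the following is a minimal sketch in plain Python. The in-house tools themselves are not described here, so the function name `check_metadata` and the lists of required attributes are hypothetical stand-ins, and the dataset is represented by plain dictionaries rather than an actual Zarr store.

```python
# Hypothetical rule set: these attribute names are assumptions for illustration,
# not the rules enforced by the BCO's in-house tools.
REQUIRED_GLOBAL_ATTRS = ("title", "version", "license")
REQUIRED_VARIABLE_ATTRS = ("units", "long_name")


def _is_blank(value):
    """Treat a missing attribute or a whitespace-only string as blank."""
    return value is None or not str(value).strip()


def check_metadata(global_attrs, variables):
    """Return a list of problems; an empty list means the check passed.

    global_attrs: dict of dataset-level attributes.
    variables: dict mapping variable name -> dict of that variable's attributes.
    """
    problems = []
    for name in REQUIRED_GLOBAL_ATTRS:
        if _is_blank(global_attrs.get(name)):
            problems.append(f"missing global attribute '{name}'")
    # Sort variable names so the report is stable between runs.
    for var_name, attrs in sorted(variables.items()):
        for name in REQUIRED_VARIABLE_ATTRS:
            if _is_blank(attrs.get(name)):
                problems.append(f"{var_name}: missing attribute '{name}'")
    return problems


# Example input with two deliberate gaps: no license, and no units on "time".
example_global_attrs = {"title": "Example cloud radar", "version": "1.0.0"}
example_variables = {
    "reflectivity": {"units": "dBZ", "long_name": "Radar reflectivity"},
    "time": {"long_name": "time"},
}
example_problems = check_metadata(example_global_attrs, example_variables)
```

A check of this kind could run as one task in an automated workflow, failing the run before a dataset with incomplete metadata is published. Checking encodings (for example chunk sizes or data types) would follow the same pattern with a different rule set.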
Because the data are processed on a rolling basis, often at daily intervals, our products can be readily explored for scientific, technical, and other purposes. For instance, we develop a JavaScript viewer that lets users quickly visualize data from many instruments. Additionally, by providing raw (i.e. directly from the instrument, as far as the format permits), time-aggregated, commonly gridded, and site-wide 'best estimate' datasets, we offer several levels of processing complexity to serve a range of needs. These usability advantages follow directly from our technical approach, namely automated workflows and analysis-ready Zarr stores.
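To make the idea of a time-aggregated product concrete, here is a minimal sketch that averages timestamped samples into fixed-width bins. It is written in plain Python and is not the BCO's aggregation procedure: the function name `aggregate_mean`, the choice of the mean as the statistic, and the alignment of bins to the Unix epoch are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import fmean

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def aggregate_mean(samples, interval):
    """Average (timestamp, value) pairs into bins of width `interval`.

    Timestamps must be timezone-aware. Bins are aligned to the Unix epoch and
    keyed by their start time. Missing or NaN values are not handled here.
    """
    bins = defaultdict(list)
    for timestamp, value in samples:
        # Dividing one timedelta by another with // gives the integer bin index.
        index = (timestamp - EPOCH) // interval
        bins[EPOCH + index * interval].append(value)
    return {start: fmean(values) for start, values in sorted(bins.items())}


# Example: three samples at 30-second spacing, aggregated to one-minute means.
t0 = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
example_samples = [
    (t0, 1.0),
    (t0 + timedelta(seconds=30), 3.0),
    (t0 + timedelta(seconds=60), 5.0),
]
example_means = aggregate_mean(example_samples, timedelta(minutes=1))
```

In this example the first two samples fall into the bin starting at `t0` and the third into the next minute. Because each bin depends only on the samples inside it, a rolling workflow would only need to recompute the bins touched by newly arrived data.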
How to cite: Orlijan-Rhyne, R., Kluft, L., and Kölling, T.: Automated workflows for ever-growing, analysis-ready datasets at the Barbados Cloud Observatory, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7107, https://doi.org/10.5194/egusphere-egu26-7107, 2026.