era5cli: from a utility script to a reusable software package
- 1 Netherlands eScience Center, Delft, Netherlands (b.schilperoort@esciencecenter.nl)
- 2 Department of Water Management, Faculty of Civil Engineering and Geosciences, Delft University of Technology, Delft, Netherlands
The ERA5 meteorological reanalysis dataset, from the European Centre for Medium-Range Weather Forecasts (ECMWF), is widely used in areas such as meteorology, hydrology and land-surface modelling. The Copernicus Climate Data Store (CDS) offers two options for accessing the data: a web interface and a Python API. However, automated downloading of the data requires advanced knowledge of Python, and can prove challenging for people less familiar with programming.
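To illustrate the Python API route, the following is a minimal sketch of the kind of script a user might write with the cdsapi package. The helper names build_request and download are hypothetical, and the request keys follow the form the CDS accepted around the time of this abstract; they are assumptions that may not match the current CDS. Running download also requires a CDS account and a configured API key.

```python
import calendar


def build_request(variable, year, month):
    """Build a request for one month of hourly ERA5 single-level data.

    Hypothetical helper: the key names are assumed from the CDS request
    format in use around 2023 and may differ on the current CDS.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return {
        "product_type": "reanalysis",
        "variable": variable,
        "year": str(year),
        "month": f"{month:02d}",
        "day": [f"{day:02d}" for day in range(1, days_in_month + 1)],
        "time": [f"{hour:02d}:00" for hour in range(24)],
        "format": "netcdf",
    }


def download(variable, year, month, target):
    """Send the request to the CDS (needs network access and credentials)."""
    import cdsapi  # third-party package, imported lazily

    client = cdsapi.Client()
    client.retrieve(
        "reanalysis-era5-single-levels",
        build_request(variable, year, month),
        target,
    )


if __name__ == "__main__":
    # Build (but do not send) a request for February 2008.
    request = build_request("2m_temperature", 2008, 2)
    print(request["year"], request["month"], len(request["day"]))
```

Even this small sketch requires the user to know the dataset name, the request keys and the expected string formats, which is the barrier described above.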
Many climate scientists have their own Python scripts to download data from the CDS, and each scientist is responsible for creating and maintaining their own. A quick search on GitHub for Python scripts that call the CDS API yields 1802 results, and this does not even count scripts stored privately. However, these scripts are by and large not reusable. A few years ago we created era5cli, as a byproduct of a project we were working on, to try to break this pattern of single-use scripts. era5cli enables automated downloading of ERA5 data using a single command.
It is inefficient for everyone to write their own copy of the same, or at least similar, code. That is why we asked ourselves whether era5cli is still filling a niche and, if so, what we could do to make it easier for others to reuse. In this presentation we give an overview of our recent efforts to turn era5cli from a utility script into a reusable software package.
Although era5cli is relatively small, at around 1000 lines of Python code and comments, maintaining it is not trivial. Changes are occasionally made to ERA5 and the CDS, and new Python versions are released while old ones are deprecated. Users of era5cli have helped here by submitting fixes for issues they have found as GitHub pull requests, but these contributions still require guidance and/or approval from administrators. By reducing the maintenance load of era5cli, through targeted streamlining of the code and a clean-up of the repository, and by expanding the developer instructions in the documentation, we lower the threshold for community contributions and successful future maintenance. With this, we aim to make era5cli future-proof.
era5cli can be installed using Python’s pip (pip install era5cli), as well as using conda/mamba (conda install era5cli -c conda-forge). The source code for era5cli is available at https://github.com/eWaterCycle/era5cli, and the documentation can be found at https://era5cli.readthedocs.io/.
How to cite: Schilperoort, B., Kalverla, P., Vreede, B., Verhoeven, S., Alidoost, F., Liu, Y., Drost, N., Aerts, J., and Hut, R.: era5cli: from a utility script to a reusable software package, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-14302, https://doi.org/10.5194/egusphere-egu23-14302, 2023.