SC57 Working with big, multi-dimensional geoscientific datasets in Python: a tutorial introduction to xarray
Convener: Edward A. Byers | Co-Conveners: Matthew Gidden, Fabien Maussion
Thu, 27 Apr, 15:30–17:00
Data generation and processing requirements in the physical sciences are growing rapidly, putting pressure not only on computing resources but also on scientists’ capabilities for working with medium and big data. Scientists increasingly need software that is open-source, easy to use and scalable for both distributed and high-performance computing.
Python, a general-purpose programming language widely used in the sciences, is increasingly of interest to geoscientists because of its very large user base and support network, its cross-platform open-source development, and its combination of high-performance computing capabilities with ease of data processing. This course introduces “xarray”, a Python package developed specifically with the physical sciences in mind.
In this course, we will introduce and demonstrate: the benefits of using Python and its key data-wrangling libraries, pandas and xarray, when working with climate and geoscientific data; working with large, multi-dimensional, labeled datasets in memory and via netCDF; common data operations including indexing, selecting, groupby, and plotting; computation and custom functions; xarray’s capabilities for out-of-core, out-of-memory and parallel computation.
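As a rough preview of the operations listed above, the following minimal sketch builds a small labeled array and applies selection, indexing, groupby and a reduction over named dimensions. It is not the course material: the data are random numbers standing in for climate model output, and the variable names, grid and units are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for climate model output (NOT real data): two years of
# daily temperature on a coarse 3 x 4 latitude-longitude grid.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
lat = np.array([-45.0, 0.0, 45.0])
lon = np.array([0.0, 90.0, 180.0, 270.0])

rng = np.random.default_rng(seed=0)
values = 15.0 + 8.0 * rng.standard_normal((time.size, lat.size, lon.size))

# A DataArray couples the raw values with dimension names and coordinates.
temperature = xr.DataArray(
    values,
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": lat, "lon": lon},
    name="temperature",
    attrs={"units": "degC"},
)

# Label-based selection: one grid point, all days of the year 2000.
point_2000 = temperature.sel(lat=45.0, lon=90.0, time="2000")

# Position-based indexing: the first time step, as a lat-lon field.
first_day = temperature.isel(time=0)

# Groupby: average all days that fall in the same calendar month.
monthly_mean = temperature.groupby("time.month").mean("time")

# Reduction over named dimensions: one spatial mean per day.
spatial_mean = temperature.mean(dim=["lat", "lon"])

print(point_2000.sizes, first_day.sizes, monthly_mean.sizes, spatial_mean.sizes)
```

In a Jupyter notebook, and assuming matplotlib is also installed, a call such as `first_day.plot()` would draw the selected field with labeled axes.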
We will work with real climate model data and show how the ease of use of Python and xarray in Jupyter notebooks has helped in teaching master's-level climate science students with little programming experience.
This course is open to all and is intended for both experienced and inexperienced programmers.
The intended duration of this course is one time block with approximately 70 participants.
The tools presented in this course:
Jupyter Notebook: http://jupyter.org/
About the organisers:
Edward Byers is a Postdoctoral Research Scholar at the International Institute for Applied Systems Analysis (Laxenburg, Austria) researching climate change impacts on the energy system (using Python).
Matthew Gidden is a Research Scholar at the International Institute for Applied Systems Analysis (Laxenburg, Austria) and works on spatial modelling of energy systems. He is a trained Software Carpentry instructor.
Fabien Maussion (http://fabienmaussion.info/) is an assistant professor at the University of Innsbruck, Austria. He uses Python intensively for his climate science and glaciology courses, and is a member of the xarray package development team.
To follow the course and examples on your own laptop, we recommend a Python installation that includes the xarray, numpy and Jupyter Notebook packages.
You can follow instructions here:
Alternatively, download Anaconda or Miniconda from https://www.continuum.io/downloads (do this before EGU, as the installation files are large, about 400 MB).
This info will be updated soon!
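To check that a laptop has the recommended packages before the course, a short script along these lines may help. It is only a sketch: the helper name `missing_packages` is hypothetical, and the import name `notebook` for the Jupyter Notebook package is an assumption about how it is installed.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the packages recommended above.
recommended = ["xarray", "numpy", "notebook"]

missing = missing_packages(recommended)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All recommended packages were found.")
```

Any package reported as not installed can then be added with the package manager that came with the Python installation, for example conda or pip.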