Please note that this session was withdrawn and is no longer available in the respective programme. This withdrawal might have been the result of a merge with another session.


Reproducible research for big data in practice (co-organized)
Convener: Edzer Pebesma | Co-Conveners: Kerstin Lehnert, Jens Klump, Martin Hammitzsch, Daniel Nüst
Reproducibility is important for science. The topic gains more attention each year through prominent papers, editorials and blog posts. Journals, researchers and funders drive the agenda forward on many aspects of openness: open access, open science, open data, open source. Still, the vast majority of papers analysing any kind of data are not accompanied by the data, code and documentation that would let readers easily reproduce the calculations underlying the paper.

This session will showcase papers that focus on big data analysis and take reproducibility and openness into account. It is open to members of all programme groups and scientific disciplines, who are invited to present how they conduct data-based research in a reproducible way. Contributors are welcome to share practical advice, lessons learned and practical challenges of reproducibility, and to report on the application of tools and software that support computational reproducibility.

Computational reproducibility is especially important in the context of big data. Readers of articles must be able to trust the applied methods and computations, because the huge number of observations collected not only creates a large volume of data, but these data are also very often unique, observed by a single entity, or synthetic and simulated. Contributions based on small datasets are also of special interest, as they demonstrate the variety found in big data.

Topics may include, but are not limited to: reproducibility reports and packages for previously published computational research; practical evaluations of reproducibility solutions for a specific research use case; best practices towards reproducibility in a specific domain, such as publishing guidelines for data and code; or experiences from teaching methods for computational reproducibility.