SC 4.5 | Using distributed databases in your citizen science research
Co-organized by EOS4/ESSI2/GM12/HS11
Convener: Julien Malard-Adam (ECS) | Co-conveners: Ankit Agarwal, Wietske Medema, Joel Harms, Johanna Dipple

Database documentation and sharing are a crucial part of the scientific process, and more scientists are choosing to share their data on centralised data repositories. These repositories have the advantage of guaranteeing immutability (i.e., the data cannot change), but this makes them poorly suited to living databases that keep evolving (e.g., in continuous citizen science initiatives). At the same time, citizen science initiatives are becoming increasingly popular in various fields of science, from natural hazards to hydrology, ecology and agronomy.

In this context, distributed databases offer an innovative approach to both sharing data and letting it evolve. These systems have the distinct advantage of becoming more resilient and available as more users access the same data. Because distributed systems, unlike decentralised ones, do not use blockchain technology, they are also orders of magnitude more efficient in data storage and completely free to use. Distributed databases can also mirror existing data, so that scientists can keep working in Excel, OpenOffice, or other preferred software while automatically syncing database changes to the distributed web in real time.

This workshop will present the general concepts behind distributed, peer-to-peer systems. Attendees will then be guided through an interactive activity on Constellation, scientific software for distributed databases, learning how to create their own databases as well as how to access and use others' data from the network. Potential applications include citizen science projects for hydrological data collection, invasive species monitoring, or community participation in managing natural hazards such as floods.
