Addressing data fragmentation in biodiversity citizen science data: Pipelines for integrated species distribution Models

Anders Finstad; Sam Perin; Philip Mostert; Kwaku Adjei; Ron Togunov; Bob O'Hara

doi:https://doi.org/10.5194/wbf2026-643

Session IND5

WBF2026-643, updated on 22 Apr 2026

World Biodiversity Forum 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Oral | Wednesday, 17 Jun, 11:45–12:00 (CEST) | Room Jakobshorn

Anders Finstad¹, Sam Perin¹, Philip Mostert¹, Kwaku Adjei², Ron Togunov¹, and Bob O'Hara

¹Norwegian University of Science and Technology
²Norwegian Institute for Nature Research

Fragmented datasets, sampling bias and inconsistent observation protocols often limit the use of citizen science data for indicator development. Such data are typically collected opportunistically, without a sampling design suited to biodiversity metrics. However, their large volume and broad spatial and taxonomic coverage make them an invaluable source of biodiversity information at scale.

Here, we present a pipeline that integrates heterogeneous datasets to generate large-scale maps of biodiversity metrics. These maps form a basis for management-relevant information tools. We apply integrated species distribution modelling (iSDM) to correct for sampling bias and for differences in data collection methods, drawing on the many open datasets available through aggregators such as GBIF.

The workflow has four main steps: data acquisition, data integration, integrated species distribution modelling (iSDM) and the production of derived outputs. Input data include structured surveys, opportunistic observations and environmental covariates. We standardise these inputs and combine them in a common iSDM framework. This produces species intensity maps, associated uncertainty estimates and sampling effort maps. We further process these outputs to identify biodiversity hotspots and to summarise species-environment relationships.
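The four steps can be illustrated with a deliberately small sketch. It is not the pipeline presented here: the five-cell grid, all data values, the effort proxy, the parameter names (beta0, beta1, alpha) and the grid-search fit are assumptions chosen for illustration. It uses one common iSDM formulation, in which opportunistic counts follow a Poisson distribution whose mean is the species intensity thinned by sampling effort, while structured survey visits follow a Bernoulli distribution with detection probability 1 - exp(-intensity). Uncertainty estimation is omitted.

```python
import math

# Step 1, data acquisition: toy stand-ins for downloaded records (all values assumed).
covariate = [-1.0, -0.5, 0.0, 0.5, 1.0]      # one environmental covariate per grid cell
opportunistic_counts = [0, 1, 2, 4, 9]       # presence-only record counts per cell
survey_detected = [0, 0, 1, 1, 1]            # structured survey: 1 = detected, 0 = not detected

# Step 2, data integration: everything is aligned to the same cells.
# The effort proxy (e.g. accessibility) is assumed known up to a scaling factor.
effort_proxy = [0.2, 0.4, 0.6, 0.8, 1.0]


def intensity(beta0, beta1, x):
    """Log-linear species intensity shared by both datasets."""
    return math.exp(beta0 + beta1 * x)


def joint_log_likelihood(beta0, beta1, alpha):
    """Step 3, integrated model: sum the log-likelihoods of both data types."""
    total = 0.0
    # Opportunistic counts: Poisson, intensity thinned by scaled sampling effort.
    for x, effort, count in zip(covariate, effort_proxy, opportunistic_counts):
        mu = intensity(beta0, beta1, x) * math.exp(alpha) * effort
        total += count * math.log(mu) - mu - math.lgamma(count + 1)
    # Structured survey: P(detected) = 1 - exp(-intensity), P(not detected) = exp(-intensity).
    for x, detected in zip(covariate, survey_detected):
        lam = intensity(beta0, beta1, x)
        total += math.log(1.0 - math.exp(-lam)) if detected else -lam
    return total


def fit_by_grid_search():
    """Crude maximum-likelihood fit over a coarse parameter grid (illustration only)."""
    grid = [i / 4 for i in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25
    candidates = ((b0, b1, a) for b0 in grid for b1 in grid for a in grid)
    return max(candidates, key=lambda p: joint_log_likelihood(*p))


beta0_hat, beta1_hat, alpha_hat = fit_by_grid_search()

# Step 4, derived outputs: an intensity map and a sampling effort map.
intensity_map = [intensity(beta0_hat, beta1_hat, x) for x in covariate]
effort_map = [math.exp(alpha_hat) * e for e in effort_proxy]

print("estimated parameters:", beta0_hat, beta1_hat, alpha_hat)
print("intensity map:", [round(v, 2) for v in intensity_map])
print("effort map:", [round(v, 2) for v in effort_map])
```

A real application would replace the grid search with a proper spatial model fit and would propagate uncertainty into the mapped outputs.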

We use Norway as a case study because it has extensive opportunistic citizen science programmes. We produced detailed maps of species richness, biodiversity hotspots, uncertainty and sampling intensity. Our results show the potential of pipelines that integrate disparate datasets. Our example also reveals important limitations in the current body of data. In particular, it is not possible to infer and correct for sampling bias without data that allow estimation of the probability of occurrence. In practice, this means data that record both what was observed and what was not observed. Our study therefore demonstrates a clear need to incorporate more structured approaches into citizen science data collection. Such approaches should not replace opportunistic, curiosity-driven activity. They should complement it, supporting both the large data volumes and the high level of public engagement.
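One facet of this limitation can be shown numerically. The sketch below is a simplified illustration under assumed toy data and hypothetical parameter names, not the analysis reported here. With presence-only counts, the intercept of the species intensity (beta0) and the overall level of sampling effort (alpha) enter the likelihood only through their sum, so a common species that is rarely reported and a rarer species that is often reported fit the data equally well. Detection/non-detection data depend on the intensity alone and therefore separate the two scenarios.

```python
import math

# Toy data for three grid cells (all values are assumptions for illustration).
covariate = [-1.0, 0.0, 1.0]
presence_only_counts = [1, 3, 8]   # opportunistic records: presences only
survey_detections = [0, 0, 1]      # structured visits: detections and non-detections


def presence_only_loglik(beta0, beta1, alpha):
    """Poisson log-likelihood; the mean is intensity times overall effort exp(alpha)."""
    total = 0.0
    for x, n in zip(covariate, presence_only_counts):
        mu = math.exp(beta0 + beta1 * x + alpha)
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def survey_loglik(beta0, beta1):
    """Bernoulli log-likelihood with P(detected) = 1 - exp(-intensity)."""
    total = 0.0
    for x, detected in zip(covariate, survey_detections):
        lam = math.exp(beta0 + beta1 * x)
        total += math.log(1.0 - math.exp(-lam)) if detected else -lam
    return total


# Two scenarios with the same beta0 + alpha: (beta0, beta1, alpha).
common_rarely_reported = (0.5, 1.0, -1.0)
rarer_often_reported = (-0.5, 1.0, 0.0)

po_gap = abs(presence_only_loglik(*common_rarely_reported)
             - presence_only_loglik(*rarer_often_reported))
survey_gap = abs(survey_loglik(0.5, 1.0) - survey_loglik(-0.5, 1.0))

print("presence-only log-likelihood gap:", po_gap)   # the scenarios are indistinguishable
print("survey log-likelihood gap:", survey_gap)      # the scenarios differ
```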

How to cite: Finstad, A., Perin, S., Mostert, P., Adjei, K., Togunov, R., and O'Hara, B.: Addressing data fragmentation in biodiversity citizen science data: Pipelines for integrated species distribution Models, World Biodiversity Forum 2026, Davos, Switzerland, 14–19 Jun 2026, WBF2026-643, https://doi.org/10.5194/wbf2026-643, 2026.