- Yale University
Reliable biodiversity forecasts depend on knowing where species occur and how they interact with their environment. Yet most of the world’s species—particularly those that are rare, range-restricted, or threatened—remain severely under-sampled. These data gaps cascade through global biodiversity assessments, systematically excluding data-deficient species from conservation prioritization, climate vulnerability analyses, and scenario planning. As countries work toward the goals of the Kunming–Montreal Global Biodiversity Framework, the persistent invisibility of data-deficient species has become a major, but often unrecognized, barrier to evidence-based action.
We present a new predictive modeling framework that directly addresses this challenge by allowing species with few records to “borrow strength” from their closest relatives. Because related species tend to share aspects of their ecological niches, evolutionary history offers a powerful source of information where empirical data are lacking. By embedding phylogenetic relationships directly into a multi-species distribution model, we are able to generate robust environmental niche estimates and spatial predictions even for species with only 1–10 observations.
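The "borrow strength" idea can be illustrated with a deliberately simplified sketch. This is not the framework presented here, whose model structure is not detailed in this abstract; it is a one-dimensional Gaussian toy model under stated assumptions. Each species has a single niche optimum, the optima covary according to a phylogenetic covariance matrix (as under Brownian-motion trait evolution), and each species' sample mean has noise variance `tau2 / n_records`. The function name `phylo_shrunk_means`, the parameters `sigma2`, `tau2`, and `mu`, and all numbers are hypothetical.

```python
import numpy as np


def phylo_shrunk_means(sample_means, n_records, phylo_cov,
                       sigma2=1.0, tau2=1.0, mu=None):
    """Posterior mean of species niche optima under a toy Gaussian model.

    Assumed model (illustrative only, not the authors' method):
      theta ~ N(mu, sigma2 * phylo_cov)         # optima covary with relatedness
      ybar_i | theta_i ~ N(theta_i, tau2 / n_i)  # sample mean of n_i records
    """
    ybar = np.asarray(sample_means, dtype=float)
    n = np.asarray(n_records, dtype=float)
    prior_cov = sigma2 * np.asarray(phylo_cov, dtype=float)
    if mu is None:
        # Crude plug-in for the shared mean: record-weighted average.
        mu = np.average(ybar, weights=n)
    # Species with few records get large observation variance.
    noise_cov = np.diag(tau2 / n)
    # Standard Gaussian conditioning:
    # E[theta | ybar] = mu + P (P + D)^-1 (ybar - mu)
    return mu + prior_cov @ np.linalg.solve(prior_cov + noise_cov, ybar - mu)


# Three hypothetical species: A and B are close relatives, C is distant.
phylo_cov = np.array([[1.0, 0.9, 0.1],
                      [0.9, 1.0, 0.1],
                      [0.1, 0.1, 1.0]])
sample_means = np.array([20.0, 26.0, 10.0])  # e.g. mean temperature at records
n_records = np.array([500, 1, 500])          # species B has a single record

estimates = phylo_shrunk_means(sample_means, n_records, phylo_cov)
print(dict(zip("ABC", np.round(estimates, 2))))
```

In this toy setting, the estimates for the well-sampled species A and C stay close to their own sample means, while the single-record species B is pulled strongly toward its close relative A. As `n_records` for B grows, its estimate moves back toward its own data.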
Using a continental-scale analysis of South American vertebrates, we show that phylogenetically informed models substantially outperform traditional species distribution models (SDMs) under extreme data scarcity. The greatest gains occur for data-deficient species, whose predicted distributions and accuracy metrics improve markedly relative to standard SDMs. As species become data-sufficient, the performance of the two approaches converges, highlighting that the primary value of phylogenetic information lies in rescuing the species we understand the least.
Crucially, this work lays the foundation for better-informed conservation decisions by quantifying how excluding data-deficient species biases conservation analyses and alters our understanding of global change impacts, particularly in climate change scenarios and climate-vulnerability assessments. By making it possible to include under-sampled, understudied, and often highly threatened species in scenario planning and priority-setting exercises, the framework enables more inclusive and equitable conservation outcomes. Overall, it extends species distribution modeling to under-sampled taxa, reducing a major source of bias in ecological and conservation analyses.
How to cite: Sharma, S., Cohen, J., and Jetz, W.: Mapping All Species: Closing Biodiversity Data Gaps with Phylogenetically Informed Predictive Models, World Biodiversity Forum 2026, Davos, Switzerland, 14–19 Jun 2026, WBF2026-903, https://doi.org/10.5194/wbf2026-903, 2026.