Reevaluating Multimodal Approaches To Deep Species Distribution Models

Catherine Villeneuve; Mélisande Teng; Benjamin Akera; Hager Radi Abdelwahed; Robin Zbinden; Laura Pollock; Hugo Larochelle; Devis Tuia; David Rolnick

doi:https://doi.org/10.5194/wbf2026-813

Session IND11

WBF2026-813, updated on 10 Mar 2026

World Biodiversity Forum 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Oral | Wednesday, 17 Jun, 09:15–09:30 (CEST) | Room Sanada 1

Catherine Villeneuve¹,², Mélisande Teng²,³, Benjamin Akera¹,², Hager Radi Abdelwahed², Robin Zbinden⁴, Laura Pollock¹, Hugo Larochelle²,³, Devis Tuia⁴, and David Rolnick¹,²

¹McGill University, Montréal, Canada
²Mila–Quebec AI Institute, Montréal, Canada
³Université de Montréal, Montréal, Canada
⁴Environmental Computational Science and Earth Observation Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

Recently, deep learning approaches to species distribution models (SDMs) have increasingly focused on integrating information-rich modalities such as natural language and remote sensing, motivated by the hypothesis that capturing the non-linear relationships between these inputs and species occurrences should help compensate for limited biodiversity data in poorly monitored regions. However, while leveraging additional modalities has been shown to improve predictions in certain settings, we argue that these improvements remain highly dependent on the task formulation and dataset. We consider the SatBird dataset (Teng et al., 2023) as an illustrative example, showing that leveraging representations derived from satellite imagery does not consistently translate into performance improvements, especially in low-data regimes. We argue that multimodality should not be treated as a generic stepping stone towards improving deep learning-based SDMs, as it often rests on the naive assumption that any additional information will be beneficial regardless of its ecological relevance. We also highlight that multimodal approaches in deep learning-based SDMs are predominantly reducible to the inclusion of ever more abiotic covariates, and discuss how such a strategy can increase the risk of overfitting to sampling biases and amplify spurious correlations. Finally, we show that leveraging relevant, context-dependent biotic information offers a particularly promising alternative research direction, considering as case studies our work with 1) BATIS (Villeneuve et al., 2026), a novel Bayesian framework that iteratively refines prior predictions from an uncertainty-aware SDM using limited local observations in data-scarce regions, and 2) CISO (Abdelwahed et al., 2025), a novel transformer-based approach that leverages well-documented species groups to improve predictions for data-limited taxa.
Results with both BATIS and CISO suggest that universal solutions are unlikely to be sufficient to address current limitations in deep learning-based SDMs, and that further improvements in predictive performance are more likely to come from targeted approaches dedicated to specific data gaps and ecological contexts.
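
To illustrate the general idea of refining an SDM's prior prediction with a handful of local observations, the following is a minimal sketch of a conjugate Beta-Binomial update for a single species at a single site. It is not the BATIS implementation, whose details are not given in this abstract. The names `BetaBelief`, `prior_from_sdm`, and `update`, as well as the `strength` parameter (a pseudo-observation count standing in for the SDM's confidence), are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief over a species' encounter rate at one site."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        total = self.alpha + self.beta
        return (self.alpha * self.beta) / (total ** 2 * (total + 1.0))


def prior_from_sdm(mean: float, strength: float) -> BetaBelief:
    """Encode an SDM prediction as a Beta prior (illustrative assumption).

    `mean` is the predicted encounter rate; `strength` is a hypothetical
    pseudo-observation count expressing how much the prediction is trusted.
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if strength <= 0.0:
        raise ValueError("strength must be positive")
    return BetaBelief(alpha=mean * strength, beta=(1.0 - mean) * strength)


def update(belief: BetaBelief, detections: int, surveys: int) -> BetaBelief:
    """Conjugate update after `detections` detections in `surveys` local surveys."""
    if not 0 <= detections <= surveys:
        raise ValueError("detections must be between 0 and surveys")
    return BetaBelief(
        alpha=belief.alpha + detections,
        beta=belief.beta + (surveys - detections),
    )


# Usage: start from an SDM prediction of 0.2, then fold in local surveys
# one batch at a time as they become available.
prior = prior_from_sdm(mean=0.2, strength=10.0)
posterior = update(update(prior, detections=1, surveys=2), detections=2, surveys=3)
```

Because the update is conjugate, folding in observations batch by batch gives the same posterior as a single update on the pooled data, and the posterior variance shrinks as local surveys accumulate. A weaker prior (smaller `strength`) lets local observations dominate sooner.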

How to cite: Villeneuve, C., Teng, M., Akera, B., Radi Abdelwahed, H., Zbinden, R., Pollock, L., Larochelle, H., Tuia, D., and Rolnick, D.: Reevaluating Multimodal Approaches To Deep Species Distribution Models, World Biodiversity Forum 2026, Davos, Switzerland, 14–19 Jun 2026, WBF2026-813, https://doi.org/10.5194/wbf2026-813, 2026.