EGU26-23287, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-23287
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Tuesday, 05 May, 10:45–12:30 (CEST), Display time Tuesday, 05 May, 08:30–12:30 | Hall X5, X5.137
Interpretable pollen classification using empirical feature filtering and random forest models on holographic airflow cytometry data
Andreas Schwendimann (1), Kilian Koch (1), Yanick Zeder (1), Erny Niederberger (1), and Sophie Erb (2)
  • (1) Swisens AG, Meierhofstrasse 5A, CH-6032 Emmen
  • (2) MeteoSwiss, Chemin de l'Aérologie, CH-1530 Payerne

Automatic pollen monitoring has become increasingly important for aerobiology, public health, and climate-related 
studies. Across Europe, manual Hirst-type traps are progressively complemented or fully replaced by automatic 
instruments that acquire particle-resolved measurements and apply machine-learning–based classification instead of 
manual light-microscopic identification. This transition enables real-time pollen information but introduces new 
challenges related to data quality, model interpretability, and computational efficiency. 


SwisensPoleno instruments are airflow cytometers that measure individual airborne particles in-flight. Each particle is 
characterized by an array of sensors, including two orthogonal digital holography images, from which morphological 
features are derived. Previous modelling approaches for pollen classification have largely relied on deep learning 
architectures leveraging the full images. While these methods can achieve high accuracy, they are computationally
expensive to train and evaluate, are prone to overfitting to the particular regions where the training data were
collected, and exhibit a black-box nature that complicates error analysis and systematic performance improvement.
Persistent off-season false positives have thus remained difficult to diagnose and mitigate.


Here, we present a fast-feedback classification pipeline that combines manual prefiltering of datasets, automatic 
filtering of holography-derived features and a random forest classifier (Figure 1). Prior to model training, datasets are 
manually screened and particles are automatically filtered based on deviations from empirically derived feature 
distributions. This effectively cleans the training datasets and removes non-representative or artefactual samples. The 
resulting training-ready datasets are then used to train random forest models, providing both competitive classification 
performance and full interpretability at the feature level. 
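
The automatic filtering step described above can be illustrated with a minimal sketch. The abstract does not state which deviation criterion is used, so percentile bounds estimated from a reference set stand in here as an assumption. The feature names (`area`, `eccentricity`), the function names, and the 1st/99th percentile thresholds are likewise hypothetical and chosen only for illustration.

```python
from statistics import quantiles


def empirical_bounds(values, lower_pct=1, upper_pct=99):
    """Estimate (low, high) percentile bounds from reference feature values."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def filter_particles(particles, reference, features, lower_pct=1, upper_pct=99):
    """Keep particles whose features all lie inside the empirical bounds."""
    bounds = {
        name: empirical_bounds([p[name] for p in reference], lower_pct, upper_pct)
        for name in features
    }
    kept = [
        p for p in particles
        if all(bounds[name][0] <= p[name] <= bounds[name][1] for name in features)
    ]
    return kept, bounds


# Hypothetical reference distribution for one pollen class (synthetic values).
reference = [
    {"area": 100 + i, "eccentricity": 0.10 + 0.002 * i} for i in range(101)
]

# Hypothetical candidate training particles, two of them artefactual.
candidates = [
    {"id": "typical", "area": 150, "eccentricity": 0.20},
    {"id": "oversized", "area": 1000, "eccentricity": 0.20},
    {"id": "elongated", "area": 150, "eccentricity": 0.95},
]

kept, bounds = filter_particles(candidates, reference, ["area", "eccentricity"])
print([p["id"] for p in kept])  # prints ['typical']
```

The retained particles could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`, whose `feature_importances_` attribute gives one possible view of feature-level interpretability.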


This novel approach yields significant performance gains over previous methods and successfully addresses
long-standing off-season false-positive issues (Figure 2). Owing to the lower specificity of random-forest-based
models compared with deep-learning-based models, the classification performance has proven robust
across six locations in Southern Europe over multiple years. The proposed methodology offers a transparent,
computationally

How to cite: Schwendimann, A., Koch, K., Zeder, Y., Niederberger, E., and Erb, S.: Interpretable pollen classification using empirical feature filtering and random forest models on holographic airflow cytometry data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23287, https://doi.org/10.5194/egusphere-egu26-23287, 2026.