EGU26-6702, updated on 13 Mar 2026
https://doi.org/10.5194/egusphere-egu26-6702
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 04 May, 15:15–15:25 (CEST) | Room -2.92
PARIS: Pruning Algorithm via the Representer theorem for Imbalanced Scenarios
Enrico Camporeale (1,2)
  • (1) Queen Mary University of London, London, UK (enrico.camporeale@qmul.ac.uk)
  • (2) University of Colorado, SWx-TREC, Boulder, United States of America

Accurate prediction of rare but high-impact events is a recurring challenge in planetary science and heliophysics, where strongly imbalanced data distributions are common (e.g. extreme space-weather conditions). Standard empirical risk minimization tends to bias machine-learning models toward frequently observed regimes, often leading to poor performance on scientifically and operationally critical tail events. Existing mitigation strategies, such as loss re-weighting or synthetic over-sampling, have shown mixed and problem-dependent success.

We present PARIS (Pruning Algorithm via the Representer theorem for Imbalanced Scenarios), a data-centric framework that addresses imbalance by optimizing the training dataset itself rather than modifying the loss function or model architecture. PARIS exploits the representer theorem for neural networks to compute a closed-form representer deletion residual, which quantifies the change in validation loss induced by removing an individual training sample, without retraining the model. Combined with an efficient Cholesky rank-one downdating scheme, this residual enables fast, iterative pruning of uninformative or performance-degrading samples.
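
As a rough illustration of the linear algebra involved (not the author's implementation), the sketch below uses ridge regression on fixed features as a stand-in model. Removing training sample i subtracts x_i x_i^T from the regularized Gram matrix, so its Cholesky factor can be downdated in O(d^2) operations instead of being refactorized. All names here (cholesky_downdate, solve_from_factor, the synthetic data and the value of lam) are hypothetical. The change in validation loss is obtained by re-solving with the downdated factor, which is an assumption made for simplicity and not the closed-form representer deletion residual described in the preprint.

```python
import numpy as np


def cholesky_downdate(L, x):
    """Return lower-triangular L' with L' L'^T = L L^T - x x^T (rank-one downdate)."""
    L = L.copy()
    x = np.array(x, dtype=float)  # work on a private copy
    n = x.shape[0]
    for k in range(n):
        r2 = L[k, k] ** 2 - x[k] ** 2
        if r2 <= 0.0:
            raise np.linalg.LinAlgError("downdated matrix is not positive definite")
        r = np.sqrt(r2)
        c = r / L[k, k]
        s = x[k] / L[k, k]
        L[k, k] = r
        # Update the rest of column k, then carry the remainder of x forward.
        L[k + 1:, k] = (L[k + 1:, k] - s * x[k + 1:]) / c
        x[k + 1:] = c * x[k + 1:] - s * L[k + 1:, k]
    return L


def solve_from_factor(L, b):
    """Solve (L L^T) w = b given the lower Cholesky factor L."""
    return np.linalg.solve(L.T, np.linalg.solve(L, b))


# Synthetic regression data (purely illustrative).
rng = np.random.default_rng(0)
n_train, n_val, d, lam = 200, 50, 5, 1.0
w_true = rng.normal(size=d)
X = rng.normal(size=(n_train, d))
y = X @ w_true + 0.1 * rng.normal(size=n_train)
X_val = rng.normal(size=(n_val, d))
y_val = X_val @ w_true + 0.1 * rng.normal(size=n_val)


def val_mse(w):
    return float(np.mean((X_val @ w - y_val) ** 2))


# Fit once on the full training set.
A = X.T @ X + lam * np.eye(d)
b = X.T @ y
L = np.linalg.cholesky(A)
w_full = solve_from_factor(L, b)

# Effect of deleting sample i, using the downdated factor.
i = 0
L_minus = cholesky_downdate(L, X[i])
w_minus = solve_from_factor(L_minus, b - y[i] * X[i])
delta_val_loss = val_mse(w_minus) - val_mse(w_full)

# Reference: refit from scratch without sample i.
mask = np.arange(n_train) != i
w_retrain = np.linalg.solve(
    X[mask].T @ X[mask] + lam * np.eye(d), X[mask].T @ y[mask]
)
print(f"change in validation MSE after deleting sample {i}: {delta_val_loss:+.3e}")
print("matches refit from scratch:", np.allclose(w_minus, w_retrain))
```

In a pruning loop, a negative delta_val_loss would flag a sample whose removal improves validation performance; the downdated factor can then be reused as the starting point for the next deletion.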

We demonstrate PARIS on a real-world space-weather regression problem (Dst prediction), where it reduces the training set by up to 75% while preserving or improving overall RMSE and outperforming loss re-weighting, synthetic over-sampling, and boosting baselines. These results highlight representer-guided dataset pruning as a computationally efficient, interpretable, and physically relevant approach for improving rare-event regression in heliophysics and related planetary science applications.

Preprint: https://www.arxiv.org/abs/2512.06950

How to cite: Camporeale, E.: PARIS: Pruning Algorithm via the Representer theorem for Imbalanced Scenarios, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6702, https://doi.org/10.5194/egusphere-egu26-6702, 2026.