- 1 Royal Meteorological Institute of Belgium, Brussels, Belgium
- 2 Ghent University, Ghent, Belgium
- 3 University of Antwerp, Antwerp, Belgium
This study proposes methods for predicting wind power from Numerical Weather Prediction (NWP) model output and evaluates predictions of wind power ramping events in the Belgian Offshore Zone. We verify three forecasting systems: ALARO-4km, the operational deterministic model of the Royal Meteorological Institute of Belgium; an enhanced version of ALARO-4km that incorporates a Wind Farm Parameterization (WFP); and the ECMWF ensemble prediction system. To convert meteorological variables into power forecasts, we implement both physical power curves and machine learning methods, including XGBoost and Transformer models. When training the machine learning models, we over-sample rare but high-impact events, such as turbine cut-outs at high wind speeds, so that the models can learn these critical extreme states. Initial validation with traditional metrics suggests that the Transformer model achieves the lowest Mean Absolute Error (MAE) for deterministic forecasts and the lowest Continuous Ranked Probability Score (CRPS) for probabilistic forecasts. We argue, however, that these aggregate scores may mask deficiencies in capturing rapid power fluctuations, which is vital for stable grid operations.
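The physical power-curve conversion mentioned above can be illustrated with a minimal sketch. The abstract does not give the curve used in the study, so the cut-in, rated, and cut-out speeds, the cubic ramp between cut-in and rated speed, and the function name `power_curve` are all illustrative assumptions.

```python
import numpy as np


def power_curve(wind_speed, cut_in=3.0, rated=12.0, cut_out=25.0, rated_power=1.0):
    """Idealised turbine power curve (hypothetical parameters, not the study's).

    Returns power in the same units as `rated_power` (1.0 = normalised capacity).
    """
    v = np.asarray(wind_speed, dtype=float)

    # Assumed cubic increase between cut-in and rated wind speed.
    frac = (v**3 - cut_in**3) / (rated**3 - cut_in**3)
    power = np.where((v >= cut_in) & (v < rated), frac * rated_power, 0.0)

    # Constant rated power up to the cut-out speed.
    power = np.where((v >= rated) & (v < cut_out), rated_power, power)

    # At or above cut-out the turbine shuts down, so power stays at zero.
    return power


if __name__ == "__main__":
    speeds = [0.0, 7.5, 12.0, 24.9, 25.0]
    print(power_curve(speeds))
```

The abrupt drop from rated power to zero at the cut-out speed is the kind of rare, high-impact state that the over-sampling described above is meant to expose to the machine learning models.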
Since ramping events pose challenges to power system operations, we further verify the capability of these models to predict significant ramps. We highlight the limitations of standard metrics such as MAE and CRPS: because they average timing and magnitude errors, they tend to reward "over-smoothing", even though such smoothing renders a forecast ineffective for detecting ramps. To overcome this, we propose a verification framework that introduces an error buffer for deterministic contingency analysis (hits, misses, and false alarms) and adapts this buffer concept to probabilistic verification with the Brier Score. We apply this framework to our power model outputs and evaluate the models' operationally useful skill. In deterministic forecasting, the XGBoost model achieves higher scores than the other models for most ramping events, whereas in the ensemble-based probabilistic predictions the power curve approach proves more effective at capturing large-scale ramps. Our results demonstrate that the Transformer’s low CRPS is largely a result of its smoothed output, which is unfavourable for predicting actual ramping events. These findings call for caution when identifying "optimal" models for operational use, since lower average error scores do not by themselves guarantee reliability in managing critical power ramps. Our proposed verification framework provides an intuitive way to understand and compare the predictive skill of different models specifically for ramping events.
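The buffered contingency analysis described above can be sketched as follows. The abstract does not define the ramp criterion or the buffer, so this sketch assumes a ramp is an absolute power change of at least `threshold` over `window` time steps, and interprets the buffer as a timing tolerance of `buffer` steps. The function names are hypothetical, and only the deterministic part (hits, misses, false alarms) is shown.

```python
import numpy as np


def ramp_flags(power, window, threshold):
    """Flag time steps where |P(t + window) - P(t)| >= threshold (window >= 1)."""
    p = np.asarray(power, dtype=float)
    delta = p[window:] - p[:-window]
    return np.abs(delta) >= threshold


def buffered_contingency(obs_flags, fc_flags, buffer):
    """Count hits, misses and false alarms with a timing tolerance.

    An observed ramp is a hit if any forecast ramp lies within `buffer`
    steps of it; a forecast ramp is a false alarm if no observed ramp
    lies within `buffer` steps of it. This matching rule is an assumption.
    """
    obs_idx = np.flatnonzero(obs_flags)
    fc_idx = np.flatnonzero(fc_flags)

    hits = sum(1 for t in obs_idx if np.any(np.abs(fc_idx - t) <= buffer))
    misses = len(obs_idx) - hits
    false_alarms = sum(
        1 for t in fc_idx if not np.any(np.abs(obs_idx - t) <= buffer)
    )
    return {"hits": hits, "misses": misses, "false_alarms": false_alarms}


if __name__ == "__main__":
    # Synthetic normalised power series, invented purely for illustration.
    observed = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8, 0.8]
    late = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]  # same ramp, one step late
    smooth = np.linspace(0.2, 0.8, 8)                # gradual, smoothed rise

    obs = ramp_flags(observed, window=1, threshold=0.5)
    for name, series in [("late", late), ("smooth", smooth)]:
        fc = ramp_flags(series, window=1, threshold=0.5)
        for b in (0, 1):
            print(name, "buffer =", b, buffered_contingency(obs, fc, b))
```

With a buffer of zero, the forecast that is one step late is counted as both a miss and a false alarm; with a buffer of one step it becomes a hit. The smoothed forecast never produces a ramp flag, so it misses the event regardless of the buffer.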
How to cite: Meng, R., Smet, G., Van den Bergh, J., Tabari, H., Van den Bleeken, D., and Termonia, P.: Evaluating the Operational Skill of Deterministic and Probabilistic Wind Power Ramping Event Predictions for the Belgian Offshore Zone, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12727, https://doi.org/10.5194/egusphere-egu26-12727, 2026.