EGU26-4874, updated on 13 Mar 2026
https://doi.org/10.5194/egusphere-egu26-4874
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Wednesday, 06 May, 17:10–17:20 (CEST)
 
Room 0.31/32
Improved return level estimates of cyclone-induced extreme waves by combining extreme value distribution and probabilistic machine learning predictions
Jeremy Rohmer, Andrea G. Filippini, and Rodrigo Pedreros
  • BRGM, Risks, Orléans, France (j.rohmer@brgm.fr)

Extreme value theory provides effective methods for estimating return levels (RLs) of extreme events, typically for return periods exceeding 100 years. However, metocean engineers often lack enough representative observations to fit extreme value distributions (EVDs) properly, because the available observations are few or of poor quality [1]. One option for overcoming this problem is to augment the set of observations with complementary sources of information. In this work, we fit EVDs to both observations and predictions from machine learning (ML) models using the approach developed by [2]. The predictions of ML models are themselves uncertain, however, because they are learned from a limited number of training samples. We therefore propose to account explicitly for this prediction error when inferring the EVD parameters, within an approximate Bayesian computation (ABC) scheme combined with the Wasserstein distance [3].
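
To illustrate the inference step, the following minimal sketch shows a plain ABC rejection sampler that uses the one-dimensional Wasserstein distance to compare observed and simulated samples. It is not the scheme used in this work: the generalized Pareto model, the uniform priors, the synthetic "observations", and the names `abc_rejection`, `n_draws`, and `n_keep` are all assumptions made for illustration, and the treatment of the ML prediction error is deliberately left out.

```python
import numpy as np
from scipy.stats import genpareto, wasserstein_distance

rng = np.random.default_rng(0)

# Synthetic "observed" threshold exceedances (hypothetical values,
# not data from the abstract).
true_shape, true_scale = -0.1, 1.5
observed = genpareto.rvs(true_shape, scale=true_scale, size=200, random_state=rng)


def abc_rejection(observed, n_draws=2000, n_keep=100, rng=rng):
    """Keep the n_keep prior draws whose simulated samples are closest
    to the observations in 1-D Wasserstein distance."""
    # Uniform priors on the GPD shape and scale (an assumption).
    shapes = rng.uniform(-0.5, 0.5, n_draws)
    scales = rng.uniform(0.5, 3.0, n_draws)
    distances = np.empty(n_draws)
    for i in range(n_draws):
        simulated = genpareto.rvs(
            shapes[i], scale=scales[i], size=observed.size, random_state=rng
        )
        distances[i] = wasserstein_distance(observed, simulated)
    keep = np.argsort(distances)[:n_keep]  # indices sorted by increasing distance
    return shapes[keep], scales[keep], distances[keep]


post_shape, post_scale, post_dist = abc_rejection(observed)
print("approximate posterior mean of the scale:", post_scale.mean())
print("approximate posterior mean of the shape:", post_shape.mean())
```

The accepted draws form an approximate posterior sample of the EVD parameters. Keeping a fixed number of closest draws, rather than fixing a distance tolerance in advance, is one common and simple design choice.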

The added value of this ML approach, which takes prediction uncertainty into account, is demonstrated for cyclone-induced waves in Guadeloupe (Lesser Antilles) using a large database of extreme waves (representative of 1,000 years of cyclonic activity) that were computed numerically within the Carib-Coast program [4]. A random forest (RF) regression model is trained to link cyclone characteristics (radius, atmospheric pressure, distance to the eye of the hurricane) to significant wave height, and the quantile variant of the RF model is then used to model the prediction error within the ABC scheme. Comparison with the 100-year and 500-year RL reference solutions (calculated using the complete database) shows that the ML-based approach yields low bias and high reliability of the RL estimates, as well as a gain in computational efficiency. This holds even when the sample size is reduced by a factor of up to 10 and even when the RF predictive performance is only moderate, with a cross-validation coefficient of determination of 70–75%. The benefit of integrating the ML prediction error is shown in different contexts, both along the coasts of Guadeloupe and in deep-ocean environments.
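
As a rough illustration of how a quantile variant of a random forest can describe prediction error, the sketch below derives conditional quantiles from a fitted scikit-learn `RandomForestRegressor` by weighting the training responses according to shared leaves. Everything here is an assumption made for illustration: the synthetic predictors and their ranges, the toy wave-height formula, the helper `forest_quantiles`, and the leaf-weighting shortcut itself, which may differ from the quantile RF implementation actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic cyclone descriptors (hypothetical ranges, not the Carib-Coast data).
n = 600
radius_km = rng.uniform(20, 80, n)
pressure_hpa = rng.uniform(900, 1000, n)
distance_km = rng.uniform(0, 300, n)
X = np.column_stack([radius_km, pressure_hpa, distance_km])

# Toy significant wave height (m) with multiplicative noise; purely illustrative.
signal = 0.08 * (1010 - pressure_hpa) * np.exp(-distance_km / 150) * np.sqrt(radius_km / 50)
hs_m = signal * np.exp(rng.normal(0.0, 0.2, n))

X_train, y_train = X[:500], hs_m[:500]
X_test, y_test = X[500:], hs_m[500:]

rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=0)
rf.fit(X_train, y_train)


def forest_quantiles(rf, X_train, y_train, X_new, probs):
    """Conditional quantiles from leaf co-membership weights."""
    train_leaves = rf.apply(X_train)  # shape (n_train, n_trees)
    new_leaves = rf.apply(X_new)      # shape (n_new, n_trees)
    order = np.argsort(y_train)
    y_sorted = y_train[order]
    out = np.empty((X_new.shape[0], len(probs)))
    for j in range(X_new.shape[0]):
        same_leaf = train_leaves == new_leaves[j]
        # Per tree, spread unit weight over training points sharing the leaf,
        # then average over trees so the weights sum to one.
        weights = (same_leaf / same_leaf.sum(axis=0)).mean(axis=1)
        cdf = np.cumsum(weights[order])
        idx = np.minimum(np.searchsorted(cdf, probs), y_sorted.size - 1)
        out[j] = y_sorted[idx]
    return out


q = forest_quantiles(rf, X_train, y_train, X_test, np.array([0.05, 0.5, 0.95]))
coverage = np.mean((y_test >= q[:, 0]) & (y_test <= q[:, 2]))
print("empirical coverage of the 5-95% interval:", coverage)
```

The spread between the lower and upper quantiles gives a per-prediction error band, which is the kind of information that could then be propagated into the EVD inference.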

[1] Jonathan et al. (2021). Ocean Engineering, doi:10.1016/j.oceaneng.2020.107725

[2] Rohmer et al. (2023). Ocean Modelling, doi:10.1016/j.ocemod.2023.102275

[3] Bernton et al. (2019). Journal of the Royal Statistical Society Series B, doi:10.1111/rssb.12312

[4] Interreg Carib-Coast program, https://www.carib-coast.com/en/

How to cite: Rohmer, J., Filippini, A. G., and Pedreros, R.: Improved return level estimates of cyclone-induced extreme waves by combining extreme value distribution and probabilistic machine learning predictions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4874, https://doi.org/10.5194/egusphere-egu26-4874, 2026.