- Ankara Yildirim Beyazit University, School of Engineering and Natural Sciences, Civil Engineering, Türkiye (augsenocak@aybu.edu.tr)
Precipitation drives the hydrologic cycle and directly affects sectors from agriculture to electricity generation. However, modeling its statistical distribution is challenging: precipitation data typically consist of frequent dry days with zero values mixed with rare, extreme events, and both ends of this spectrum can cause disasters, such as severe droughts or flash floods. In the Eastern Mediterranean, the challenge is compounded by complex topography and changing climate patterns. Although machine learning (ML) models are widely used for precipitation classification and regression, they often treat large areas as uniform regions. This generalization misses important local features, such as orographic lifting along mountains or rain shadows in interior basins. Furthermore, most operational models focus only on minimizing the error of deterministic point predictions. Like spatial generalization, this discards information: it ignores forecast uncertainty, which is essential for risk-based decision-making.
This study addresses these issues by developing a spatially explicit deep learning framework based on the Probability Integral Transform (PIT). Training models on raw precipitation amounts often leads to underestimated extremes and to trace amounts being assigned to dry days, because machine learning models tend to regress toward the mean or the overrepresented classes. To address this, the target variable (i.e., precipitation from the E-OBS dataset) is transformed into probability space: each 0.1-degree pixel is normalized using its own cumulative distribution function (CDF), calculated from the 1985–2015 climatology. Instead of assuming a fixed baseline period, the Pettitt test is applied to each pixel to detect structural breaks in the historical time series, subject to the condition that at least the last 10 years (2005–2015) are retained for the CDF analysis so that enough data remain. As a result, the reference climatology reflects current hydro-climatic conditions.
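The per-pixel normalization described above can be sketched as follows. This is a minimal illustration rather than the study's implementation, and it rests on several assumptions not stated in the abstract: the Pettitt test is run on annual totals, a break counts as significant at a hypothetical `alpha=0.05`, the "last 10 years" condition is read as "the reference window starts no later than 2005", and the CDF is a plain empirical step function. All function names are hypothetical.

```python
import numpy as np

def pettitt_test(series):
    """Return (t, K, p) for one pixel's series.

    t counts the observations before the most likely break, so series[t:]
    is the post-break part. p uses the standard approximation
    p ~ 2 * exp(-6 K^2 / (n^3 + n^2)).
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    # U_t = sum_{i<=t} sum_{j>t} sign(x_i - x_j). Terms with both indices
    # <= t cancel, so cumulative row sums of the sign matrix give U_t.
    row_sums = np.sign(x[:, None] - x[None, :]).sum(axis=1)
    u = np.cumsum(row_sums)[:-1]            # t = 1 .. n-1
    t = int(np.argmax(np.abs(u))) + 1
    k = float(np.abs(u[t - 1]))
    p = min(1.0, 2.0 * float(np.exp(-6.0 * k**2 / (n**3 + n**2))))
    return t, k, p

def reference_start_year(years, annual_totals, alpha=0.05, latest_start=2005):
    """Choose the first year of the reference climatology for one pixel.

    Assumed rule: with no significant break, keep the full record; otherwise
    start at the first post-break year, but never later than `latest_start`.
    """
    t, _, p = pettitt_test(annual_totals)
    if p >= alpha:
        return int(years[0])
    return int(min(years[t], latest_start))

def pit_transform(values, reference):
    """Map precipitation amounts to [0, 1] with the pixel's empirical CDF."""
    ref = np.sort(np.asarray(reference, dtype=float))
    ranks = np.searchsorted(ref, np.asarray(values, dtype=float), side="right")
    return ranks / ref.size
```

With an empirical CDF of this kind, every dry day maps to the same value (the pixel's dry-day fraction), so the transformed target still carries a point mass at that value, and the downstream model has to account for it.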
The deep learning model takes downscaled Global Forecast System (GFS) forecasts with a 24-hour horizon as input. To capture the vertical structure of the atmosphere, the inputs include wind components (u, v), geopotential height, and specific humidity at the 500, 700, and 850 hPa pressure levels. This multi-level approach allows the model to learn the interactions between large-scale circulation, mid-tropospheric moisture transport, and low-level topographic effects, offering a physical advantage over surface-only models. The study covers the period from 2015 to 2025, divided into training (2015–2020), hyperparameter tuning and validation (2020–2022), and testing (2022–2025) sets.
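Four variables at three pressure levels give twelve predictor fields per forecast. The sketch below shows one way to stack them into a channel-first array for a gridded model. The short variable names, the dictionary layout, and the channel ordering are illustrative assumptions and are not taken from the study.

```python
import numpy as np

# Hypothetical short names for the four predictors named in the text.
VARIABLES = ("u", "v", "gh", "q")    # wind components, geopotential height, specific humidity
LEVELS_HPA = (500, 700, 850)

def stack_predictors(fields):
    """Stack per-level fields into one (channels, lat, lon) array.

    `fields` maps (variable, level_hPa) to a 2-D array on a common grid.
    Channels are ordered variable-major: u500, u700, u850, v500, ...
    """
    channels = [fields[(var, lev)] for var in VARIABLES for lev in LEVELS_HPA]
    return np.stack(channels, axis=0)
```

A fixed, explicit channel order matters in practice, because the trained weights are tied to it and a silent reordering at inference time would not raise an error.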
Finally, the deep learning model is extended with conformal prediction to close the aforementioned gap between accurate point predictions and quantified forecast uncertainty. Unlike traditional approaches that assume a specific error distribution (e.g., Gaussian), conformal prediction yields distribution-free prediction intervals with a coverage guarantee. The resulting bounds are adaptive: the interval widens during unstable weather patterns and narrows during stable atmospheric conditions. Consequently, the proposed approach delivers not just a forecast but also a reliable measure of its uncertainty across the diverse climates and topography of the Eastern Mediterranean.
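The abstract does not state which conformal variant is used. The title's "Conformal Quantile-MOS" suggests a quantile-based one, so the sketch below assumes split conformalized quantile regression: a held-out calibration set provides one additive margin that widens (or shrinks) the model's lower and upper quantile forecasts. Function names, the `alpha` default, and the clipping to [0, 1] for PIT-space targets are assumptions made for illustration.

```python
import numpy as np

def cqr_margin(cal_lo, cal_hi, cal_y, alpha=0.1):
    """Calibrate the additive margin from held-out quantile forecasts.

    cal_lo / cal_hi are the model's lower and upper quantile predictions on
    the calibration set, and cal_y the matching observations.
    """
    cal_lo, cal_hi, cal_y = (np.asarray(a, dtype=float) for a in (cal_lo, cal_hi, cal_y))
    # Positive score: the observation fell outside the predicted band.
    scores = np.maximum(cal_lo - cal_y, cal_y - cal_hi)
    n = scores.size
    # Finite-sample rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return float("inf")     # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])

def conformal_interval(pred_lo, pred_hi, margin):
    """Apply the margin to new forecasts and clip to the PIT range [0, 1]."""
    lo = np.clip(np.asarray(pred_lo, dtype=float) - margin, 0.0, 1.0)
    hi = np.clip(np.asarray(pred_hi, dtype=float) + margin, 0.0, 1.0)
    return lo, hi
```

The interval width still varies from case to case because it inherits the spread between the model's own quantile forecasts, while the margin is a single calibrated constant. The coverage guarantee is marginal and relies on the calibration and test samples being exchangeable, which is worth checking for time-ordered weather data.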
How to cite: Senocak, A. U. G.: Probabilistic Precipitation Forecasting over the Eastern Mediterranean via PIT-Normalized Conformal Quantile-MOS, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1043, https://doi.org/10.5194/egusphere-egu26-1043, 2026.