- Universidad de Zaragoza, IUMA, Statistical Methods, Spain (e.barrio@unizar.es)
Signs of global warming are evident in extreme daily maximum temperature (Tx) events, especially those that break historical records. In the Iberian Peninsula, [Castillo-Mateo, 2023] demonstrated that the frequency of such records shows a trend exceeding what is expected under stationary conditions and that this trend varies spatially. A novel approach is introduced here to analyse and interpret the spatial variability and spatio-temporal patterns of this phenomenon.
Daily Tx data spanning 1960–2023 from 36 Spanish stations were obtained from the European Climate Assessment & Dataset. Geopotential variables at 12 p.m. for the 300, 500, and 700 hPa pressure levels, on a 1° × 1° grid covering [45°N, 10°W, 35°N, 5°E], were sourced from the ERA5 reanalysis as the predictor database. The analysis focused on summer (JJA) days.
An algorithm was used to derive an optimal logistic regression model for each station, along with several global models. The target variable was defined as a binary indicator of daily threshold exceedance for Tx. For each station s, the threshold was set to the 95th percentile of daily maximum temperatures on summer days (June, July, and August) during the reference period 1981–2010. Mathematically, the threshold for station s is $u_s = Q_{0.95}\left(\{Tx_{s,t,l} : t \in [1981, 2010],\ l \in [1, 92]\}\right)$, where $Tx_{s,t,l}$ denotes the maximum temperature at station s in year t on day l, with l indexing the 92 summer days. The binary indicator is defined as $I_{s,t,l} = 1$ if $Tx_{s,t,l} > u_s$, and $I_{s,t,l} = 0$ otherwise.
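The threshold and indicator definitions above can be illustrated with a minimal sketch for a single station. The function names, the dictionary layout keyed by (year, day), and the choice of interpolation rule for the percentile are assumptions made for illustration; the abstract does not state how the quantile was estimated.

```python
from statistics import quantiles


def summer_threshold(tx, ref_years=(1981, 2010), q=95):
    """q-th percentile of summer Tx over the reference period (u_s in the text).

    `tx` maps (year, day) -> daily maximum temperature for one station and
    is assumed to contain summer days only. The data layout is hypothetical.
    """
    first, last = ref_years
    ref = [v for (year, _day), v in tx.items() if first <= year <= last]
    # quantiles(..., n=100) returns the 99 percentile cut points, so index
    # q - 1 holds the q-th percentile. The "inclusive" interpolation rule is
    # an assumption, not something the abstract specifies.
    return quantiles(ref, n=100, method="inclusive")[q - 1]


def exceedance_indicator(tx, threshold):
    """Binary indicator I = 1 where Tx is strictly above the threshold."""
    return {key: int(value > threshold) for key, value in tx.items()}
```

Because the threshold is estimated from 1981–2010 only, the indicator can be evaluated on any day of the 1960–2023 record, including years outside the reference period.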
The series of geopotential covariates at the four farthest corner grid points, as well as at the grid point closest to each station, were used as predictors. These variables were further expanded by including a lag and their second-order polynomial terms. The algorithm involved three steps: (1) stepwise regression was applied at each station to identify the optimal predictors; (2) the most significant and most frequently selected predictors from these station models were used to construct a global model; (3) three interaction models were developed by introducing interactions between the selected predictors and geodesic, climatic, and spatial factors, followed by stepwise regression. Data from the first 51 years were used for training, while the last 13 years were reserved for testing. Because exceedances are rare by construction (about 5% of summer days in the reference period), model performance was measured with the area under the ROC curve (AUC), which is less sensitive to class imbalance than accuracy.
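Some of the building blocks described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the polynomial expansion uses squares only (no cross terms), the lag is applied without regard to year boundaries, the `min_share` cutoff is invented, and the predictor tally covers selection frequency but not significance. Model fitting and stepwise selection are omitted.

```python
from collections import Counter


def expand_predictors(series, lag=1):
    """Rows [x_t, x_(t-lag), x_t^2, x_(t-lag)^2] for every t >= lag."""
    rows = []
    for t in range(lag, len(series)):
        x, x_lag = series[t], series[t - lag]
        rows.append([x, x_lag, x * x, x_lag * x_lag])
    return rows


def frequently_selected(selected_by_station, min_share=0.5):
    """Predictors chosen by at least `min_share` of the station models."""
    counts = Counter(
        name for names in selected_by_station.values() for name in set(names)
    )
    n_stations = len(selected_by_station)
    return sorted(n for n, c in counts.items() if c / n_stations >= min_share)


def chronological_split(years, n_train_years=51):
    """Row indices for the first `n_train_years` distinct years and the rest."""
    train_years = set(sorted(set(years))[:n_train_years])
    train = [i for i, y in enumerate(years) if y in train_years]
    test = [i for i, y in enumerate(years) if y not in train_years]
    return train, test


def auc(labels, scores):
    """Rank-based AUC: probability that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Splitting by year rather than at random keeps the test period strictly after the training period, which matches the 51-year and 13-year partition of the 1960–2023 record.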
The simplest global model performed well overall, with an AUC of 0.88 and k = 15 parameters, though it scored lower at stations located near the coast. The three interaction models notably improved AUC values at coastal stations. The model with climatic interactions achieved an AUC of 0.89 with k = 34 parameters and was selected as the top performer.
In conclusion, we analysed extreme maximum temperature events in the Iberian Peninsula using station-specific and global models with geopotential predictors. Interaction models improved performance, particularly for coastal stations, with the climatic interaction model achieving the best balance of accuracy and simplicity.
How to cite: Barrio, E., Gracia-Tabuenca, Z., Asín, J., Abaurrea, J., Castillo, J., and Cebrián, A.: Modelling a Binary Threshold Indicator for Maximum Temperatures with a Selection Algorithm for Spatio-Temporal Models, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20561, https://doi.org/10.5194/egusphere-egu25-20561, 2025.