Advances in Earthquake Forecasting and Model Testing

Vetted probabilistic earthquake forecasts can contribute to more earthquake-resilient societies. Forecasts underpin seismic hazard assessments and thus determine building and life safety. They also provide scientifically sound information about the time-dependence of earthquake potential before, during and after earthquake sequences. To ensure forecasts are trustworthy and to assess the scientific hypotheses underlying the forecasts, models should be tested both retrospectively and prospectively (i.e., against yet-to-be-collected data). For this purpose, the Collaboratory for the Study of Earthquake Predictability (CSEP) provides tools and methods for testing the consistency and precision of earthquake forecasts. This session welcomes contributions that showcase advances in the science of earthquake forecasting and model testing. These can include: new approaches for identifying precursory activity (e.g. b-value variations, aseismic slip transients); forecasts based on empirical machine-learning or physical stress-transfer algorithms; applications of models to earthquake sequences around the globe; advances in model evaluation techniques; or contributions to software tools for model developers. Presentations may also highlight progress of community efforts, such as the EU H2020 project RISE (Real-time earthquake rIsk reduction for a reSilient Europe, and other initiatives.

Co-organized by ESSI1/NH4
Convener: Maximilian Werner | Co-conveners: Warner Marzocchi, Danijel Schorlemmer
vPICO presentations
| Thu, 29 Apr, 13:30–14:15 (CEST)

vPICO presentations: Thu, 29 Apr

Chairpersons: Maximilian Werner, Warner Marzocchi, Danijel Schorlemmer
Shubham Sharma, Sebastian Hainzl, Gert Zöller, and Matthias Holschneider

The Coulomb failure stress (CFS) criterion is the most commonly used method for predicting spatial distributions of aftershocks following large earthquakes. However, large uncertainties are always associated with the calculation of Coulomb stress change. The uncertainties mainly arise due to nonunique slip inversions and unknown receiver faults; especially for the latter, results are highly dependent on the choice of the assumed receiver mechanism. Based on binary tests (aftershocks yes/no), recent studies suggest that alternative stress quantities, a distance‐slip probabilistic model as well as deep neural network (DNN) approaches, all are superior to CFS with predefined receiver mechanism. To challenge this conclusion, which might have large implications, we use 289 slip inversions from SRCMOD database to calculate more realistic CFS values for a layered half‐space and variable receiver mechanisms. We also analyze the effect of the magnitude cutoff, grid size variation, and aftershock duration to verify the use of receiver operating characteristic (ROC) analysis for the ranking of stress metrics. The observations suggest that introducing a layered half‐space does not improve the stress maps and ROC curves. However, results significantly improve for larger aftershocks and shorter time periods but without changing the ranking. We also go beyond binary testing and apply alternative statistics to test the ability to estimate aftershock numbers, which confirm that simple stress metrics perform better than the classic Coulomb failure stress calculations and are also better than the distance‐slip probabilistic model.

How to cite: Sharma, S., Hainzl, S., Zöller, G., and Holschneider, M.: Is Coulomb Stress the Best Choice for Aftershock Forecasting?, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-5172,, 2021.

Jose A. Bayona, William Savran, Maximilian Werner, and David A. Rhoades

Developing testable seismicity models is essential for robust seismic hazard assessments and to quantify the predictive skills of posited hypotheses about seismogenesis. On this premise, the Regional Earthquake Likelihood Models (RELM) group designed a joint forecasting experiment, with associated models, data and tests to evaluate earthquake predictability in California over a five-year period. Participating RELM forecast models were based on a range of geophysical datasets, including earthquake catalogs, interseismic strain rates, and geologic fault slip rates. After five years of prospective evaluation, the RELM experiment found that the smoothed seismicity (HKJ) model by Helmstetter et al. (2007) was the most informative. The diversity of competing forecast hypotheses in RELM was suitable for combining multiple models that could provide more informative earthquake forecasts than HKJ. Thus, Rhoades et al. (2014) created multiplicative hybrid models that involve the HKJ model as a baseline and one or more conjugate models. Particularly, the authors fitted two parameters for each conjugate model and an overall normalizing constant to optimize each hybrid model. Then, information gain scores per earthquake were computed using a corrected Akaike Information Criterion that penalized for the number of fitted parameters. According to retrospective analyses, some hybrid models showed significant information gains over the HKJ forecast, despite the penalty. Here, we assess in a prospective setting the predictive skills of 16 hybrids and 6 original RELM forecasts, using a suite of tests of the Collaboratory for the Study of Earthquake Predicitability (CSEP). The evaluation dataset contains 40 M≥4.95 events recorded within the California CSEP-testing region from 1 January 2011 to 31 December 2020, including the 2016 Mw 5.6, 5.6, and 5.5 Hawthorne earthquake swarm, and the Mw 6.4 foreshock and Mw 7.1 mainshock from the 2019 Ridgecrest sequence. We evaluate the consistency between the observed and the expected number, spatial, likelihood and magnitude distributions of earthquakes, and compare the performance of each forecast to that of HKJ. Our prospective test results show that none of the hybrid models are significantly more informative than the HKJ baseline forecast. These results are mainly due to the occurrence of the 2016 Hawthorne earthquake cluster, and four events from the 2019 Ridgecrest sequence in two forecast bins. These clusters of seismicity are exceptionally unlikely in all models, and insufficiently captured by the Poisson distribution that the likelihood functions of tests assume. Therefore, we are currently examining alternative likelihood functions that reduce the sensitivity of the evaluations to clustering, and that could be used to better understand whether the discrepancies between prospective and retrospective test results for multiplicative hybrid forecasts are due to limitations of the tests or the methods used to create the hybrid models. 

How to cite: Bayona, J. A., Savran, W., Werner, M., and Rhoades, D. A.: Prospective Evaluation of Multiplicative Hybrid Earthquake Forecast Models for California, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-8775,, 2021.

Karina Loviknes, Danijel Schorlemmer, Fabrice Cotton, and Sreeram Reddy Kotha

Non-linear site effects are mainly expected for strong ground motions and sites with soft soils and more recent ground-motion models (GMM) have started to include such effects. Observations in this range are, however, sparse, and most non-linear site amplification models are therefore partly or fully based on numerical simulations. We develop a framework for testing of non-linear site amplification models using data from the comprehensive Kiban-Kyoshin network in Japan. The test is reproducible, following the vision of the Collaboratory for the Study of Earthquake Predictability (CSEP), and takes advantage of new large datasets to evaluate whether or not non-linear site effects predicted by site-amplification models are supported by empirical data. The site amplification models are tested using residuals between the observations and predictions from a GMM based only on magnitude and distance. When the GMM is derived without any site term, the site-specific variability extracted from the residuals is expected to capture the site response of a site. The non-linear site amplification models are tested against a linear amplification model on individual well-recording stations. Finally, the result is compared to building codes where non-linearity is included. The test shows that for most of the sites selected as having sufficient records, the non-linear site-amplification models do not score better than the linear amplification model. This suggests that including non-linear site amplification in GMMs and building codes may not yet be justified, at least not in the range of ground motions considered in the test (peak ground acceleration < 0.2 g).

How to cite: Loviknes, K., Schorlemmer, D., Cotton, F., and Reddy Kotha, S.: Testing non-linear amplification factors used in ground motion models, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4829,, 2021.

Ester Manganiello, Marcus Herrmann, and Warner Marzocchi

The ability to forecast large earthquakes on short time scales is strongly limited by our understanding of the earthquake nucleation process. Foreshocks represent promising seismic signals that may improve earthquake forecasting as they precede many large earthquakes. However, foreshocks can currently only be identified as such after a large earthquake occurred. This inability is because it remains unclear whether foreshocks represent a different physical process than general seismicity (i.e., mainshocks and aftershocks). Several studies compared foreshock occurrence in real and synthetic catalogs, as simulated with a well-established earthquake triggering/forecasting model called Epidemic-Type Aftershock Sequence (ETAS) that does not discriminate between foreshocks, mainshocks, and aftershocks. Some of these studies show that the spatial distribution of foreshocks encodes information about the subsequent mainshock magnitude and that foreshock activity is significantly higher than predicted by the ETAS model. These findings attribute a unique underlying physical process to foreshocks, making them potentially useful for forecasting large earthquakes. We reinvestigate these scientific questions using high-quality earthquake catalogs and study carefully the influence of subjective parameter choices and catalog artifacts on the results. For instance, we use data from different regions, account for the short-term catalog incompleteness and its spatial variability, and explore different criteria for sequence selection and foreshock definition.

How to cite: Manganiello, E., Herrmann, M., and Marzocchi, W.: The prognostic value of foreshocks - a statistical reevaluation, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-8691,, 2021.

Francesco Serafini, Mark Naylor, Finn Lindgren, and Maximilian Werner

Recent years have seen a growth in the diversity of probabilistic earthquake forecasts as well as the advent of them being applied operationally. The growth of their use demands a deeper look at our ability to rank their performance within a transparent and unified framework. Programs such as the Collaboratory Study for Earthquake Predictability (CSEP)  have been at the forefront of this effort. Scores are quantitative measures of how well a dataset can be explained by a candidate forecast and allow forecasts to be ranked. A positively oriented score is said to be proper when, on average, the highest score is achieved by the closest model to the data generating one. Different meanings of closest lead to different proper scoring rules. Here, we prove that the Parimutuel Gambling score, used to evaluate the results of the 2009 Italy CSEP experiment, is generally not proper, and even for the special case where it is proper, it can still be used improperly. We show in detail the possible consequences of using this score for forecast evaluation. Moreover, we show that other well-established scores can be applied to existing studies to calculate new rankings with no requirement for extra information. We extend the analysis to show how much data are required, in principle, to distinguish candidate forecasts and therefore how likely it is to express a preference towards a forecast. This introduces the possibility of survey design with regard to the duration and spatial discretisation of earthquake forecasts. Our findings may contribute to more rigorous statements about the ability to distinguish between the predictive skills of candidate forecasts in addition to simple rankings.

How to cite: Serafini, F., Naylor, M., Lindgren, F., and Werner, M.: Ranking earthquake forecasts: On the use of proper scoring rules to discriminate forecasts, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7418,, 2021.

Kaoru Sawazaki

Waveforms from many aftershocks occurring immediately after a large earthquake tend to overlap in a seismogram, which makes it difficult to pick their P- and S-wave phases. Accordingly, to determine hypocenter and magnitude of the aftershocks becomes difficult and thereby causes deterioration of earthquake catalog. Using such deteriorated catalog may cause misevaluation of ongoing aftershock activity. Since aftershock activity is usually most intense in the early period after a large earthquake, requirement of early aftershock forecast and deterioration of the aftershock catalog are impatient.

Several methods for aftershock forecast, using deteriorated automatic earthquake catalog (Omi et al., 2016, 2019) or continuous seismic envelopes (Lippiello et al., 2016), have been proposed to overcome such a situation. In this study, I propose another method that evaluates excess probability of maximum amplitude (EPMA) due to aftershocks using a continuous seismogram. The proposed method is based on the extreme value statistics, which provides probability distribution of maximum amplitudes within constant time intervals. From the Gutenberg-Richter and the Omori-Utsu laws and a conventional ground motion prediction equation (GMPE), I derived this interval maximum amplitude (IMA) follows the Frechet distribution (or type Ⅱ extreme-value distribution). Using the Monte-Carlo based approach, I certified that this distribution is well applicable to IMAs and available for forecasting maximum amplitudes even if many seismograms are overlapped.

Applying the Frechet distribution to the first 3 hour-long seismograms of the 2008 Iwate-Miyagi Nairiku earthquake (MW 6.9), Japan, I computed the EPMAs for 4 days at 4 stations. The maximum amplitudes due to experienced aftershocks proceeded following mostly within the 10 % to 90 % EPMA curves. This performance may be acceptable for a practical use.

Differently from the catalog-based method, the proposed method is almost unaffected by overlap of seismograms even in early lapse times. Since it is based on a single station processing, even seismic “network” is not required, and can be easily deployed at locations of poor seismic network coverage. So far, this method is correctly applicable for typical mainshock-aftershock (Omori-Utsu-like) sequence only. However, potentially, it could be extended to multiple sequences including secondary aftershocks and remotely triggered earthquakes.

How to cite: Sawazaki, K.: Extreme value statistics of interval maximum amplitudes due to aftershocks and its application for early forecast, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-1019,, 2021.

Fatemeh Jalayer and Hossein Ebrahimian

On Sunday November 12, 2017, at 18:18:16 UTC, (21:48:16 local time), a strong earthquake with Mw7.3 occurred in western Iran in the border region between Iran and Iraq in vicinity of the Sarpol-e Zahab town. Unfortunately, this catastrophic seismic event caused 572 causalities, thousands of injured and vast amounts of damage to the buildings, houses and infrastructures in the epicentral area. The mainshock of this seismic sequence was felt in the entire western and central provinces of Iran and surrounding areas. The main event was preceded by a foreshock with magnitude 4.5 about 43 minutes before the mainshock that warned the local residence to leave their home and possibly reduced the number of human casualties. More than 2500 aftershocks with magnitude greater than 2.5 have been reported up to January 2019 with the largest registered aftershock of Mw6.4. A novel and fully-probabilistic procedure is adopted for providing spatio-temporal predictions of aftershock occurrence in a prescribed forecasting time interval (in the order of hours or days). The procedure aims at exploiting the information provided by the ongoing seismic sequence in quasi-real time. The versatility of the Bayesian inference is exploited to adaptively update the forecasts based on the incoming information as it becomes available. The aftershock clustering in space and time is modelled based on an Epidemic Type Aftershock Sequence (ETAS). One of the main novelties of the proposed procedure is that it considers the uncertainties in the aftershock occurrence model and its model parameters. This is done by moving within a framework of robust reliability assessment which enables the treatment of uncertainties in an integrated manner. Pairing up the Bayesian robust reliability framework and the suitable simulation schemes (Markov Chain Monte Carlo Simulation) provides the possibility of performing the whole forecasting procedure with minimum (or no) need of human interference. The fully simulation-based procedure is examined for both Bayesian model updating of ETAS spatio-temporal model and robust operational forecasting of the number of events of interest expected to happen in various time intervals after main events within the sequence. The seismicity is predicted within a confidence interval from the mean estimate.

How to cite: Jalayer, F. and Ebrahimian, H.: Operational Aftershock Forecasting for 2017-2018 Seismic Sequence in Western Iran, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-15824,, 2021.

Simone Mancini, Margarita Segou, and Maximilian J. Werner

Artificial intelligence methods are revolutionizing modern seismology by offering unprecedentedly rich seismic catalogs. Recent developments in short-term aftershock forecasting show that Coulomb rate-and-state (CRS) models hold the potential to achieve operational skills comparable to standard statistical Epidemic-Type Aftershock Sequence (ETAS) models, but only when the near real-time data quality allows to incorporate a more detailed representation of sources and receiver fault populations. In this framework, the high-resolution reconstructions of the seismicity patterns introduced by machine-learning-derived earthquake catalogs represent a unique opportunity to test whether they can be exploited to improve the predictive power of aftershock forecasts.

Here, we present a retrospective forecast experiment on the first year of the 2016-2017 Central Italy seismic cascade, where seven M5.4+ earthquakes occurred between a few hours and five months after the initial Mw 6.0 event, migrating over a 60-km long normal fault system. As target dataset, we employ the best available high-density machine learning catalog recently released for the sequence, which reports ~1 million events in total (~22,000 with M ≥ 2).

First, we develop a CRS model featuring (1) rate-and-state variables optimized on 30 years of pre-sequence regional seismicity, (2) finite fault slip models for the seven mainshocks of the sequence, (3) spatially heterogeneous receivers informed by pre-existing faults, and (4) updating receiver fault populations using focal planes gradually revealed by aftershocks. We then test the effect of considering stress perturbations from the M2+ events. Using the same high-precision catalog, we produce a standard ETAS model to benchmark the stress-based counterparts. All models are developed on a 3D spatial grid with 2 km spacing; they are updated daily and seek to forecast the space-time occurrence of M2+ seismicity for a total forecast horizon of one year. We formally rank the forecasts with the statistical scoring metrics introduced by the Collaboratory for the Study of Earthquake Predictability and compare their performance to a generation of CRS and ETAS models previously published for the same sequence by Mancini et al. (2019), who used solely real-time data and a minimum triggering magnitude of M=3.

We find that considering secondary triggering effects from events down to M=2 slightly improves model performance. While this result highlights the importance of better seismic catalogs to model local triggering mechanisms, it also suggests that to appreciate their full potential future modelling efforts will likely have to incorporate also fine-scale rupture characterizations (e.g., smaller source fault geometries retrieved from enhanced focal mechanism catalogs) and introduce denser spatial model discretizations.

How to cite: Mancini, S., Segou, M., and Werner, M. J.: Do Enhanced Seismicity Catalogs Improve Aftershock Forecasts? A Test on the 2016-2017 Central Italy Earthquake Cascade, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-14716,, 2021.

Shubham Sharma, Shyam Nandan, and Sebastian Hainzl

Currently, the Epidemic Type Aftershock Sequence (ETAS) model is state-of-the-art for forecasting aftershocks. However, the under-performance of ETAS in forecasting the spatial distribution of aftershocks following a large earthquake make us adopt alternative approaches for the modelling of the spatial ETAS-kernel. Here we develop a hybrid physics and statics based forecasting model. The model uses stress changes, calculated from inverted slip models of large earthquakes, as the basis of the spatial kernel in the ETAS model in order to get more reliable estimates of spatiotemporal distribution of aftershocks. We evaluate six alternative approaches of stress-based ETAS-kernels and rank their performance against the base ETAS model. In all cases, an expectation maximization (EM) algorithm is used to estimate the ETAS parameters. The model approach has been tested on synthetic data to check if the known parameters can be inverted successfully. We apply the proposed method to forecast aftershocks of mainshocks available in SRCMOD database, which includes 192 mainshocks with magnitudes in the range between 4.1 and 9.2 occurred from 1906 to 2020. The probabilistic earthquake forecasts generated by the hybrid model have been tested using established CSEP test metrics and procedures. We show that the additional stress information, provided to estimate the spatial probability distribution, leads to more reliable spatiotemporal ETAS-forecasts of aftershocks as compared to the base ETAS model.

How to cite: Sharma, S., Nandan, S., and Hainzl, S.: Testing physics and statics based hybrid ETAS models, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-5157,, 2021.

Hamed Ali Diab-Montero, Meng Li, Ylona van Dinther, and Femke C Vossepoel

Our ability to forecast earthquake events is hampered by limited information of the state of stress and strength of faults and their governing parameters. Ensemble data assimilation methods provide a means to estimate these variables by combining physics-based models and observations taking into account their uncertainties. In this study, we estimate earthquake occurrences in synthetic experiments representing a meter-scale laboratory setup of a straight-fault governed by rate-and-state friction. We test an Ensemble Kalman Filter implemented in the Parallel Data Assimilation Framework, which is connected with a 1D forward model using the numerical library GARNET. A perfect-model test shows that the filter can estimate shear stresses, slip rates and state θ acting on the fault even when simulating slip rates up to m/s and can thus be used for estimating earthquake occurrences. We assimilate shear stress and slip-rate observations, representing measurements obtained from shear strain gauges and piezoelectric transducers sensors, and their uncertainties acquired at a small distance to the fault in the homogeneous elastic medium. In this study we evaluate how the Ensemble Kalman filter estimates the state and strength of the faults using these observations, and assess the relative influence of assimilating various observations. The results suggest that the data assimilation improves the estimated timing of the earthquake occurrences. The assimilation of the shear stress observed in the medium improves in particular the estimates of the state θ and the shear stress on the fault, while assimilating observations of velocity in the medium improves the slip-rate estimation.

How to cite: Diab-Montero, H. A., Li, M., van Dinther, Y., and Vossepoel, F. C.: Ensemble Kalman Filter estimates shear stress and seismic slip rates in synthetic laboratory experiment, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-6836,, 2021.

Stefan Weginger

The Seismological Service of earthquakes Austria located at the Zentralanstalt für Meteorologie und Geodynamik (ZAMG; is the federal agency for monitoring and seismic hazard assessments in Austria. The ZAMG seismological network currently consists of approximately 60 Stations (28 broadband and ~30 strong motion). The increasing number of earthquake records during the past 20 years - from the local seismic network with addition of surrounded stations from the EIDA archive - are used to determinate the local attenuation of seismic shear wave energies and the site amplifications.

The propagation path of the released seismic energy through the uppermost crust has a considerable influence on the ground movement recorded at the surface. This contribution follows the known method of combining a physical modeling and a statistical approach by separating the source, path and site effects by means of inversion. A geometrical spreading model defines the frequency-independent decay of energy while the frequency dependent decay is parameterized by the parameters Q and Kappa. The moment magnitudes and stress drops of small earthquakes were determined, and a scaling-relationship of local to moment magnitudes could be established.

Taking these site-effects into consideration reduces the uncertainties in the determination of empirical ground motion prediction equations (GMPE) and increases the accuracy of seismic hazard assessments, ShakeMaps and On-Site systems.

How to cite: Weginger, S.: Determination of site amplifications and moment magnitudes of East-Alpine earthquakes, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-5379,, 2021.

Marisol Monterrubio-Velasco, J. Carlos Carrasco-Jimenez, Otilio Rojas, Juan E. Rodriguez, David Modesto, and Josep de la Puente

After large magnitude earthquakes have been recorded, a crucial task for hazard assessment is to quickly estimate Ground Shaking (GS) intensities at the affected region. Urgent physics-based earthquake simulations using High-Performance Computing (HPC) facilities may allow fast GS intensity analyses but are very sensitive to source parameter values. When using fast estimates of source parameters such as magnitude, location, fault dimensions, and/or Centroid Moment Tensor (CMT), simulations are prone to errors in their computed GS. Although the approaches to estimate earthquake location and magnitude are consolidated, depth location estimates are largely uncertain. Moreover, automatic CMT solutions are not always provided by seismological agencies, or such solutions are available at later times after waveform inversions allow the determination of moment tensor components. The uncertainty on these parameters, especially a few minutes after the earthquake has been registered, strongly affects GS maps resulting from simulations.

In this work, we present a workflow prototype to produce an uncertainty quantification method as a function of the source parameters. The core of this workflow is based on Machine Learning (ML) techniques. As a study case, we consider a domain of 110x80 km centered in 63.9ºN-20.6ºW in Southern Iceland, where the 17 best-mapped faults have hosted the historical events of the largest magnitude. We generate synthetic GS intensity maps using the AWP-ODC finite-difference code for earthquake simulation and a one-dimensional velocity model, with 40 recording surface stations. By varying a few source parameters (e.g. event magnitude, CMT, and hypocenter location), we finally model tens of thousands of hypothetical earthquakes. Our ML analog will then be able to relate GS intensity maps to source parameters, thus simplifying sensitivity studies.

Additionally, the results of this workflow prototype will allow us to obtain ML-based intensity maps a few seconds after an earthquake occurs exploiting the predictive power of ML techniques. We will evaluate the accuracy of these maps as standalone complements to GMPEs and simulations.

How to cite: Monterrubio-Velasco, M., Carrasco-Jimenez, J. C., Rojas, O., Rodriguez, J. E., Modesto, D., and de la Puente, J.: Source Parameter Sensitivity of Earthquake Simulations assisted by Machine Learning , EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-5995,, 2021.