Smart monitoring and observation systems for natural hazards, including satellites, seismometers, global networks, unmanned vehicles (e.g., UAV), and other linked devices, have become increasingly abundant. With these data, we observe the restless nature of our Earth and work towards improving our understanding of natural hazard processes such as landslides, debris flows, earthquakes, floods, storms, and tsunamis. The abundance of diverse measurements that we have now accumulated presents an opportunity for earth scientists to employ statistically driven approaches that speed up data processing, improve model forecasts, and give insights into the underlying physical processes. Such big-data approaches are supported by the wider scientific, computational, and statistical research communities who are constantly developing data science and machine learning techniques and software. Hence, data science and machine learning methods are rapidly impacting the fields of natural hazards and seismology. In this session, we will see research from natural hazards and seismology for processes over a broad range of time and spatial scales.

Dr. Pui Anantrasirichai of the University of Bristol, UK will give the invited presentation:
Application of Deep Learning to Detect Ground Deformation in InSAR Data

Co-organized by ESSI2/GI2/GM2/HS12/NP4/SM1
Convener: Hui Tang | Co-conveners: Kejie Chen, Stephanie Olen, Fabio Corbi, Jannes Münchmeyer
| Attendance Wed, 06 May, 08:30–10:15 (CEST)

Chat time: Wednesday, 6 May 2020, 08:30–10:15

D2362 |
Pui Anantrasirichai, Juliet Biggs, Fabien Albino, and David Bull

Satellite interferometric synthetic aperture radar (InSAR) can be used for measuring surface deformation for a variety of applications. Recent satellite missions, such as Sentinel-1, produce a large amount of data, meaning that visual inspection is impractical. Here we use deep learning, which has proved successful at object detection, to overcome this problem. Initially we present the use of convolutional neural networks (CNNs) for detecting rapid deformation events, which we test on a global dataset of over 30,000 wrapped interferograms at 900 volcanoes. We compare two potential training datasets: data augmentation applied to archive examples and synthetic models. Both are able to detect true positive results, but the data augmentation approach has a false positive rate of 0.205% and the synthetic approach has a false positive rate of 0.036%. We then present an enhanced technique for measuring slow, sustained deformation over a range of scales from volcanic unrest to urban sources of deformation such as coalfields. By rewrapping cumulative time series, the detection performance is improved when the deformation rate is slow, as more fringes are generated without altering the signal-to-noise ratio. We adapt the method to use persistent scatterer InSAR data, which is sparse in nature, by using spatial interpolation methods such as modified matrix completion. Finally, future perspectives for machine learning applications on InSAR data will be discussed.
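The rewrapping idea can be illustrated with a minimal sketch. It is not the authors' implementation: the radar wavelength (about 5.55 cm, C-band), the function names, and the 1 cm/yr deformation rate are assumptions made for illustration. The sketch converts a cumulative line-of-sight displacement to interferometric phase, wraps it into [-pi, pi), and counts full fringes, showing why a slow signal accumulated over a longer time span produces more fringes.

```python
import math

WAVELENGTH_M = 0.0555  # assumed C-band radar wavelength (about 5.55 cm)

def wrap_phase(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

def displacement_to_wrapped_phase(d_m, wavelength_m=WAVELENGTH_M):
    """Convert line-of-sight displacement (m) to wrapped two-way phase."""
    return wrap_phase(4 * math.pi * d_m / wavelength_m)

def fringe_count(d_m, wavelength_m=WAVELENGTH_M):
    """Full fringes: one fringe per half wavelength of displacement."""
    return int(abs(d_m) // (wavelength_m / 2))

# Hypothetical slow deformation of 1 cm/yr, accumulated over 1 and 6 years.
rate_m_per_yr = 0.01
fringes_1yr = fringe_count(rate_m_per_yr * 1)
fringes_6yr = fringe_count(rate_m_per_yr * 6)
```

With these assumed numbers, a single year of deformation stays below one fringe, whereas the six-year cumulative displacement spans two full fringes, which is the kind of pattern a detector trained on wrapped interferograms can respond to.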

How to cite: Anantrasirichai, P., Biggs, J., Albino, F., and Bull, D.: Application of Deep Learning to Detect Ground Deformation in InSAR Data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-4146, https://doi.org/10.5194/egusphere-egu2020-4146, 2020.

D2363 |
Natalia Galina, Nikolai Shapiro, Leonard Seydoux, and Dmitry Droznin

Kamchatka is an active subduction zone that exhibits intense seismic and volcanic activity. As a consequence, tectonic and volcanic earthquakes are often recorded nearly simultaneously at the same station. In this work, we consider seismograms recorded between December 2018 and April 2019. During this period, an M=7.3 earthquake followed by an aftershock sequence occurred nearly simultaneously with a strong eruption of Shiveluch volcano. As a result, stations of the Kamchatka seismic monitoring network recorded up to several hundred earthquakes per day. In total, we detected almost 7000 events of different origin using a simple automatic detection algorithm based on signal envelope amplitudes. Then, for each detection, different features were extracted. We started from simple signal parameters (amplitude, duration, peak frequency, etc.), then unsmoothed and smoothed spectra, and finally used a multi-dimensional signal decomposition (scattering coefficients). For event classification, both unsupervised (K-means, agglomerative clustering) and supervised (Support Vector Classification, Random Forest) classic machine learning techniques were applied to all types of extracted features. The results are quite stable and do not vary significantly with the choice of features and method. The machine learning approaches therefore allow us to clearly separate tectonic subduction-zone earthquakes from those associated with the Shiveluch volcano eruptions based on data from a single station.
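As a rough illustration of the unsupervised step, the sketch below runs plain Lloyd's K-means on a handful of two-dimensional feature vectors. The feature values (peak frequency, duration), the initial centroids, and the function name are made up for illustration and are not taken from the study; in practice features would normally be standardised first and a library implementation would be used.

```python
import math

def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's k-means on feature tuples; returns (labels, centroids).

    `centroids` holds the initial cluster centres (hypothetical values).
    """
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of the members; keep the old centre if empty.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Made-up (peak frequency in Hz, duration in s) pairs, for illustration only.
features = [(1.0, 40.0), (1.2, 38.0), (0.9, 42.0),
            (6.0, 10.0), (6.5, 12.0), (5.8, 9.0)]
labels, centres = kmeans(features, centroids=[features[0], features[3]])
```

The two groups here are separated by construction; whether real tectonic and volcanic events separate this cleanly depends on the features, which is what the abstract reports testing.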

How to cite: Galina, N., Shapiro, N., Seydoux, L., and Droznin, D.: Classification of volcanic and tectonic earthquakes in Kamchatka (Russia) with different machine learning techniques, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-755, https://doi.org/10.5194/egusphere-egu2020-755, 2020.

D2364 |
Chia Yu Wang, Ting Chung Huang, and Yih Min Wu

On-site Earthquake Early Warning (EEW) systems estimate possible destructive S-waves based on initial P-waves and issue warnings before large shaking arrives. On-site EEW plays a crucial role in filling the “blind zone” of regional EEW systems near the epicenter, which often suffers from the most disastrous ground shaking. Previous studies show that peak P-wave displacement amplitude (Pd) may provide a possible indicator of destructive earthquakes. However, the attempt to use a single indicator with fixed thresholds suffers from inevitable misfits, since the diversity in travel paths and site effects for different stations introduces complex nonlinearities. To overcome this problem, we present a deep learning approach using Long Short-Term Memory (LSTM) neural networks. By utilizing the properties of multi-layered LSTM, we are able to train a highly non-linear neural network that takes the initial waveform as input and gives an alert probability as the output at every time step. It is then tested with several major earthquake events, giving a missed alarm rate of less than 0.03 percent and a false alarm rate of less than 15 percent. Our model shows promising outcomes in reducing both missed alarms and false alarms while also providing an improved warning time for hazard mitigation procedures.

How to cite: Wang, C. Y., Huang, T. C., and Wu, Y. M.: A LSTM Neural Network for On-site Earthquake Early Warning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3696, https://doi.org/10.5194/egusphere-egu2020-3696, 2020.

D2365 |
Giuseppe Costantino, Mauro Dalla Mura, David Marsan, Sophie Giffard-Roisin, Mathilde Radiguet, and Anne Socquet

The deployment of increasingly dense geophysical networks in many geologically active regions on the Earth has made it possible to reveal deformation signals that were not detectable beforehand. Examples of these newly discovered signals are those associated with low-frequency earthquakes, which can be linked with the slow slip (aseismic slip) of faults. Aseismic fault slip is a crucial phenomenon as it might play a key role in the precursory phase before large earthquakes (in particular in subduction zones), during which both the seismicity rate and the ground deformation grow. Geodetic measurements, e.g. from the Global Positioning System (GPS), are capable of tracking surface deformation transients likely induced by an episode of slow slip. However, very little is known about the mechanisms underlying this precursory phase, in particular regarding how slow slip and seismicity relate.

The analysis done in this work focuses on recordings acquired by the Japan Meteorological Agency in the Boso area, Japan. In the Boso peninsula, interactions between seismicity and slow slip events can be observed over different time spans: regular slow slip events occur every 4 to 5 years, lasting about 10 days, and are associated with a burst of seismicity (Hirose et al. 2012, 2014, Gardonio et al. 2018), whereas an accelerated seismicity rate has been observed over decades that is likely associated with an increasing shear stress rate (i.e., tectonic loading) on the subduction interface (Ozawa et al. 2014, Reverso et al. 2016, Marsan et al. 2017).

This work aims to explore the potential of Deep Learning for better characterizing the interplay between seismicity and ground surface deformation. The analysis is based on a data-driven approach for building a model to assess whether a link between seismicity and surface deformation exists and to characterize the nature of this link. This has potentially strong implications, as (small) earthquakes are the prime observable, so that better understanding the seismicity rate response to potentially small slow slip (so far undetected by GPS) could help monitor those small slow slip events. The statistical problem is expressed as a regression between some features extracted from the seismic data and the GPS displacements registered at one or more stations.

The proposed method, based on a Long Short-Term Memory (LSTM) neural network, has been designed so that it is possible to estimate which features are more relevant in the estimation process. From a geophysical point of view, this can provide interesting insights for validating the results, assessing the robustness of the algorithms and understanding the underlying process. This kind of approach represents a novelty in this field, since it opens original perspectives for the joint analysis of seismic / aseismic phenomena with respect to traditional methods based on more classical geophysical data exploration.

How to cite: Costantino, G., Dalla Mura, M., Marsan, D., Giffard-Roisin, S., Radiguet, M., and Socquet, A.: Towards assessing the link between slow slip and seismicity with a Deep Learning approach, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-20009, https://doi.org/10.5194/egusphere-egu2020-20009, 2020.

D2366 |
Artemii Novoselov, Gerrit Hein, Goetz Bokelmann, and Florian Fuchs

Any time series can be represented as a sum of sine waves with the help of the Fourier transform. But such a transformation doesn’t answer whether the signal is coming from one source or several, nor does it allow separation of such sources. In this work, we present a technique from the Machine Learning domain, called autoencoders, which utilizes the ability of a neural network to generate signals from the latent space. This in turn allows us to identify signals from an arbitrary number of sources and generate them as separate waveforms without any loss. We took ground motion records of passing trains and trams in the vicinity of the University of Vienna and trained the network to produce “clean” individual signals from “mixed” waveforms. This work proves the concept and steers the direction for further research of earthquake-induced source separation. It also benefits interference seismometry, since “noise” used for such research can be separated from the signal, thus reducing manual processing (cutting and clipping signals) of seismic records.

How to cite: Novoselov, A., Hein, G., Bokelmann, G., and Fuchs, F.: TraML: separation of seismically-induced ground-motion signals with Autoencoder architecture, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-1484, https://doi.org/10.5194/egusphere-egu2020-1484, 2020.

D2367 |
Jiahua Zhao, Miaki Ishii, Hiromi Ishii, and Thomas Lee

Analog seismograms contain rich and valuable information over nearly a century. However, these analog seismic records are difficult to analyze quantitatively using modern techniques that require digital time series. At the same time, because these seismograms are deteriorating with age and need substantial storage space, their future has become uncertain. Conversion of the analog seismograms to digital time series will allow more conventional access and storage of the data as well as making them available for exciting scientific discovery. The digitization software, DigitSeis, reads a scanned image of a seismogram and generates digitized and timed traces, but the initial step of recognizing trace and time mark segments, as well as other features such as hand-written notes, within the image poses certain challenges. Armed with manually processed analyses of image classification, we aim to automate this process using machine learning algorithms. Semantic segmentation methods have made breakthroughs in many fields. In order to solve the problem of accurate classification of scanned images of analog seismograms, we develop and test an improved deep convolutional neural network based on U-Net, Improved U-Net, and a deeper network segmentation method that adds residual blocks, ResU-Net. The two segmentation objects are the traces and time marks in scanned images, and the goal is to train a binary classification model for each type of segmentation object, i.e., there are two models, one for trace objects and another for time mark objects, for each of the neural networks. The networks are trained on 300 images of the digitized results of analog seismograms from the Harvard-Adam Dziewoński Observatory from 1939. Application of the algorithms to a test data set results in a pixel accuracy (PA) for the Improved U-Net of 95% for traces and nearly 100% for time marks, with Intersection over Union (IoU) of 79% and 75% for traces and time marks, respectively. The PA of ResU-Net is 97% and nearly 100% for traces and time marks, with IoU of 83% and 74%. These experiments show that Improved U-Net is more effective for semantic segmentation of time marks, while ResU-Net is more suitable for traces. In general, both network models work well in separating and identifying objects, and provide a significant step forward in nearly automating the digitization of analog seismograms.
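The two reported metrics can be computed as in the following sketch for binary masks. The tiny masks and function names are invented for illustration; the abstract does not state how its metrics were implemented.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the true label."""
    total = correct = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            total += 1
            correct += int(p == t)
    return correct / total

def intersection_over_union(pred, truth):
    """IoU of the foreground class (label 1) for two binary masks."""
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0

# Hypothetical 3 x 4 masks: 1 marks a trace pixel, 0 marks background.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]

pa = pixel_accuracy(pred, truth)             # 10 of 12 pixels agree
iou = intersection_over_union(pred, truth)   # 3 shared of 5 marked pixels
```

Because background pixels count towards PA but not towards IoU, a sparse class can reach a PA close to 100% while its IoU stays noticeably lower, which is consistent with the pattern of the time mark figures quoted above.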

How to cite: Zhao, J., Ishii, M., Ishii, H., and Lee, T.: Application of Improved U-Net and ResU-Net Based Semantic Segmentation Method for Digitization of Analog Seismograms, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-4294, https://doi.org/10.5194/egusphere-egu2020-4294, 2020.

D2368 |
Jannes Münchmeyer, Dino Bindi, Ulf Leser, and Frederik Tilmann

The key task of earthquake early warning is to provide timely and accurate estimates of the ground shaking at target sites. Current approaches use either source or propagation based methods. Source based methods calculate fast estimates of the earthquake source parameters and apply ground motion prediction equations to estimate shaking. They suffer from saturation effects for large events, simplified assumptions and the need for a well known hypocentral location, which usually requires arrivals at multiple stations. Propagation based methods estimate levels of shaking from the shaking at neighboring stations and therefore have short warning times and possibly large blind zones. Both methods only use specific features from the waveform. In contrast, we present a multi-station neural network method to estimate horizontal peak ground acceleration (PGA) anywhere in the target region directly from raw accelerometer waveforms in real time.

The three main components of our model are a convolutional neural network (CNN) for extracting features from the single-station three-component accelerograms, a transformer network for combining features from multiple stations and for transferring them to the target site features and a mixture density network to generate probabilistic PGA estimates. By using a transformer network, our model is able to handle a varying set and number of stations as well as target sites. We train our model end-to-end using recorded waveforms and PGAs. We use data augmentation to enable the model to provide estimations at targets without waveform recordings. Starting with the arrival of a P wave at any station of the network, our model issues real-time predictions at each new sample. The predictions are Gaussian mixtures, giving estimates of both expected value and uncertainties. The model can be used to predict PGA at specific target sites, as well as to generate ground motion maps.
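To show what a Gaussian-mixture prediction provides, the sketch below computes the expected value, standard deviation, and the probability of exceeding a threshold for a one-dimensional mixture. The mixture parameters, the threshold, and the function names are hypothetical; the sketch does not reproduce the model's actual output layer or the units in which PGA is predicted.

```python
import math

def mixture_mean_std(weights, means, stds):
    """Mean and standard deviation of a 1-D Gaussian mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (s * s + m * m)
                        for w, m, s in zip(weights, means, stds))
    return mean, math.sqrt(second_moment - mean * mean)

def exceedance_probability(threshold, weights, means, stds):
    """P(X > threshold): weighted sum of Gaussian upper-tail probabilities."""
    return sum(
        w * 0.5 * math.erfc((threshold - m) / (s * math.sqrt(2)))
        for w, m, s in zip(weights, means, stds)
    )

# Hypothetical two-component prediction (arbitrary units).
weights, means, stds = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
mean, std = mixture_mean_std(weights, means, stds)
p_exceed = exceedance_probability(0.0, weights, means, stds)
```

An alert rule could then compare `p_exceed` with a chosen probability level, so that lowering the level trades more false alerts for fewer missed detections.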

We analyze the model on two strong motion data sets from Japan and Italy in terms of standard deviation and lead times. Through the probabilistic predictions we are able to give lead times for different levels of uncertainty and ground shaking. This allows us to control the ratio of missed detections to false alerts. Preliminary analysis suggests that for levels between 1%g and 10%g our model achieves multi-second lead times even for the closest stations at a false-positive rate below 25%. For an example event at 50 km depth, lead times at the closest stations with epicentral distances below 20 km are 6 s and 7.5 s. This suggests that our model is able to effectively use the difference between P and S travel time and accurately assess the future level of ground shaking from the first parts of the P wave. It additionally makes effective use of the information contained in the absence of signal at other stations.

How to cite: Münchmeyer, J., Bindi, D., Leser, U., and Tilmann, F.: End-to-end PGA estimation for earthquake early warning using transformer networks, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5107, https://doi.org/10.5194/egusphere-egu2020-5107, 2020.

D2369 |
Brydon Lowney, Ivan Lokmer, Gareth Shane O'Brien, and Christopher Bean

Diffractions are a useful aspect of the seismic wavefield and are often underutilised. By separating the diffractions from the rest of the wavefield they can be used for various applications such as velocity analysis, structural imaging, and wavefront tomography. However, separating the diffractions is a challenging task due to the comparatively low amplitudes of diffractions as well as the overlap between reflection and diffraction energy. Whilst there are existing analytical methods for separation, these act to remove reflections, leaving a volume which contains diffractions and noise. On top of this, analytical separation techniques can be costly computationally as well as requiring manual parameterisation. To alleviate these issues, a deep neural network has been trained to automatically identify and separate diffractions from reflections and noise on pre-migration data.

Here, a Generative Adversarial Network (GAN) has been trained for the automated separation. This is a type of deep neural network architecture which contains two neural networks that compete against one another. One neural network acts as a generator, creating new data which appears visually similar to the real data, while a second neural network acts as a discriminator, trying to identify whether the given data is real or fake. As the generator improves, so too does the discriminator, giving a deeper understanding of the data. To avoid overfitting to a specific dataset as well as to improve the cross-data applicability of the network, data from several different seismic datasets from geologically distinct locations has been used in training. When comparing a network trained on a single dataset with one trained on several datasets, it is seen that providing additional data improves the separation on both the original and new datasets.

The automatic separation technique is then compared with a conventional, analytical separation technique: plane-wave destruction (PWD). The computational cost of the GAN separation is vastly lower than that of PWD, performing a separation on a 3-D dataset in minutes rather than hours. Although in some complex areas the GAN separation is of a higher quality than the PWD separation, as it does not rely on the dip, there are also areas where the PWD outperforms the GAN separation. The GAN may be enhanced by adding more training data as well as by improving the initial separation used to create the training data, which is based around PWD and thus is imperfect and can introduce bias into the network. One possibility is training the GAN entirely on synthetic data, which allows for a perfect separation as the points are known. However, the synthetic data must be of sufficient volume for training and of sufficient quality for real data applicability.

How to cite: Lowney, B., Lokmer, I., O'Brien, G. S., and Bean, C.: Classification and Separation of Diffraction Energy on Pre-Migration Seismic Data using Deep Learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5376, https://doi.org/10.5194/egusphere-egu2020-5376, 2020.

D2370 |
Josipa Majstorović and Piero Poli

Machine learning (ML) algorithms have already found their application in standard seismological procedures, such as earthquake detection and localization, phase picking, earthquake early warning systems, etc. They are progressively becoming superior methods since one can rapidly scan voluminous data and detect earthquakes, even if buried in highly noisy time series.

We here make use of ML algorithms to obtain more complete near-fault seismic catalogs and thus better understand the long-term (decades) evolution of seismicity before large earthquakes. We focus on data recorded before the devastating L’Aquila earthquake (6 April 2009 01:32 UTC, Mw6.3) right beneath the city of L’Aquila in the Abruzzo region (Central Italy). Before this event only a sparse set of stations was available, reducing the magnitude completeness of standard catalogs.

We adapted existing convolutional neural networks (CNN) for earthquake detection, localisation and characterization using single-station waveforms. The CNN is applied to 29 years of data (1990 to 2019) recorded at the AQU station, located near the city of L’Aquila (Italy). The pre-existing catalog maintained by the Istituto Nazionale di Geofisica e Vulcanologia is used to define labels and to train and test the CNN. We are here interested in classifying the continuous three-component waveforms into four categories: noise/earthquakes, distance (location), magnitude and depth, where each category consists of several nodes. Existing seismic catalogs are used to label earthquakes, while the noise events are randomly selected between the catalog events, evenly represented by daytime and night-time periods.

We prefer CNN over other methods, since we can use seismograms directly with very minor pre-processing (e.g. filtering) and we do not need any prior knowledge of the region.

How to cite: Majstorović, J. and Poli, P.: Extending near fault earthquakes catalogs using convolutional neural network and single-station waveforms, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5686, https://doi.org/10.5194/egusphere-egu2020-5686, 2020.

D2371 |
Han-Saem Kim, Chang-Guk Sun, Hyung-Ik Cho, and Moon-Gyo Lee

Earthquake-induced land deformation and structure failure are more severe over soft soils than over firm soils and rocks owing to the seismic site effect and liquefaction. The site-specific seismic site effect related to the amplification of ground motion has spatial uncertainty depending on the local subsurface, surface geological, and topographic conditions. When the 2017 Pohang earthquake (M 5.4), South Korea’s second-strongest earthquake in decades, occurred, severe damage influenced by variable site effect indicators was observed, concentrated in the basin or basin-edge regions where unconsolidated Quaternary sediments are deposited. Thus, site characterization is essential, considering empirical correlations with geotechnical site response parameters and surface proxies. Furthermore, in the case of many variables and tenuously related correlations, machine learning classification models can prove more precise than parametric methods. In this study, a multivariate seismic site classification system was established using machine learning techniques based on a geospatial big data platform.

Supervised machine learning classification techniques, specifically random forest, support vector machine (SVM), and artificial neural network (ANN) algorithms, have been adopted. Supervised machine learning algorithms analyze a set of labeled training data consisting of a set of input data and desired output values, and produce an inferred function which can be used for predictions from given input data. To optimize the classification criteria by considering the geotechnical uncertainty and local site effects, the training datasets, to which principal component analysis (PCA) was applied, were verified with k-fold cross-validation. Moreover, the optimized training algorithm, as assessed by loss estimators (receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC)) based on the confusion matrix, was selected.

For the southeastern region of South Korea, the boring log information (strata, standard penetration test, etc.), geological map (1:50k scale), digital terrain model (5 m × 5 m), and soil map (1:250k scale) were collected and constructed as geospatial big data. As a preliminary step, to build spatially coincident datasets with geotechnical response parameters and surface proxies, the mesh-type geospatial information was built using advanced geostatistical interpolation and simulation methods.

Site classification systems use seismic response parameters related to the geotechnical characteristics of the study area as the classification criteria. The current site classification systems in South Korea and the United States recommend Vs30, which is the average shear wave velocity (Vs) of the upper 30 m of the ground. This criterion uses only the dynamic characteristics of the site without considering its geometric distribution characteristics. Thus, the geospatial information for the input layer included the geo-layer thickness, surface proxies (elevation, slope, geological category, soil category), average Vs for the soil layer (Vs,soil) and site period (TG). The Vs30-based site class was defined as categorical labeled data. Finally, the site class can be predicted using only proxies based on the optimized classification techniques.
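For reference, Vs30 is conventionally computed as a time-averaged velocity, i.e. 30 m divided by the vertical shear-wave travel time through the upper 30 m. The sketch below assumes a layered profile given as (thickness, Vs) pairs; the function name and the example profiles are hypothetical and not taken from the study.

```python
def vs30(layers):
    """Time-averaged shear wave velocity of the upper 30 m.

    `layers` is a list of (thickness_m, vs_m_per_s) pairs ordered from the
    surface downward; the profile is truncated at 30 m depth.
    """
    remaining = 30.0
    travel_time = 0.0
    for thickness, vs in layers:
        used = min(thickness, remaining)
        travel_time += used / vs  # vertical travel time through this layer
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("profile is shallower than 30 m")
    return 30.0 / travel_time

# Hypothetical profiles: soft soil over stiffer material.
profile_a = [(10.0, 200.0), (20.0, 400.0)]
profile_b = [(5.0, 150.0), (100.0, 600.0)]
vs30_a = vs30(profile_a)
vs30_b = vs30(profile_b)
```

Since the slow layers dominate the travel time, a thin soft layer lowers Vs30 more than a simple thickness-weighted mean of the velocities would suggest.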

How to cite: Kim, H.-S., Sun, C.-G., Cho, H.-I., and Lee, M.-G.: Machine Learning based Multivariate Seismic Site Classification System for South Korea, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-10937, https://doi.org/10.5194/egusphere-egu2020-10937, 2020.

D2372 |
Jaewon Yoo and Jaehun Ahn

Modelling and predicting seismic ground response is an important task; the results of ground response analysis are, in turn, used to assess liquefaction and the integrity of underground and upper structures. There has been extensive research and development on the modelling of seismic ground response, but there are often quite large differences between prediction and measurement. In this study, we attempt to train a model on input and output ground excitation data and make predictions based on it. To initiate this work, a deep learning network was trained on low level excitation data; the results showed a reasonable match with actual measurements.

ACKNOWLEDGEMENT: The authors would like to thank the Ministry of Land, Infrastructure, and Transport of the Korean government for the grant from the Technology Advancement Research Program (grant no. 20CTAP-C152100-02) and the Basic Science Research Program (grant no. 2017R1D1A3B03034563) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education.

How to cite: Yoo, J. and Ahn, J.: Data Driven Prediction of Seismic Ground Response under Low Level Excitation, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-12721, https://doi.org/10.5194/egusphere-egu2020-12721, 2020.

D2373 |
Luis Fernandez-Prieto, Antonio Villaseñor, and Roberto Cabieces

Ocean Bottom Seismometers (OBS) are the primary instruments used in the study of marine seismicity. Due to the characteristics of their emplacement on the sea bottom, these instruments have a much lower signal-to-noise ratio than land seismometers. Therefore, difficulties arise in the analysis of the data, especially when using automatic methods.

During recent years the use of machine learning methods applied to seismic signal analysis has increased significantly. We have developed a neural network algorithm that picks seismic body-wave signals, correctly identifying P and S waves with a precision higher than 98%. This network was trained using data from the Southern California Seismic Network and was applied satisfactorily to the analysis of data from Large-N experiments in different regions of Europe and Asia.

One of the remarkable characteristics of the network is its ability to identify noise, both in the case of seismic signals with a low signal-to-noise ratio and in the case of large-amplitude non-seismic signals, such as human-induced noise. This feature makes the network an optimal candidate for studying data recorded using OBS.

We have modified this neural network in order to analyze OBS data from different deployments. Combined with the use of an associator, we have successfully located events with a very low signal-to-noise ratio, achieving results with a precision comparable or superior to that of a human operator.

How to cite: Fernandez-Prieto, L., Villaseñor, A., and Cabieces, R.: Deep Learning P and S wave phase picking of Ocean Bottom Seismometer (OBS) data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-18818, https://doi.org/10.5194/egusphere-egu2020-18818, 2020.

D2374 |
Piero Poli and Josipa Majstorovic

The exponential growth of geophysical (in particular seismological) data in recent years has made it hard to quantitatively label (e.g. systematically separate earthquakes and noise) the daily and continuous stream of records. On the other hand, these data are likely to contain an enormous amount of information about the physical processes occurring inside our planet, including new and original signals that can shed light on new physics of the crustal rocks.

Of particular interest are data recorded near major faults, where one hopes to detect and discover new signals possibly associated with the precursory phase of significant and hazardous earthquakes.

With the above ideas in mind, we perform an unsupervised classification of 30 years of seismological data recorded at ~10 km from the L’Aquila fault (in Italy), which hosted a magnitude 6 event (in 2009) and still poses a significant hazard for the region.

We base our classification on daily spectra of three-component data and related spectral features. We then utilize a self-organizing map (SOM) to perform a crude clustering of the 30 years of data. The data reduction offered by the SOM permits rapid visualization of this large dataset (~11k spectra) and identification of the main spectral groups. In a further step, we test different clustering algorithms (including hierarchical ones) to isolate groups of records sharing similar features in a non-subjective manner. We believe that from the quantitative analysis (e.g. temporal evolution) of the retrieved clusters, the signature of fault physical processes (e.g. preparation of the magnitude 6 earthquake, in our case) can be retrieved. The newly detected signals will then be analyzed to learn more about the processes generating them.

How to cite: Poli, P. and Majstorovic, J.: Unsupervised classification of 30 years of near-fault seismological data: What can we learn about fault physics?, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-13442, https://doi.org/10.5194/egusphere-egu2020-13442, 2020.

D2375 |
Nikita Afonin and Elena Kozlovskaya

Some problems in solid Earth geophysics require the analysis of huge amounts of continuous seismic data. One such problem is the investigation of so-called frost quakes, or cryoseisms, in the Arctic caused by extreme weather events. Weather extremes such as a rapid temperature decrease in combination with thin snow cover can result in cracking of water-saturated soil and rock when the water suddenly freezes and expands. As cryoseisms can be hazardous for industrial and civil objects located in the near-field zone, their monitoring, together with analysis of the weather conditions during which they occur, is necessary to assess the hazard caused by extreme weather events. One of the important tasks in studying cryoseisms is the development of an efficient data processing routine capable of separating cryoseisms from other seismic events and noise in continuous seismic data. In our study, we present an algorithm for the identification of cryoseisms that is based on the classical STA/LTA algorithm for seismic event detection and a neural network for their classification using selected characteristics of the records.
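The detection stage can be sketched with a basic STA/LTA trigger: the ratio of a short-term to a long-term average of signal energy rises sharply when an event arrives. The window lengths, the threshold, the synthetic signal, and the function names below are illustrative assumptions, not the settings used in the study.

```python
def sta_lta(signal, n_sta, n_lta):
    """STA/LTA ratio of signal energy per sample (requires n_sta <= n_lta).

    The ratio is 0.0 until the long-term window is completely filled.
    """
    energy = [x * x for x in signal]
    # Prefix sums give each trailing-window average in constant time.
    csum = [0.0]
    for e in energy:
        csum.append(csum[-1] + e)
    ratio = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (csum[i + 1] - csum[i + 1 - n_sta]) / n_sta
        lta = (csum[i + 1] - csum[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio

def trigger_onsets(ratio, threshold):
    """Sample indices where the ratio rises to or above the threshold."""
    return [i for i in range(1, len(ratio))
            if ratio[i] >= threshold > ratio[i - 1]]

# Synthetic record: background noise, a short strong burst, then noise again.
signal = [1.0, -1.0] * 25 + [10.0, -10.0] * 5 + [1.0, -1.0] * 20
ratio = sta_lta(signal, n_sta=2, n_lta=20)
onsets = trigger_onsets(ratio, threshold=3.0)
```

Each triggered window would then be passed to the classification stage, where record characteristics such as central frequency and duration decide whether it is a cryoseism.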

To assess the characteristics of cryoseisms, we used 3-component recordings of a swarm of strong cryoseismic events with similar waveforms that were registered on 06.06.2016 by seismic station OUL in northern Finland. The strongest event from the swarm produced a fracture on the road surface and damaged basements of buildings in the municipality of Oulu. Assuming that all events in the swarm were caused by the same mechanism (freezing of water-saturated soil), we used them as a learning sample for the neural network. Analysis of these events has shown that most of them share many similarities in selected record characteristics (central frequencies, duration, etc.) with the strongest event and with each other. Application of this algorithm to the continuous seismic data recorded from the end of November 2015 to the end of February 2016 showed that the number of cryoseisms per day strongly correlates with variations of air temperature.
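The detection stage named above can be illustrated with a minimal sketch of a classical STA/LTA trigger. This is not the authors' code: the window lengths, the on/off thresholds, and the function names `sta_lta` and `detect_events` are assumptions chosen for illustration, and the neural-network classification stage is not shown.

```python
def sta_lta(signal, n_sta, n_lta):
    """Return the STA/LTA ratio of the squared signal for every sample.

    STA is the mean energy of the last n_sta samples, LTA the mean energy of
    the last n_lta samples (n_sta <= n_lta). The ratio stays at 0.0 until a
    full LTA window is available.
    """
    energy = [x * x for x in signal]
    # Prefix sums make every window average a constant-time lookup.
    prefix = [0.0]
    for e in energy:
        prefix.append(prefix[-1] + e)
    ratio = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def detect_events(ratio, on_threshold, off_threshold):
    """Return (start, end) sample indices: an event starts when the ratio
    reaches on_threshold and ends when it drops below off_threshold."""
    events = []
    start = None
    for i, r in enumerate(ratio):
        if start is None and r >= on_threshold:
            start = i
        elif start is not None and r < off_threshold:
            events.append((start, i))
            start = None
    if start is not None:  # event still open at the end of the trace
        events.append((start, len(ratio) - 1))
    return events


if __name__ == "__main__":
    # Synthetic trace: weak background with a short strong burst (illustrative only).
    trace = [0.1 * (-1) ** i for i in range(500)]
    for i in range(300, 320):
        trace[i] = 1.0 * (-1) ** i
    print(detect_events(sta_lta(trace, n_sta=5, n_lta=100), 4.0, 1.5))
```

In a workflow like the one described, each detected window would then be summarised by record characteristics (for example central frequency and duration) and passed to the classifier.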

How to cite: Afonin, N. and Kozlovskaya, E.: Development of events detector for monitoring cryoseisms in upper soils, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-7744, https://doi.org/10.5194/egusphere-egu2020-7744, 2020.

D2376 |
Josefine Umlauft, Philippe Roux, Florent Gimbert, Albanne Lecointre, Bertrand Rouet-LeDuc, Daniel Taylor Trugman, and Paul Allan Johnson

The cryosphere is a highly active and dynamic environment that rapidly responds to changing climatic conditions. The processes behind this response are poorly understood because they remain challenging to observe. Glacial dynamics are strongly intermittent in time and heterogeneous in space. Thus, monitoring with high spatio-temporal resolution is essential. In the course of the RESOLVE project, continuous seismic observations were obtained using a dense seismic network (100 nodes, Ø 700 m) installed on the Argentière Glacier (French Alps) in May 2018. This unique data set offers the chance to study targeted processes and dynamics within the cryosphere on a local scale in detail.

We apply classical beamforming within the array (matched field processing) and unsupervised machine learning techniques to identify, cluster and locate seismic sources in 5D (x, y, z, velocity, time). Sources located with high resolution and accuracy can be related to processes and activity within the ice body, e.g. the geometry and dynamics of crevasses or the interaction at the glacier/bedrock interface, depending on meteorological conditions such as daily temperature fluctuations or snowfall. Our preliminary results indicate strong potential in poorly resolved sources, which can be observed with statistical consistency and reveal new insights into structural features and physical properties of the glacier (e.g. analysis of scatterers).

How to cite: Umlauft, J., Roux, P., Gimbert, F., Lecointre, A., Rouet-LeDuc, B., Trugman, D. T., and Johnson, P. A.: Characterizing glacial processes applying classical beamforming and machine learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3493, https://doi.org/10.5194/egusphere-egu2020-3493, 2020.

D2377 |
Martin Rogers

East Anglia is particularly vulnerable to sea level rise, increases in storminess, coastal erosion, and coastal flooding. Critical national infrastructure (including Sizewell’s nuclear power stations and the Bacton gas terminals), population centres close to the coastal zone (> 600,000 in Norfolk and Suffolk) and iconic natural habitats (the Broads, attracting 7 million visitors a year) are under threat. Shoreline change, driven by complex interactions between environmental forcing factors and human shoreline modifications, is a key determinant of coastal vulnerability and exposure; its prediction is imperative for future coastal risk adaptation.

An automated, Python-supported tool has been developed to simultaneously extract the water and vegetation lines from satellite imagery. PlanetLab multispectral optical imagery is used to provide multi-year, frequent (up to fortnightly) images with 3–5 m spatial resolution. Net shoreline change (NSC) has been calculated along multiple stretches of the East Coast of England, most notably for areas experiencing varying rates of change in front of, and adjacent to, ‘hard’ coastal defences. The joint use of water and vegetation line proxies enables calculation of inter-tidal width variability alongside NSC. The image resolution used provides new opportunities for data-led approaches to the monitoring of shoreline response to storm events and/or human shoreline modification.

Artificial Neural Networks (ANN) have been trained to predict shoreline evolution until 2040. Early results are presented, alongside considerations surrounding data pre-processing and input parameter selection requirements. Training data comprise decadal-scale shoreline positions recovered using automated shoreline detection. Shoreline position, alongside databases of nearshore bathymetry, sea defences, artificial beach renourishment and nearshore processes (wave and tide gauge data, meteorological fields), combined with land cover, population and infrastructure data, act as inputs. Optimal input filtering and ANN configuration are derived using hindcasts.

The research is timely; ANN predictions are compared with the Anglian Shoreline Management Plans (SMPs), which identify locations at greatest risk and assign future risk management funding. The findings of this research will feed into future revisions of the plans.

How to cite: Rogers, M.: Exploiting satellite technology and machine learning to describe and predict hazardous shoreline change, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-276, https://doi.org/10.5194/egusphere-egu2020-276, 2020.

D2378 |
Melanie Brandmeier, Wolfgang Deigele, Zayd Hamdi, and Christoph Straub

Due to climate change, the number of storms and, thus, forest damage has increased over recent years. The state of the art of damage detection is manual digitization based on aerial images, which requires a great amount of work and time. There have been numerous attempts to automate this process in the past, such as change detection based on SAR and optical data or the comparison of Digital Surface Models (DSMs) to detect changes in the mean forest height. By using Convolutional Neural Networks (CNNs) in conjunction with GIS, we aim at completely streamlining the detection and mapping process.

We developed and tested different CNNs for rapid windthrow detection based on Planet data, which are rapidly available after a storm event, and on airborne data to increase accuracy after this first assessment. The study area is in Bavaria (ca. 165 square km) and data were provided by the agency for forestry (LWF). A U-Net architecture was compared to other approaches using transfer learning (e.g. VGG32) to find the best-performing architecture for the task on both datasets. U-Net was originally developed for medical image segmentation and has proven to be very powerful for other classification tasks.

Preliminary results highlight the potential of Deep Learning algorithms to detect damaged areas with accuracies of over 91% on airborne data and 92% on Planet data. The proposed workflow with complete integration into ArcGIS is well-suited for rapid first assessments after a storm event that allows for better planning of the flight campaign, and first management tasks followed by detailed mapping in a second stage.

How to cite: Brandmeier, M., Deigele, W., Hamdi, Z., and Straub, C.: Synergetic use of Planet data and high-resolution aerial images for windthrow detection based on Deep Learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-31, https://doi.org/10.5194/egusphere-egu2020-31, 2020.

D2379 |
Leonardo Santos, Emerson Silva, Cíntia Freitas, and Roberta Bacelar

Empirical models have been applied in several works in the literature on hydrological modeling. However, the propagation of uncertainties in those models remains an open question. In this paper, we developed an empirical hydrological model under a Machine Learning approach. Using the Keras interface and the TensorFlow library, we trained and tested a Multilayer Perceptron. Our case study was conducted with data from the city of Nova Friburgo, in the mountainous region of Rio de Janeiro, Brazil. Precipitation and river level data were obtained from 5 hydrological stations (in situ), with a resolution of 15 minutes for 2 years. To quantify the propagation of uncertainties, we applied a stochastic perturbation to the input data, following an a priori defined probability distribution, and compared the statistical moments of this distribution with the statistical moments of the output distribution. Based on the proposed accuracy and precision indices, it is possible to conclude, from our case study, that the accuracy is higher but the precision is lower for a uniformly distributed stochastic perturbation when compared to an equivalent triangular distribution case.
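The perturbation experiment described above can be sketched as a small Monte Carlo loop. This is only an illustration under stated assumptions: `toy_model` is a hypothetical stand-in for the trained Multilayer Perceptron, "equivalent" distributions are taken here to mean uniform and symmetric triangular noise on the same interval, and the abstract's accuracy and precision indices are not reproduced because they are not defined in the text.

```python
import random
import statistics


def toy_model(rain_mm):
    """Hypothetical stand-in for the trained network: river level (m) from rainfall (mm)."""
    return 0.5 + 0.02 * rain_mm + 0.0005 * rain_mm ** 2


def uniform_noise(rng, half_width=5.0):
    """Uniform perturbation on [-half_width, half_width]."""
    return rng.uniform(-half_width, half_width)


def triangular_noise(rng, half_width=5.0):
    """Symmetric triangular perturbation on the same interval, mode at 0."""
    return rng.triangular(-half_width, half_width, 0.0)


def propagate(model, x0, sampler, n=20000, seed=0):
    """Perturb the input x0 n times and return ((mean, stdev) of the inputs,
    (mean, stdev) of the model outputs)."""
    rng = random.Random(seed)
    inputs = [x0 + sampler(rng) for _ in range(n)]
    outputs = [model(x) for x in inputs]
    input_moments = (statistics.fmean(inputs), statistics.stdev(inputs))
    output_moments = (statistics.fmean(outputs), statistics.stdev(outputs))
    return input_moments, output_moments


if __name__ == "__main__":
    for name, sampler in [("uniform", uniform_noise), ("triangular", triangular_noise)]:
        moments_in, moments_out = propagate(toy_model, 40.0, sampler)
        print(name, "input:", moments_in, "output:", moments_out)
```

Comparing the first two moments of the input and output samples for each noise type is the simplest version of the comparison; higher moments could be added in the same way.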

How to cite: Santos, L., Silva, E., Freitas, C., and Bacelar, R.: Uncertainties propagation in a hydrological empirical model, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-1894, https://doi.org/10.5194/egusphere-egu2020-1894, 2020.

D2380 |
Zhifeng Wu, Qifei Zhang, Yinbiao Chen, and Paolo Tarolli

Under the combined effects of climate change and rapid urbanization, low-lying coastal cities are vulnerable to urban waterlogging events. Urban waterlogging refers to the accumulation of water that occurs when rainwater cannot be discharged through the drainage system in time, and it is affected by both natural conditions and human activities. Given the spatial heterogeneity of the urban landscape and the non-linear interaction between influencing factors, in this work we propose a novel approach to characterize urban waterlogging variation in highly urbanized areas by implementing a watershed-based Stepwise Cluster Analysis Model (SCAM), which considers both natural and anthropogenic variables (i.e. topographic factors, cumulated precipitation, land surface characteristics, drainage density, and GDP). The watershed-based stepwise cluster analysis model is based on the theory of multivariate analysis of variance and can effectively capture the non-stationary and complex relationship between urban waterlogging and natural and anthropogenic factors. The watershed-based analysis overcomes the shortcomings of the negative sample selection method employed in previous studies, which greatly improves the model reliability and accuracy. Furthermore, different land-use scenarios (the proportion of impervious surfaces remains unchanged or increases by 5% or 10%) and rainfall scenarios (accumulated precipitation increases by 5%, 10%, 20%, or 50%) are adopted to simulate the waterlogging density variation and thus to identify future urban waterlogging-prone areas. As a case study, we consider waterlogging events from 2009 to 2015 in the central urban districts of Guangzhou (China), a highly urbanized coastal city.
The results demonstrate that: (1) the SCAM shows a high degree of fitting and prediction capability in both the calibration and validation data sets, illustrating that it can successfully be used to reveal the complex mechanisms linking urban waterlogging to natural and anthropogenic factors; (2) the SCAM provides more accurate and detailed simulated results than other machine learning models (LR, ANN, SVM), reflecting the occurrence and distribution of urban waterlogging events more realistically and in more detail; (3) under different urbanization and precipitation scenarios, urban waterlogging density and urban waterlogging-prone areas vary greatly, and thus strategies should be developed to cope with different future scenarios. Although heavy precipitation can increase the occurrence of urban waterlogging, urban expansion characterized by an increase in impervious surface abundance was the dominant cause of urban waterlogging in the study area. This study extends our scientific understanding and provides theoretical and practical references for developing waterlogging management strategies and for promoting the further application of the stepwise cluster analysis model in the assessment and simulation of urban waterlogging variation.

How to cite: Wu, Z., Zhang, Q., Chen, Y., and Tarolli, P.: Characterizing the urban waterlogging variation in highly urbanized coastal cities: A watershed-based stepwise cluster analysis model approach, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3847, https://doi.org/10.5194/egusphere-egu2020-3847, 2020.

D2381 |
| Highlight
Na Dong, Craig Robson, Stuart Barr, and Richard Dawson

Reliable transportation infrastructure is crucial to ensure the mobility, safety and economy of urban areas. Flooding in urban environments can disrupt the flow of people, goods, services and emergency responders as a result of disruption or damage to transport systems. Pervasive sensors for urban monitoring and traffic surveillance, coupled with big data analytics, provide new opportunities for managing the impacts of urban flooding through intelligent traffic management systems in real-time.

A framework has been developed to assess the effect of urban surface water on road network traffic movements, accounting for real-time traffic conditions and changes in road capacity under flood conditions. Through this framework, inferred future traffic disruptions and short-term congestion, along with their spatiotemporal propagation, can be provided to assist flood risk warning and safety guidance. Within this framework, flood modelling results from the HiPIMS 2D hydrodynamic model and traffic predictions from machine learning are integrated to enable improved traffic forecasting that accounts for surface water conditions. Information from 130 traffic counters and 46 CCTV cameras distributed over Newcastle upon Tyne (UK) is employed, including location, historical traffic flow, and imagery.

Figure 1 shows a flowchart of the traffic routing system. Congestion is evaluated on the basis of the level of service (LOS) value, which is a function of both free flow speed and actual traffic density and provides a quantitative measure of the quality of vehicle traffic service. Surface water results in decreased driving speeds, which can in turn cause a sudden increase of traffic density near the flooded road and queuing in connected roads. A relationship among flood depth, free flow speed, flow rate and density has been constructed to examine how the density curve varies over the whole process along with the surface flood dynamics. Based on the new speed-flow model and the degree of congestion, an updated road network can be obtained using geometric calculation and network analysis. Finally, flooded traffic flows are rerouted by a shortest path calculation that takes into account the origin-destination pairs and the changes in road capacity and vehicle speeds. A case study under a flood event similar to the one on June 28th 2012, which has a return period of 100 years, is demonstrated for Newcastle upon Tyne (UK).

Figure 1. Flowchart of the integrated traffic routing model that accounts for surface water flooding.
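The final rerouting step can be illustrated with a minimal sketch that reduces link speeds according to flood depth and then runs Dijkstra's shortest-path algorithm on travel time. The linear depth-speed relation, the 0.3 m closure depth, the toy road network and all function names are assumptions for illustration only; the abstract's own speed-flow model and LOS evaluation are not reproduced here.

```python
import heapq


def flooded_speed(free_flow_kmh, depth_m, max_depth_m=0.3):
    """Hypothetical depth-speed relation: speed falls linearly with flood
    depth and the road is closed (speed 0) at max_depth_m or deeper."""
    if depth_m >= max_depth_m:
        return 0.0
    return free_flow_kmh * (1.0 - depth_m / max_depth_m)


def fastest_route(roads, flood_depth, origin, destination):
    """Dijkstra on travel time in hours.

    roads: {(u, v): (length_km, free_flow_kmh)}, treated as two-way links.
    flood_depth: {(u, v): depth_m} for flooded links (others are dry).
    Returns (path, hours) or (None, inf) if the destination is unreachable.
    """
    graph = {}
    for (u, v), (length_km, speed) in roads.items():
        speed_now = flooded_speed(speed, flood_depth.get((u, v), 0.0))
        if speed_now <= 0.0:
            continue  # impassable link is dropped from the network
        hours = length_km / speed_now
        graph.setdefault(u, []).append((v, hours))
        graph.setdefault(v, []).append((u, hours))

    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == destination:
            break
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, hours in graph.get(node, []):
            candidate = elapsed + hours
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]


if __name__ == "__main__":
    roads = {("A", "B"): (2.0, 50.0), ("B", "D"): (2.0, 50.0),
             ("A", "C"): (3.0, 50.0), ("C", "D"): (3.0, 50.0)}
    print(fastest_route(roads, {}, "A", "D"))
    print(fastest_route(roads, {("B", "D"): 0.24}, "A", "D"))
```

In the toy network the dry route runs through B, while 0.24 m of water on the B-D link makes the longer route through C faster.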

How to cite: Dong, N., Robson, C., Barr, S., and Dawson, R.: A Real-time Traffic Routing Framework for Flood Risk Management Using Live Urban Observation Data, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5194, https://doi.org/10.5194/egusphere-egu2020-5194, 2020.

D2382 |
Kangling Lin, Hua Chen, Chong-Yu Xu, Yanlai Zhou, and Shenglian Guo

With the recent rapid growth of deep learning, artificial neural networks have been propelled to the forefront of flood forecasting through their end-to-end learning ability. The encoder-decoder architecture, a novel deep feature extraction approach that captures the inherent relationships in the data involved, has recently emerged in time sequence forecasting. With its advances in sequence-to-sequence learning, it has been applied in many fields, such as machine translation, energy and environment. However, it is seldom used in hydrological modelling. In this study, a new neural network based on the encoder-decoder architecture is developed to forecast floods. Two deep learning methods, the Long Short-Term Memory (LSTM) network and the Temporal Convolutional Network (TCN), are each selected as the encoder, while the LSTM is also chosen as the decoder. The results are compared with those from a standard LSTM without the encoder-decoder architecture.

These models were trained and tested using hourly flood event data from 2009 to 2015 in the Jianxi basin, China. The results indicate that the new neural flood forecasting networks based on encoder-decoder architectures generally perform better than the standard LSTM, since they show better goodness-of-fit between forecasted and observed floods and perform well in a multi-index assessment. The TCN as an encoder has better model stability and accuracy than the LSTM as an encoder, especially for longer forecast periods and larger floods. The study results also show that the encoder-decoder architecture can be used as an effective deep learning solution in flood forecasting.

How to cite: Lin, K., Chen, H., Xu, C.-Y., Zhou, Y., and Guo, S.: A novel artificial neural network for flood forecasting based on deep learning encoder-decoder architecture, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-6277, https://doi.org/10.5194/egusphere-egu2020-6277, 2020.

D2383 |
Joohyung Lee, Hanbeen Kim, Taereem Kim, and Jun-Haeng Heo

Regional frequency analysis (RFA) is used to improve the accuracy of quantiles at sites where the observed data are insufficient. Thanks to technological development, complex computations on huge data sets are now possible on an ordinary personal computer. Therefore, machine learning methods have been widely applied in many disciplines, including hydrology, and many previous studies have applied them to RFA. The main purpose of this study is to apply the artificial neural network (ANN) model to RFA and to measure the performance of RFA based on the ANN model. For the homogeneous region in the Han River basin, rainfall gauging sites are divided into training and testing groups. The training group consists of sites where the record length is more than 30 years. The testing group contains sites where the record length ranges from 10 to 30 years. Various hydro-meteorological variables are used as the input layer, and the parameters of the generalized extreme value (GEV) distribution for annual maximum rainfall data are used as the output layer of the ANN model. Then, the root mean square error (RMSE) values between the predicted and observed quantiles are calculated. To evaluate the model performance, the RMSEs of the quantiles estimated by the ANN model are compared with those of the index flood model.
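The evaluation step, turning GEV parameters into rainfall quantiles and scoring them with the RMSE, can be sketched as follows. The sign convention for the shape parameter (positive shape means a heavy upper tail), the function names and the example parameter values are assumptions; the abstract does not state which convention or return periods were used, and some hydrological texts use the opposite sign for the shape parameter.

```python
import math


def gev_quantile(location, scale, shape, return_period):
    """Quantile of the GEV distribution for a return period in years.

    Assumed convention: F(x) = exp(-(1 + shape * (x - location) / scale) ** (-1 / shape)),
    with shape == 0 treated as the Gumbel limit F(x) = exp(-exp(-(x - location) / scale)).
    """
    non_exceedance = 1.0 - 1.0 / return_period
    y = -math.log(non_exceedance)
    if abs(shape) < 1e-12:
        return location - scale * math.log(y)
    return location + scale / shape * (y ** (-shape) - 1.0)


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    return_periods = [10, 50, 100]
    # Illustrative parameter sets only: one "predicted" and one "at-site" fit.
    predicted = [gev_quantile(100.0, 30.0, 0.10, t) for t in return_periods]
    observed = [gev_quantile(105.0, 28.0, 0.12, t) for t in return_periods]
    print(predicted, observed, rmse(predicted, observed))
```

The same `rmse` call applied to quantiles from the index flood model would give the comparison value mentioned in the abstract.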

How to cite: Lee, J., Kim, H., Kim, T., and Heo, J.-H.: Application of artificial neural network model for regional frequency analysis at Han River basin, South Korea, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-6419, https://doi.org/10.5194/egusphere-egu2020-6419, 2020.

D2384 |
Gáspár Albert and Dávid Gerzsenyi

The morphology of the Gerecse Hills bears the imprints of fluvial terraces of the Danube River, Neogene tectonism and Quaternary erosion. The solid bedrocks are composed of Mesozoic and Paleogene limestones, marls, and sandstones, and are covered by 115 m thick layers of unconsolidated Quaternary fluvial, lacustrine, and aeolian sediments. Hillslopes, stream valleys, and loessy riverside bluffs are prone to landslides, which have caused serious damage in inhabited and agricultural areas in the past. Attempts to map these landslides have been made, and the observations have been documented in the National Landslide Cadastre (NLC) inventory since the 1970s. These records are sporadic, concentrating only on certain locations, and they often refer inaccurately to the state and extent of the landslides. The aim of the present study was to complete and correct the landslide inventory by using quantitative modelling. In the 480 sq. km study area, all records of the inventory were revisited and corrected. Using objective criteria, the renewed records and additional sample locations were sorted into one of the following morphological categories: scarps, debris, transitional area, stable accumulation areas, stable hilltops, and stable slopes. The categorized map of these observations served as training data for the random forest classification (RFC).

Random forest is a powerful tool for multivariate classification that uses several decision trees. In our case, the predictions were made for each pixel of medium-resolution (~10 m) rasters. The predictor variables of the decision trees were morphometric and geological indices. The terrain indices were derived from the MERIT DEM with SAGA routines, and the categorized geological data are from a medium-scale geological map [1]. The predictor variables were packed in a multi-band raster and the RFC method was executed using R 3.5 with RStudio.

After testing several combinations of the predictor variables and two different categorisations of the geological data, the best prediction has ca. 80% accuracy. The validation of the model is based on the rate of well-predicted pixels relative to the total cell count of the training data. The results showed that the probable location of landslide-prone slopes is not restricted to the areas recorded in the National Landslide Cadastre inventory. Based on the model, only ~6% of the estimated location of the highly unstable slopes (scarps) falls within the NLC polygons in the study area.

The project was partly supported by the Thematic Excellence Program, Industry and Digitization Subprogram, NRDI Office, project no. ED_18-1-2019-0030 (from the part of G. Albert) and the ÚNKP-19-3 New National Excellence Program of the Ministry for Innovation and Technology (from the part of D. Gerzsenyi).


[1] Gyalog L., and Síkhegyi F., eds. Geological map of Hungary (scale: 1:100 000). Budapest, Hungary, Geological Institute of Hungary, 2005.

How to cite: Albert, G. and Gerzsenyi, D.: Random forest classification of morphology in the northern Gerecse (Hungary) to predict landslide-prone slopes, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8365, https://doi.org/10.5194/egusphere-egu2020-8365, 2020.

D2385 |
| Highlight
Cristina Vrinceanu, Stuart Marsh, and Stephen Grebby

Radar imagery, and specifically SAR imagery, is the preferred data type for the detection and delineation of oil slicks formed following the discharge of oil through human activities or natural occurrences. The contrast between the dark oil surfaces, characterized by a low backscatter return, and the rough, bright sea surface with higher backscatter has been exploited for decades in studies and operational processes.

Despite the semi-automatic nature of the traditional detection approaches, the workflow has always included the expertise of a trained human operator, for validating the results and efficiently discriminating between oil stained surfaces and other ocean phenomena that can produce a similar effect on SAR imagery (e.g., algal blooms, greasy ice). Thus, the process is time and resource consuming, while results are highly subjective. Therefore, automating the process to reduce the time for processing and analysis while producing consistent results is the ultimate goal.

Addressing this challenge, a new algorithm is proposed in this presentation. Building on state-of-the-art methods, the algorithm makes use of the latest technological developments for processing and analyzing features on the ocean surface using a synergistic approach combining SAR, optical and ancillary datasets. 

This presentation will focus on the results that have been obtained by ingesting high-resolution open SAR data delivered by the Copernicus Sentinel-1 satellites into the algorithm. This represents a significant advancement over traditional approaches both in terms of utilizing contemporary SAR mission imagery instead of that from the heritage missions (ERS, ENVISAT), and in deploying both conventional classification and artificial intelligence techniques (e.g. CNNs).

The scope of this study also involves highlighting the strengths and shortcomings of each type of technique in relation to the scenario to help make recommendations on the appropriate algorithm to utilize. The full architecture of the SAR component of the algorithm will be detailed, while the case study results over a set of known seepage sites and potential candidate sites will be presented, demonstrating the reliability of this new method.

How to cite: Vrinceanu, C., Marsh, S., and Grebby, S.: Towards an automatic algorithm for natural oil slicks delineation using Copernicus Sentinel-1 imagery, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-9257, https://doi.org/10.5194/egusphere-egu2020-9257, 2020.

D2386 |
Sophie Giffard-Roisin, Saumya Sinha, Fatima Karbou, Michael Deschatres, Anna Karas, Nicolas Eckert, Cécile Coléou, and Claire Monteleoni

Achieving reliable observations of avalanche debris is crucial for many applications including avalanche forecasting. The ability to continuously monitor the avalanche activity, in space and time, would provide indicators on the potential instability of the snowpack and would allow a better characterization of avalanche risk periods and zones. In this work, we use Sentinel-1 SAR (synthetic aperture radar) data and an independent in-situ avalanche inventory (as ground truth labels) to automatically detect avalanche debris in the French Alps during the remarkable winter season 2017-18. 

Two main challenges are specific to these data: (i) the imbalance of the data, with a small number of positive (avalanche) samples, and (ii) the uncertainty of the labels, which come from a separate in-situ inventory. To tackle these issues, we propose to compare two different deep learning methods on SAR image patches: a fully supervised convolutional neural network model and an unsupervised approach able to detect anomalies based on a variational autoencoder. Our preliminary results show that we are able to successfully locate new avalanche deposits with as much as 77% confidence on the most susceptible mountain zone (compared to 53% with a baseline method) on a balanced dataset.

In order to make an efficient use of remote sensing measurements on a complex terrain, we explore the following question: to what extent can deep learning methods improve the detection of avalanche deposits and help us to derive relevant avalanche activity statistics at different scales (in time and space) that could be useful for a large number of users (researchers, forecasters, government operators)?

How to cite: Giffard-Roisin, S., Sinha, S., Karbou, F., Deschatres, M., Karas, A., Eckert, N., Coléou, C., and Monteleoni, C.: Detecting avalanche debris from SAR imaging: a comparison of convolutional neural networks and variational autoencoders, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-9487, https://doi.org/10.5194/egusphere-egu2020-9487, 2020.

D2387 |
Matthias Hort, Daniel Uhle, Fabio Venegas, Lea Scharff, Jan Walda, and Geoffroy Avard

Immediate detection of volcanic eruptions is essential when trying to mitigate the impact on the health of people living in the vicinity of a volcano or the impact on infrastructure and aviation. Eruption detection is most often done by either visual observation or the analysis of acoustic data. While visual observation is often difficult due to environmental conditions, infrasound data usually provide the onset of an event. Doppler radar data, although not available for many volcanoes, additionally provide information on the dynamics of the eruption and the amount of material released. Eruptions can be easily detected in the data by visual analysis, and here we present a neural network approach for the automatic detection of eruptions in Doppler radar data. We use data recorded at Colima volcano in Mexico in 2014/2015 and a data set recorded at Turrialba volcano between 2017 and 2019. In a first step we picked eruptions, rain and typical noise in both data sets, which were then used for training two networks (training data set) and testing the performance of the networks using a separate test data set. The accuracy for classifying the different types of signals was between 95 and 98% for both data sets, which we consider quite successful. In the case of the Turrialba data set, eruptions were picked based on observations of OVSICORI data. When classifying the complete Turrialba data set using the trained network, an additional 40 eruptions were found that were not in the OVSICORI catalogue.

In most cases data from the instruments are transmitted to an observatory by radio, so the amount of data available is an issue. We therefore tested how far the data could be reduced while still allowing an eruption to be detected successfully. We also kept the network as small as possible so that it can ideally run on a small computer (e.g. a Raspberry Pi architecture) for eruption detection on site, so that only the information that an eruption has been detected needs to be transmitted.

How to cite: Hort, M., Uhle, D., Venegas, F., Scharff, L., Walda, J., and Avard, G.: Automatic detection of volcanic eruptions in Doppler radar observations using a neural network approach, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-11123, https://doi.org/10.5194/egusphere-egu2020-11123, 2020.

D2388 |
Mohammad Mehdi Bateni, Mario Martina, and Luigi Cesarini

The field of information theory, originally developed within the context of communication engineering, deals with the quantification of the information present in a realization of a stochastic process. Mutual information is a measure of mutual dependence between two variables and can be determined from marginal and joint entropies. It is an efficient tool to investigate linear or non-linear dependencies. In this research, some transformed variables, each based on rainfall data from different datasets in the Dominican Republic, are adopted in Neural Network and SVM models to classify flood/no-flood events. A selection procedure is used to select skillful inputs to the flood detection model. The relationship between the flood/no-flood output datasets and each predictor (relevance) and also among predictors (redundancy) was assessed based on the mutual information metric. Minimum redundancy between predictors and maximum relevance to the predictand are targeted in order to choose a set of appropriate predictors.
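The selection idea above can be sketched for discrete variables: mutual information is computed from marginal and joint entropies, and predictors are then picked greedily by relevance minus mean redundancy. The greedy difference criterion, the function names and the toy data are assumptions for illustration; the abstract does not specify the exact selection rule, and continuous rainfall values would first need to be binned.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete (hashable) values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equally long discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def select_predictors(predictors, target, k):
    """Greedy minimum-redundancy maximum-relevance selection (one common variant).

    predictors: {name: discrete sequence}. At each step the candidate with the
    largest I(candidate; target) minus its mean mutual information with the
    already selected predictors is added.
    """
    selected = []
    remaining = dict(predictors)

    def score(name):
        relevance = mutual_information(remaining[name], target)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(remaining[name], predictors[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected


if __name__ == "__main__":
    # Toy binary data: two identical rainfall indicators and one independent one.
    flood = [0, 0, 0, 0, 0, 1, 1, 1]
    predictors = {
        "rain_a": [0, 0, 0, 0, 1, 1, 1, 1],
        "rain_a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
        "rain_b": [0, 1, 0, 1, 0, 1, 0, 1],
    }
    print(select_predictors(predictors, flood, 2))
```

In the toy data the duplicate predictor is highly relevant but fully redundant, so the second pick falls on the weaker but independent predictor.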

How to cite: Bateni, M. M., Martina, M., and Cesarini, L.: Predictor dataset selection method for construction of ML-based Models in flood detection using mutual information, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19060, https://doi.org/10.5194/egusphere-egu2020-19060, 2020.

D2389 |
Pierre Lepetit, Cécile Mallet, and Laurent Barthes

Road traffic is highly sensitive to weather conditions. Accumulation of snow on the road can cause serious safety problems. But monitoring road conditions is as hard as it is critical: in mid-latitude countries, the spatial variability of snowfall is high on the one hand, and accurate characterization of snow accumulation mainly relies on costly sensors on the other.

In recent decades, webcams have become ubiquitous along the road network. The quality of these webcams is variable but even low-resolution images capture information about the extent and the thickness of the snow layer. Their images are also currently used by forecasters to refine their analysis. The automatic extraction of relevant meteorological information is hence very useful.

Recently, generic and efficient computer vision methods have emerged. Their application to image-based weather estimation has become an attractive field of research. However, the scope of existing work is generally limited to high-resolution images from one or a few cameras.

In this study, we show that, with a moderate labelling effort, recent Machine Learning approaches allow us to predict quantitative indices of snow depth for a large variety of webcam settings and illumination conditions.

Our approach is based on two datasets. The smaller one contains about 2,000 images from ten webcams that were set up near sensors devoted to snow depth measurements.

The larger one contains 20,000 images from 200 cameras of the AMOS dataset. Standard meteorological rules for human observation and the specifics of the webcams have been taken into account to manually label each image. These labels cover not only the thickness and extent of the snow layer but also the precipitation (rain or snow, presence of streaks), the optical range and the foreground noise. Both datasets contain night images (45%) and at least 15% of images corrupted by foreground noise (dirt, droplets, and snowflakes on the lens).


The labels of the AMOS subset allowed us to train ranking models for snow depth and visibility in a multi-task setting. The models are then calibrated on the smaller dataset. We tested several versions, built from pre-trained CNNs (ResNet152, DenseNet161, and VGG16).

Results are promising with up to 85% accuracy for comparison tasks, but a 10% decrease can be observed when the test webcams have not been used during the training phase.
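
The abstract does not define how accuracy on comparison tasks is computed. One plausible reading, sketched here purely as an assumption, is the fraction of image pairs whose predicted ordering agrees with the ordering of their labels. The function name and the toy scores and labels below are hypothetical.

```python
from itertools import combinations


def pairwise_accuracy(scores, labels):
    """Fraction of pairs with different labels whose predicted scores
    are ordered the same way as the labels."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # equal labels carry no ordering information
        total += 1
        if (scores[i] - scores[j]) * (labels[i] - labels[j]) > 0:
            correct += 1
    return correct / total if total else float("nan")


# Hypothetical example: four images with ordinal snow depth labels
# and model scores in which the second and third images are swapped.
toy_labels = [0, 1, 2, 3]
toy_scores = [0.1, 0.4, 0.3, 0.9]
toy_accuracy = pairwise_accuracy(toy_scores, toy_labels)
print(toy_accuracy)
```

Under this reading, evaluating separately on pairs drawn from webcams held out of training would expose the kind of performance drop reported above.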

A case study based on a widespread snow event over the French territory will be presented. We will show the potential of our method through a comparison with operational model forecasts.

How to cite: Lepetit, P., Mallet, C., and Barthes, L.: Deep Learning for image based weather estimation: a focus on the snow, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-19802, https://doi.org/10.5194/egusphere-egu2020-19802, 2020.

D2390 |
Lie Sun, Le Wu, Fei Xu, and ZhanLong Song

The inability of machines to understand and judge semantic knowledge in emergency response decision-making for marine environment safety is one of the difficulties in intelligent emergency response to marine disasters. Taking advantage of knowledge graphs for semantic search and intelligent recommendation is an important goal in constructing a marine environment safety knowledge base. We summarize the knowledge representation method based on knowledge graphs, analyze the characteristics and difficulties of knowledge representation for emergency decision-making in marine environment safety, construct the knowledge system of the marine environment safety knowledge base, and propose an approach to constructing this knowledge base using knowledge graphs.

How to cite: Sun, L., Wu, L., Xu, F., and Song, Z.: Discussion on The Construction Technology of Marine Environment Safety Knowledge Based on Knowledge Graphs, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-20991, https://doi.org/10.5194/egusphere-egu2020-20991, 2020.

D2391 |
Jorge Duarte, Pierre E. Kirstetter, Manabendra Saharia, Jonathan J. Gourley, Humberto Vergara, and Charles D. Nicholson

Predicting flash floods and their impacts at short time scales is of vital interest to forecasters, emergency managers and community members alike. In particular, characteristics such as location, timing, and duration are crucial to decision-making for the protection of lives, property and infrastructure. Even though these characteristics are primarily driven by the causative rainfall and basin geomorphology, untangling the complex interactions between precipitation and hydrological processes is challenging due to the lack of observational datasets that capture diverse conditions.

This work follows up on previous efforts to incorporate spatial rainfall moments as viable predictors of flash flood event characteristics such as lag time and the exceedance of flood stage thresholds at gauged locations over the Conterminous United States (CONUS). These variables were modeled by applying various supervised machine learning techniques to a database of flood events. The data included morphological, climatological, streamflow and precipitation data from over 21,000 flood-producing rainfall events that occurred over more than 900 different basins throughout the CONUS between 2002 and 2011. This dataset included basin parameters and indices derived from radar-based precipitation, which represented sub-basin scale rainfall spatial variability for each storm event. Both classification and regression models were constructed, and variable importance analysis was performed in order to determine the relevant factors reflecting hydrometeorological processes. In this iteration, a closer look at model performance consistency and variable selection aims to further explore the power of rainfall moments to explain flood characteristics.

How to cite: Duarte, J., Kirstetter, P. E., Saharia, M., Gourley, J. J., Vergara, H., and Nicholson, C. D.: Predicting flood responses from spatial rainfall variability and basin morphology through machine learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-22179, https://doi.org/10.5194/egusphere-egu2020-22179, 2020.