Developing performance checks on machine-learning models in an automated system for developing hazard maps

Ashleigh Massam; Ashley Barnes; Siân Lane; Robert Platt; David Wood

doi:https://doi.org/10.5194/egusphere-egu2020-5372

Session ITS4.1/NP4.2

EGU2020-5372, updated on 22 Dec 2023

EGU General Assembly 2020

© Author(s) 2023. This work is distributed under the Creative Commons Attribution 4.0 License.

JBA Risk Management, Skipton, United Kingdom (ashleigh.massam@jbarisk.com)

JBA Risk Management (JBA) uses JFlow®, a two-dimensional hydraulic model, to simulate surface water, fluvial, and dam break flood risk. National flood maps are generated on a computer cluster that parallelises up to 20,000 model simulations, covering an area of up to 320,000 km² and creating up to 10 GB of data per day.

JBA uses machine-learning models to identify artefacts in the flood simulations. The ability of machine-learning models to quickly process and detect these artefacts, combined with the use of an automated control system, means that hydraulic modelling throughput can be maximised with little user intervention. However, continual retraining of the model and application of software updates introduce the risk of a significant decrease in performance. This necessitates the use of a system to monitor the performance of the machine-learning model to ensure that a sufficient level of quality is maintained, and to allow drops in quality to be investigated.

We present an approach used to develop performance checks on a machine-learning model that identifies artificial depth differences between hydraulic model simulations. Performance checks are centred on the use of control charts, an approach commonly used in manufacturing processes to monitor the proportion of items produced with defects. To develop this approach for a geoscientific context, JBA has (i) built a database of randomly sampled hydraulic model outputs currently totalling 200 GB of data; (ii) developed metrics to summarise key features across a modelled region, including geomorphology and hydrology; (iii) used a random forest regression model to identify feature dominance and so determine the most robust relationships that contribute to depth differences in the flood map; and (iv) developed the performance check in an automated system that tests every nth hydraulic modelling output against data sampled based on common features.
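As an illustration of the control-chart idea only, the following minimal sketch computes limits for a proportion chart (p-chart) and flags a sample whose defect proportion falls outside them. It is not JBA's implementation. The function names, the three-sigma limits, the sample sizes, and the definition of a "defect" (here, a sampled output on which the machine-learning model's artefact label disagrees with a trusted reference) are all assumptions made for this example.

```python
import math


def p_chart_limits(baseline_defects, baseline_total, sample_size, sigmas=3.0):
    """Return (lower, centre, upper) limits of a p-chart.

    The centre line is the defect proportion in the baseline data; the
    limits sit `sigmas` binomial standard errors either side of it and are
    clipped to the valid range [0, 1]. All names here are hypothetical.
    """
    if baseline_total <= 0 or sample_size <= 0:
        raise ValueError("baseline_total and sample_size must be positive")
    p_bar = baseline_defects / baseline_total
    spread = sigmas * math.sqrt(p_bar * (1.0 - p_bar) / sample_size)
    return max(0.0, p_bar - spread), p_bar, min(1.0, p_bar + spread)


def every_nth(items, n):
    """Select every nth item (the nth, 2nth, ...) from a sequence."""
    if n <= 0:
        raise ValueError("n must be positive")
    return items[n - 1::n]


def out_of_control(defects, sample_size, limits):
    """True if the sample's defect proportion lies outside the limits."""
    lower, _, upper = limits
    proportion = defects / sample_size
    return proportion < lower or proportion > upper


# Assumed baseline: 50 defects in 1,000 checked outputs, samples of 100.
limits = p_chart_limits(50, 1000, sample_size=100)
flag_high = out_of_control(12, 100, limits)   # 12% defects in a sample
flag_normal = out_of_control(8, 100, limits)  # 8% defects in a sample
checked_ids = every_nth(list(range(1, 11)), 5)
```

With these assumed numbers the centre line is 0.05 and the upper limit is roughly 0.115, so a sample with 12 defects in 100 would be flagged for investigation while one with 8 would not. In a stratified setting such as the one described above, separate limits could be kept for each group of outputs that share common features.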

The implementation of the performance checks allows JBA to assess potential changes in the quality of artificial feature identification following a training cycle in a development environment prior to release in a production environment.

How to cite: Massam, A., Barnes, A., Lane, S., Platt, R., and Wood, D.: Developing performance checks on machine-learning models in an automated system for developing hazard maps, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-5372, https://doi.org/10.5194/egusphere-egu2020-5372, 2020.

This abstract will not be presented.