- 1York University, Dahdaleh Institute for Global Health Research, Toronto, Canada (desantim@yorku.ca)
- 2York University, Lassonde School of Engineering, Department of Civil Engineering, Toronto, Canada
- 3Tufts University School of Engineering, Medford, Massachusetts, United States of America
- 4Oxfam International, Kyaka II, Uganda
- 5Public Health Department, MSF Amsterdam, The Netherlands
Unprecedented global population displacement in recent years has increased the burden of waterborne illnesses in refugee and internally displaced person (IDP) settlements. Unlike contexts where water is piped directly to the home, in urban-scale refugee and IDP settlements, water users manually collect water from public tapstands and transport it to their dwellings, where they store and use it over several hours. This creates the potential for recontamination, increasing waterborne illness risk. Humanitarian responders need to optimize water treatment to minimize waterborne illness risk at the household level. Quantitative microbial risk assessment (QMRA) has been used to assess health risk from drinking water in a variety of contexts. However, conventional QMRA approaches rely on pathogen enumeration data, which are too slow, expensive, and logistically challenging to collect to keep pace with rapid fluctuations in water quality (WQ) in humanitarian contexts.
We propose a novel hybrid machine learning (ML)-QMRA approach that links operational WQ data to QMRA using probabilistic ML models for responsive risk assessments. The ML-QMRA model uses a two-stage probabilistic ML approach: first, we forecast WQ from tapstand to household via a deep composite quantile regression neural network (DCQRNN); then, we link household WQ to E. coli data using a support vector quantile regression (SVQR) model. The predicted E. coli concentration becomes an input to a QMRA model designed according to WHO QMRA guidelines.
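To illustrate how quantile predictions of E. coli could feed the QMRA stage, the following is a minimal sketch and not the study's implementation. It assumes an approximate beta-Poisson dose-response form, which is one common choice in QMRA; the abstract does not state which dose-response model was used. The pathogenic fraction, daily intake, dose-response parameters, and predicted quantiles are all hypothetical placeholders.

```python
import math

# All numeric values below are illustrative placeholders, not values from the study.
PATHOGENIC_FRACTION = 0.08   # hypothetical share of E. coli that is pathogenic
DAILY_INTAKE_L = 1.0         # hypothetical volume of water consumed per person per day (L)
ALPHA, N50 = 0.16, 2.11e6    # hypothetical beta-Poisson parameters


def daily_dose(ecoli_cfu_per_100ml):
    """Convert an E. coli concentration (CFU/100 mL) into a daily pathogen dose."""
    cfu_per_litre = ecoli_cfu_per_100ml * 10.0
    return cfu_per_litre * PATHOGENIC_FRACTION * DAILY_INTAKE_L


def beta_poisson(dose, alpha=ALPHA, n50=N50):
    """Approximate beta-Poisson dose-response: infection probability for a given dose."""
    return 1.0 - (1.0 + dose * (2.0 ** (1.0 / alpha) - 1.0) / n50) ** (-alpha)


def risk_by_quantile(ecoli_quantiles):
    """Map each predicted E. coli quantile to a daily probability of infection."""
    return {tau: beta_poisson(daily_dose(c)) for tau, c in ecoli_quantiles.items()}


# Hypothetical quantile predictions (CFU/100 mL) standing in for the SVQR output.
predicted = {0.05: 0.0, 0.50: 2.0, 0.95: 40.0}
risks = risk_by_quantile(predicted)
for tau, p in sorted(risks.items()):
    print(f"quantile {tau:.2f}: daily infection probability {p:.2e}")
```

Carrying each quantile through the dose-response function, rather than a single point estimate, preserves the predictive uncertainty from the ML stages in the resulting risk estimate.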
We tested this ML-QMRA modelling approach using operational WQ data from the Kyaka II refugee settlement in Uganda to assess daily probabilities of infection for pathogenic E. coli and rotavirus. The ML-QMRA model forecasted a mean infection risk of 4.5 × 10⁻² for pathogenic E. coli and 0.19 × 10⁻⁴ for rotavirus. The model also determined that keeping the risk of infection from pathogenic E. coli within 5% of the minimum daily risk of infection required 0.8 mg/L of free residual chlorine (FRC) at the tapstand at a turbidity of 1 NTU. The FRC requirement increased with turbidity, up to 1.25 mg/L at a turbidity of 20 NTU. This water quality was also sufficient to manage rotavirus infection risk.
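The target-setting step can be pictured as a search over candidate tapstand FRC doses for the lowest dose whose predicted risk falls within a tolerance of the minimum achievable risk. The sketch below is an illustration under stated assumptions: it reads "within 5%" as a relative tolerance, and `toy_risk` is an invented curve standing in for the ML-QMRA model output, not data from Kyaka II.

```python
import math


def minimal_frc_target(frc_grid, risk_at, tolerance=0.05):
    """Return the lowest FRC whose risk is within `tolerance` (relative) of the grid minimum.

    `frc_grid` must be sorted in ascending order; `risk_at` maps FRC (mg/L) to infection risk.
    """
    risks = [risk_at(frc) for frc in frc_grid]
    threshold = min(risks) * (1.0 + tolerance)
    for frc, risk in zip(frc_grid, risks):
        if risk <= threshold:
            return frc
    return None  # only reachable if the grid is empty


def toy_risk(frc):
    """Hypothetical risk curve: a residual risk floor plus a term that decays with FRC."""
    return 0.04 + 0.5 * math.exp(-8.0 * frc)


# Candidate tapstand FRC doses from 0.00 to 2.00 mg/L in 0.05 mg/L steps.
grid = [i * 0.05 for i in range(41)]
target = minimal_frc_target(grid, toy_risk)
print(f"lowest FRC within 5% of minimum risk: {target:.2f} mg/L")
```

Repeating the search with a separate risk curve for each turbidity level would yield a turbidity-dependent FRC target of the kind reported above.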
Our study shows how hybridizing process-based QMRA health risk assessment with probabilistic ML models can integrate operational data to deliver risk assessments more rapidly than conventional approaches that rely on pathogen data. The ML-QMRA model also enables us to set multi-parameter water quality targets for routine monitoring data that are based on health risk, not arbitrary guidelines. The ML-QMRA approach also has applications beyond humanitarian settings, including urban water management, where it can make QMRA more responsive to rapid WQ fluctuations.
How to cite: De Santi, M., Ali, S. I., Khan, U. T., Brown, J. E., Heylen, C., String, G., Naliyongo, D., Ogira, V., Lantagne, D., Fesselet, J.-F., and Orbinski, J.: Rapid and responsive water quality risk assessment using a hybrid machine learning integrated quantitative microbial risk assessment model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3709, https://doi.org/10.5194/egusphere-egu25-3709, 2025.