
Detection of untreated sewage discharges to watercourses using machine learning
P. Hammond, M. Suttie, et al.
This study by Peter Hammond, Michael Suttie, Vaughan T. Lewis, Ashley P. Smith, and Andrew C. Singer applies machine learning to daily effluent flow patterns to identify untreated sewage spills from wastewater treatment plants, reporting cross-validated AUCs of 0.96–0.97. The findings point to likely non-compliant effluent discharges at both plants studied and offer a practical tool for water management and regulatory oversight.
Introduction
The study addresses under-reporting of untreated sewage discharges from wastewater treatment plants (WWTPs) in England, where a substantial portion of incidents are reported by the public rather than by operators. This gap hampers effective regulation, investment, and enforcement. The authors propose applying machine learning (ML) to operational data that are already available, namely treated effluent flow patterns, event duration monitoring (EDM) data for storm tank overflows, rainfall, river flow/level, and WWTP telemetry alarms, to detect putative untreated sewage spills, including those unreported by operators. The objective is to train classifiers on known EDM-confirmed spill events so that similar flow perturbations can be identified in historical records, thereby improving oversight, compliance checking, and environmental protection.
Literature Review
The paper situates its contribution within regulatory oversight and environmental compliance monitoring. Prior AI applications have involved symbolic rule-based systems for legal/regulatory reasoning and more recent ML approaches for compliance prediction using large public datasets (e.g., US EPA ECHO). However, the authors note a lack of studies applying ML to WWTP effluent flow patterns combined with rainfall and telemetry data over extended time series for detecting untreated sewage discharges. This work fills that gap by leveraging EDM-validated spills for supervised learning and integrating corroborative data sources.
Methodology
- Study sites and data: Two anonymized WWTPs (WWTP1 population equivalent 7,594; WWTP2 47,000) in England, both recording treated effluent flow (15-min MCERTS data) from 2009–2020 (>8,000 daily patterns). EDM data (start/stop times for storm tank overflows) were available for 2018–2020 (~900 days). Telemetry alarms included Consented Overflow Level (COL) and Storm Tank Overflow (STO) logs. Public rainfall, river flow, and level data were obtained.
- Preprocessing: Daily flow patterns comprised 96 values (15-min averages). Days with substantial missing data were excluded. For days with EDM, spill intervals were aggregated to hours per day. Days with ≥3 h of EDM-recorded spilling were labeled 'spill'; <3 h labeled 'normal'. Days without EDM were 'unknown'.
- Shape analysis: Each daily flow curve (96 points) was converted to a triangulated 'ribbon' with 192 landmarks. Dense Surface Models (DSMs) were built per WWTP for 2016–2020, using PCA to capture 99% of the variance. The first mode (PCA1) captured the magnitude and temporal shift of diurnal peaks; the second (PCA2) captured spill-related flattening. Receiver operating characteristic (ROC) analysis using PCA2 alone yielded AUCs of 0.88 (WWTP1) and 0.91 (WWTP2).
- Supervised learning: Support Vector Machines (SVMs) with radial basis function kernels were trained on labeled data from 2018–2020, after the EDM devices were recommissioned in late 2018. Twenty SVM variations (five margin heuristics × four parameterizations) were evaluated using stratified 20-fold cross-validation. The classification features were PCA mode weights, and the number of retained modes was tuned. The optimal ensembles combined the three best-performing algorithms (median, Jaakkola, and Jaakkola-mean kernels) and retained 2 PCA modes for WWTP1 and 10 for WWTP2. A day was labeled 'spill' only if all three classifiers agreed. Cross-validated AUCs were 0.97 (WWTP1) and 0.96 (WWTP2).
- Verification and application: The optimal classifiers were first verified against the 2018–2020 data used for training, then applied semi-blinded to 2016–2018 (data used in the shape analysis but not in training) and fully blinded to 2009–2015.
- Corroboration: ML-detected spills were corroborated where possible using COL and STO alarms and contextualized with rainfall/river data. Agreement among EDM, COL, and ML was quantified using Cohen’s kappa; STO reliability varied by site and period.
- Additional analyses: Isolated and contiguous 24-h spills were detected and ranked using the standard deviation of the daily flow (a measure of flattening) and the mean flow relative to the storm overflow rate, in order to assess potential compliance with minimum pass-forward-to-treatment (PFF) conditions.
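The labelling rule, the unanimous-vote rule, and the flattening statistics described in the list above can be illustrated with a small Python sketch. This is not the authors' code: the function names, the example interval, and the storm overflow rate of 12.5 (arbitrary units) are assumptions made purely for illustration, and the classifiers themselves (DSM/PCA features and RBF-kernel SVMs) are not reproduced here.

```python
from datetime import date, datetime, timedelta
from statistics import mean, pstdev

# Threshold taken from the summary: >= 3 h of EDM-recorded spilling per day.
SPILL_HOURS_THRESHOLD = 3.0


def spill_hours_per_day(intervals):
    """Split (start, stop) EDM intervals at midnight and sum hours per calendar day."""
    hours = {}
    for start, stop in intervals:
        cursor = start
        while cursor < stop:
            next_midnight = datetime.combine(
                cursor.date() + timedelta(days=1), datetime.min.time()
            )
            segment_end = min(stop, next_midnight)
            day = cursor.date()
            hours[day] = hours.get(day, 0.0) + (segment_end - cursor).total_seconds() / 3600
            cursor = segment_end
    return hours


def label_day(day, hours, edm_days):
    """Return 'unknown' without EDM coverage, else 'spill' or 'normal' by the 3 h rule."""
    if day not in edm_days:
        return "unknown"
    return "spill" if hours.get(day, 0.0) >= SPILL_HOURS_THRESHOLD else "normal"


def unanimous_spill(votes):
    """A day counts as 'spill' only if every classifier in the ensemble says so."""
    return bool(votes) and all(v == "spill" for v in votes)


def flatness_summary(daily_flow, storm_overflow_rate):
    """Return (standard deviation, mean flow / storm overflow rate) for one day.

    A low standard deviation indicates a flattened curve; a ratio below 1
    means the mean effluent flow was under the storm overflow rate.
    """
    return pstdev(daily_flow), mean(daily_flow) / storm_overflow_rate


# Hypothetical example: one spill that runs across midnight.
example_intervals = [(datetime(2019, 1, 1, 22, 0), datetime(2019, 1, 2, 4, 0))]
example_hours = spill_hours_per_day(example_intervals)
example_edm_days = {date(2019, 1, 1), date(2019, 1, 2), date(2019, 1, 3)}

# Hypothetical fully flattened day: 96 identical 15-min averages.
flat_day = [10.0] * 96
flat_sd, flat_ratio = flatness_summary(flat_day, storm_overflow_rate=12.5)
```

Requiring unanimity trades some sensitivity for fewer false positives, which seems a reasonable choice when the output is used to flag possible permit breaches.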
Key Findings
- Classification performance: Cross-validated AUCs of optimal SVM classifiers were 0.97 (WWTP1) and 0.96 (WWTP2). Using PCA2 alone achieved AUC 0.88 (WWTP1) and 0.91 (WWTP2). On 2018–2020 EDM-labeled verification data, sensitivity/specificity were 0.91/0.95 (WWTP1) and 0.98/0.98 (WWTP2).
- Detected unreported spills: Among 7,160 days without operator-reported spills, 926 were classified as 'spill'. Semi-blinded (2016–2018) and fully blinded (2009–2015) applications identified numerous putative events.
- Corroboration: At WWTP1, ML agreed strongly with COL and EDM for Feb 2019–Feb 2020 (kappa 0.81–1.00, i.e. near-perfect agreement). For Dec 2018–Jan 2019, EDM and COL agreed (kappa 0.95) but STO did not (kappa ≤ 0). WWTP2 showed high agreement between COL and EDM (kappa 0.87) and with ML (kappa 0.78); STO showed only chance agreement (kappa < 0.1).
- Historical corroboration counts (2009–2018): ML-detected spills corroborated by alarms included 327 days at WWTP1 and 128 at WWTP2 (where alarm data were available), with COL generally reliable and STO unreliable at WWTP2.
- 24-h spills and series: ML detected >160 24-h spills at WWTP1 (105 corroborated) and ~200 at WWTP2, including long contiguous runs (>10 days) and a notable ~60-day near-continuous spill at WWTP2 (21/12/2013–22/02/2014). Many days within these series had rainfall ≤2 mm, indicating possible groundwater ingress. In Nov 2019, WWTP2 had ~30 days of near-continuous spilling associated with extensive sewage fungus.
- Compliance insights: WWTP1 showed many 24-h spills with mean effluent flow between 60% and 80% of the storm overflow rate, suggesting early (non-compliant) spilling over a period of more than 12 years. WWTP2’s most flattened days were generally at or above the storm overflow rate (closer to compliance), though two 24-h 'dry spills' in May 2012 occurred without rainfall, consistent with unpermitted groundwater ingress.
- Operator EDM reports (2019): WWTP1 spilled >1000 h over 72 days (mean 15 h/day; 21 ML-detected 24-h spills; contiguous runs 2–11 days). WWTP2 spilled >1390 h over 76 days (mean 18.3 h/day; 32 ML-detected 24-h spills; runs 2–14 days).
- Regulatory relevance: The analysis suggests both WWTPs made non-compliant discharges between 2009 and 2020 and that WWTP2 has likely experienced groundwater ingress driving prolonged spills for at least nine years.
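The performance and agreement figures above (AUC, sensitivity/specificity, Cohen's kappa) are standard statistics for binary day-level labels. The sketch below shows how each could be computed with plain Python, assuming days are encoded as 1 for 'spill' and 0 for 'normal'; the function names and the four-day toy series are hypothetical and are not data from the study.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equally long 0/1 series: (p_o - p_e) / (1 - p_e)."""
    if len(a) != len(b) or not a:
        raise ValueError("series must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # share of days the first source marks as spill
    p_b = sum(b) / n  # share of days the second source marks as spill
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity); assumes both classes occur in truth."""
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, truth):
    """Area under the ROC curve as the share of (spill, normal) pairs ranked correctly.

    Ties count as half a correct ranking. Assumes both classes occur in truth.
    """
    positives = [s for s, t in zip(scores, truth) if t == 1]
    negatives = [s for s, t in zip(scores, truth) if t == 0]
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy series: four days, EDM as reference, ML as prediction.
edm = [1, 1, 0, 0]
ml = [1, 0, 0, 0]
ml_scores = [0.9, 0.4, 0.6, 0.1]

toy_kappa = cohen_kappa(edm, ml)
toy_sens, toy_spec = sensitivity_specificity(edm, ml)
toy_auc = auc(ml_scores, edm)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because spill days are a minority of all days.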
Discussion
The ML approach effectively links characteristic flattening of daily effluent flow patterns to untreated sewage spills, enabling retrospective detection of unreported discharges. High classifier performance, coupled with strong agreement with COL and EDM alarms, demonstrates robustness. The findings reveal pervasive and sometimes prolonged spill events, including non-compliant early spills (below minimum pass-forward-to-treatment rates) and extended series under unexceptional rainfall, consistent with groundwater ingress. This has significant implications for regulatory oversight, environmental protection, and public health, given the potential ecological impacts (e.g., sewage fungus over kilometers of watercourse). The approach can support operators in diagnosing asset malfunctions and capacity issues, aid regulators in detecting permit non-compliance, and empower citizen science oversight. Broader application could help explain the persistently poor status of many surface water bodies and inform planning decisions that would otherwise put further pressure on underperforming catchments.
Conclusion
The study demonstrates a proof-of-principle ML framework that, trained on EDM-validated events, reliably detects unreported untreated sewage spills from WWTP effluent flow patterns. Applied over an 11-year period at two WWTPs, it identified 926 putative spill days, hundreds of day-long events, and prolonged contiguous series, with corroboration from telemetry alarms and contextual rainfall/river data. The work highlights likely non-compliance (early spilling) and suggests sustained groundwater ingress at one site. The authors advocate for real-time, open access to flow and alarm data to enhance professional and citizen oversight. Future work will extend ML to predictive detection using river quality parameters from multi-parameter sondes upstream/downstream of WWTPs and explore transferability/customization of models across WWTP types.
Limitations
- Data dependency: Reliable performance requires accurate, abundant EDM and flow data; initial EDM deployments were unreliable and recommissioned (Nov/Dec 2018), limiting training data.
- Transferability: Cross-plant classifier transfer performed poorly despite normalization, suggesting models may need customization per WWTP or per class of similar plants.
- Measurement gaps: Permits require minimum pass-forward-to-treatment but often do not require direct measurement/recording of treatment flow; effluent flow used as a proxy may limit compliance inference.
- Data access: Heterogeneous, protracted EIR processes and incomplete electronic records hinder timely data acquisition and standardization; STO alarms proved unreliable at times.
- Environmental confounders: Groundwater ingress effects can be temporally lagged relative to rainfall and geologically variable, complicating causal attribution using rainfall/river data alone.