
Flow virometry for water-quality assessment: protocol optimization for a model virus and automation of data analysis

H. R. Safford, M. M. Johnson, et al.

Hannah R. Safford, Melis M. Johnson, and Heather N. Bischel optimize a flow virometry protocol for detecting T4 bacteriophage and evaluate an automated data-analysis pipeline as an alternative to manual gating, with the aim of making microbial water-quality assessment in wastewater more reliable and efficient.

Introduction

Water reuse is increasingly essential to meeting water demand. While nonpotable and indirect potable reuse are established, direct potable reuse (DPR) faces concerns regarding feasibility, cost, safety, and societal acceptance. Enhanced monitoring of water-borne microorganisms is needed to support DPR. Flow cytometry (FCM), including its virus-focused application flow virometry (FVM), has emerged as a promising method for water-quality assessment, yet challenges remain, especially for reliably detecting and enumerating viruses in environmental samples. There are no widely accepted methods for differentiating viruses from similarly sized virus-like particles (VLPs), and absolute quantification is hindered by the absence of proper standards. The literature suggests that well-characterized, rigid, nonenveloped viruses could serve as standards, but consistent, optimized protocols are needed to enable interlaboratory comparison. FVM also remains insufficiently sensitive and accurate for many natural viral populations, especially smaller enteric viruses, indicating a need for advances in hardware and dyes. In the interim, improved protocols can expand the set of viruses detectable by FVM. Prior protocols (e.g., Brussaard et al.; Huang et al.) were developed through sequential (pipeline) optimization, in which factors are tuned one after another; this approach can miss interactions among factors and be inefficient. Additionally, FVM data are commonly analyzed via manual gating, which is time-consuming and subject to substantial interlaboratory variability. This study addresses two needs: (1) employing a fractional factorial design to optimize staining and sample preparation for FVM-based detection of T4 bacteriophage, an environmentally relevant and easy-to-use surrogate, and (2) testing density-based clustering (OPTICS) as an automated alternative to manual gating for analyzing complex FVM datasets. The combined approach is demonstrated on tertiary-treated wastewater augmented with viral surrogates.

Literature Review

The paper reviews key developments and gaps in applying FCM/FVM to water-quality monitoring. It highlights difficulties in distinguishing viruses from VLPs such as exosomes and microvesicles, and the lack of widely accepted differentiation or absolute quantification methods. Lippé (2018) emphasized the need for proper standards, recommending well-characterized nonenveloped viruses as potential FVM standards. Dlusskaya et al. (2021) reported that FVM is not yet sufficiently sensitive or accurate for most natural viral populations in wastewater, particularly small enteric viruses, indicating the need for improved hardware and dyes. Brussaard et al. developed and refined a nucleic-acid staining protocol for FVM that has been widely applied, but Huang et al. found it inadequate for clear virus/noise separation in reclaimed-water samples and proposed modifications. Prior protocol developments commonly used sequential (pipeline) optimization, which may miss interaction effects among factors. The paper also notes that FVM analyses typically rely on manual gating, which is time-consuming and yields high interlaboratory variability, motivating exploration of automated, objective clustering approaches such as OPTICS.

Methodology

Overview: The study comprises (i) optimization of an FVM staining/sample-preparation protocol for bacteriophage T4 using a fractional factorial experimental design and (ii) evaluation of density-based clustering (OPTICS) versus manual gating for mixed-target and environmental-spike datasets.

Phage materials: T4 bacteriophage (ATCC 11303-B4) with host Escherichia coli (ATCC 11303) were propagated from freeze-dried specimens. φ6 bacteriophage (HB104) with host Pseudomonas syringae were provided as stocks. Aliquots (hosts with 25% glycerol; phages) were stored at −80 °C. Purified, high-titer phage stocks were prepared per Bonilla et al. (2016). Negative control stocks were prepared identically without phage. Stock aliquots were prepared by 100× dilution in Milli-Q (MQ) water or Tris-EDTA (TE) buffer; subsets were fixed with 0.5% glutaraldehyde (15 min at 4 °C) and stored at −80 °C. Titers were assessed by plate-based culture and (RT-)qPCR.

Flow cytometer and general acquisition: FVM was performed on a NovoCyte 2070V (Agilent) with a 488 nm laser and autosampler. The FITC channel (green fluorescence) was collected at 530 ± 30 nm; FSC and SSC were also collected. For all runs, 10 µL of sample was analyzed at 5 µL/min with a FITC threshold of 800. Between samples, the instrument was flushed with 150 µL 1× NovoClean and then 150 µL MQ water at 120 µL/min. Instrument QC was performed at least monthly.

Staining reagents: SYBR Green I and SYBR Gold (ThermoFisher, 10,000× in DMSO) working stocks were prepared in TE and stored at −20 °C.

Optimization design (fractional factorial): A 2^(6−2) fractional factorial design (4 replicated rounds; randomized run order) tested six two-level factors: (1) stain type (SYBR Green I vs SYBR Gold), (2) diluent (MQ vs TE), (3) dye concentration (final: 5 × 10^−5 vs 1 × 10^−4 of commercial stock), (4) staining temperature (25 °C vs 50 °C), (5) staining time (1 min vs 15 min), and (6) glutaraldehyde addition (0 vs 0.5% final).
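To make the design concrete, the following is a minimal sketch of how a 2^(6−2) design yields 16 runs for six two-level factors, and how a main effect is conventionally estimated from such a design. The factor names, the −1/+1 coding, and the generators E = ABC and F = BCD are assumptions chosen for illustration (a common resolution IV choice); the summary does not state which generators or factor-to-column assignment the study used, and the responses below are synthetic.

```python
from itertools import product

# Hypothetical column names for the six factors listed in the text.
FACTORS = ["stain", "diluent", "dye_conc", "temperature", "time", "glutaraldehyde"]


def fractional_factorial_6_2():
    """Return the 16 runs of a 2^(6-2) design as dicts of -1/+1 levels."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c  # assumed generator E = ABC
        f = b * c * d  # assumed generator F = BCD
        runs.append(dict(zip(FACTORS, (a, b, c, d, e, f))))
    return runs


def main_effect(runs, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for r, y in zip(runs, responses) if r[factor] == 1]
    low = [y for r, y in zip(runs, responses) if r[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


design = fractional_factorial_6_2()

# Synthetic responses for illustration only: 110 when the glutaraldehyde
# column is high, 90 when it is low, with no other factor contributing.
responses = [100 + 10 * r["glutaraldehyde"] for r in design]

print(len(design))                                       # 16 runs
print(main_effect(design, responses, "glutaraldehyde"))  # 20.0
print(main_effect(design, responses, "stain"))           # 0.0
```

Because every column is balanced and orthogonal to the others, the synthetic glutaraldehyde effect does not leak into the estimate for any other factor. In practice a dedicated package (the study used FrF2 in R) also handles replication, randomization, and interaction aliasing.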
Factors were assigned to avoid confounding key main effects with likely interactions. For each experiment, thawed T4 stocks were further diluted 10× in the designated medium prior to staining; incubations after stain addition were in the dark; elevated temperatures used a water bath.

Data gating for optimization: Events were bounded at 0 ≤ SSC ≤ 1000 and 800 ≤ FITC ≤ 10,000. Unstained control event counts were subtracted from corresponding stained-sample counts. FlowJo (v10) was used to create pseudocolor plots and set FITC peak-based gates (“Create Gates on Peaks”). Target particle number, mean fluorescence intensity (MFI), and fluorescence coefficient of variation (CV) were computed. Statistical analysis of main and two-way interaction effects used the FrF2 package in RStudio (2021.09.01); analyses were conducted on (i) all events from all runs and (ii) target-only events from glutaraldehyde-treated runs.

Mixed-target experiment: A TE solution contained known concentrations of submicron targets: T4 and φ6 bacteriophages and fluorescent polystyrene beads (0.2 µm, 0.5 µm, 0.8 µm). T4 served as a detectable viral surrogate; φ6 represented faint/indeterminate VLP signals; beads provided uniform engineered standards. Phage stocks were treated per the optimized protocol but stained with SYBR Gold (φ6 is an RNA virus). Preparation: 20 µL T4 stock (10^−3), 20 µL φ6 stock (10^−3), 1 µL 0.2-µm bead suspension, 2 µL 0.5-µm bead suspension, and 15 µL PBS. Serial dilutions at 2×, 4×, 8×, and 16× were generated; 4 µL of 0.8-µm beads was added to each dilution at constant concentration. Ten replicates were analyzed per dilution.

Environmental-spike experiment: Tertiary-treated effluent from the UC Davis Wastewater Treatment Plant was 0.2-µm syringe-filtered, diluted 10× in MQ, and spiked with the T4/bead solution (same as mixed-target but without φ6 and without 0.5-µm beads). An unspiked (negative) solution was prepared identically. Ten replicates of spiked and unspiked solutions were analyzed.
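The rectangular bounds, background subtraction, and summary metrics used for the optimization runs can be sketched as follows. This is an illustration under stated assumptions, not the FlowJo workflow itself: events are hypothetical (SSC, FITC) pairs, CV is computed with the population standard deviation, and the conversion to events/mL assumes a total dilution of 1000× (100× stock dilution followed by 10× before staining), ignoring the small volume added by stain and fixative.

```python
from statistics import mean, pstdev

SSC_RANGE = (0, 1000)        # bounds stated in the text
FITC_RANGE = (800, 10_000)   # bounds stated in the text
ANALYZED_VOLUME_ML = 0.010   # 10 µL analyzed per run


def gate(events):
    """Keep (ssc, fitc) events that fall inside the rectangular bounds."""
    return [
        (ssc, fitc)
        for ssc, fitc in events
        if SSC_RANGE[0] <= ssc <= SSC_RANGE[1]
        and FITC_RANGE[0] <= fitc <= FITC_RANGE[1]
    ]


def summarize(stained, unstained):
    """Return (background-subtracted count, MFI, CV in percent).

    Assumes at least one stained event survives gating.
    """
    gated = gate(stained)
    fitc = [f for _, f in gated]
    mfi = mean(fitc)
    cv_percent = 100 * pstdev(fitc) / mfi  # population SD is an assumption
    net_count = len(gated) - len(gate(unstained))
    return net_count, mfi, cv_percent


def events_per_ml(net_count, dilution_factor):
    """Scale a count in the analyzed volume back to the undiluted stock."""
    return net_count / ANALYZED_VOLUME_ML * dilution_factor


# Hypothetical events: the last two stained events fall outside the bounds.
stained = [(100, 1000), (200, 2000), (300, 3000), (1500, 2000), (100, 500)]
unstained = [(50, 900), (50, 100)]

net, mfi, cv = summarize(stained, unstained)
print(net, mfi, round(cv, 1))    # 2 2000 40.8
print(events_per_ml(net, 1000))
```

With real data, a count of several tens of thousands of gated events in 10 µL at 1000× total dilution corresponds to roughly 10^9–10^10 events/mL in the stock.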
Data analysis approaches:

Manual gating (MG): Gates were defined using SSC vs FITC log–log pseudocolor density plots from the 1× mixed-target data and applied to all mixed-target and environmental-spike datasets.
OPTICS ordering + manual extraction (O:ME): FSC, SSC, and FITC were log-transformed, then standardized (mean-centered, unit variance). OPTICS (dbscan package in R; Euclidean distance) used k = 2 × dimensionality = 6; epsilon bounded at 0.1 to reduce computation time. Reachability plots were inspected in MATLAB (R2021a) for manual selection of clusters by identifying valleys separated by peaks.
OPTICS ordering + opticskxi automated extraction (O:kxi): opticskxi (R) iteratively identified steepness differences; maximum iterations = 1000; maximum clusters k = 6 (mixed-target) and k = 4 (environmental-spike). MinPts: for mixed-target, 8000 at 1× dilution and halved at each subsequent dilution; for environmental-spike, MinPts = 8000.

Additional analyses included OPTICS orderings using only SSC and FITC (excluding FSC) for environmental-spike data to assess dimensional weighting effects.

Data synthesis: For numerical comparison across methods in the mixed-target study, detected events were grouped into four buckets: (1) viruses (T4, φ6, VLPs), (2) 0.2-µm beads, (3) 0.5-µm beads (including doublets), and (4) 0.8-µm beads. Expected versus detected counts were compared across MG, O:ME, and O:kxi.
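The preprocessing applied before OPTICS (log transform, then standardization) and the MinPts halving schedule can be sketched in a few lines. This is an assumption-laden illustration rather than the study's R code: the log base is taken as 10 (the base does not affect the standardized result), scaling uses the population standard deviation, and the event tuples are hypothetical (FSC, SSC, FITC) values.

```python
import math
from statistics import mean, pstdev


def log_standardize(events):
    """Log10-transform each channel, then scale it to mean 0 and unit variance."""
    scaled_columns = []
    for column in zip(*events):
        logged = [math.log10(v) for v in column]  # assumes strictly positive values
        mu, sigma = mean(logged), pstdev(logged)
        scaled_columns.append([(x - mu) / sigma for x in logged])
    return [tuple(row) for row in zip(*scaled_columns)]


def minpts_schedule(start, dilutions):
    """MinPts at 1x, halved at each two-fold dilution step."""
    return {d: start // d for d in dilutions}


# Hypothetical (FSC, SSC, FITC) events spanning three decades per channel.
events = [(10, 100, 1000), (100, 1000, 10000), (1000, 10000, 100000)]
scaled = log_standardize(events)
print(scaled[1])  # the middle event sits at the mean of every channel

print(minpts_schedule(8000, [1, 2, 4, 8, 16]))
# {1: 8000, 2: 4000, 4: 2000, 8: 1000, 16: 500}
```

Standardizing all three channels gives FSC, SSC, and FITC equal weight in the Euclidean distance, which is the equal-weighting behavior examined in the SSC+FITC-only analyses.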

Key Findings

Optimization results:

Glutaraldehyde addition (0.5% final) was essential to reveal a distinct target population and had highly significant effects (p < 0.001) on overall performance: increased total event count by 65,402 events, increased MFI by 360 units, and decreased fluorescence CV by 9 percentage points. Control experiments indicated that these increases were not due to phantom events or elevated background but reflected enhanced target detection.
Target event counts for glutaraldehyde-treated runs were ~10^9–10^10 events/mL, higher than qPCR-based titers (10^8–10^9 gc/mL) and culture-based titers (10^7–10^8 PFU/mL). Discrepancies likely stem from non-specific staining (FVM), DNA extraction losses (qPCR), and culture-based biases.
In target-only analysis (glutaraldehyde-treated runs), no statistically significant two-way interactions were detected among non-glutaraldehyde factors. Diluent had a significant main effect on target event count: using TE (vs MQ) reduced counts by 7,807 events (p = 0.023).
Strongly significant main effects on CV (p < 0.001): staining at 50 °C decreased CV by 2.7 percentage points; using TE buffer decreased CV by 4.4 points. Additional effects: higher stain concentration (1 × 10^−4) increased CV by 1.8 points (0.001 < p < 0.01); longer staining time (15 min) decreased CV by 1.2 points (0.01 < p < 0.05); SYBR Gold (vs SYBR Green I) increased CV by 1.5 points (0.01 < p < 0.05).
Recommended protocol for T4 detection by FVM: dilute in TE buffer to achieve an acquisition rate of ~10^2–10^3 events/s; add glutaraldehyde to 0.5% final; stain with SYBR Green I at the lower of the two tested final concentrations (5 × 10^−5 of the commercial stock); and incubate at 50 °C for at least 1 min.

Clustering vs manual gating (mixed-target):

For engineered beads (0.2, 0.5, 0.8 µm), OPTICS (both manual and opticskxi extraction) agreed well with manual gating across dilutions. OPTICS often revealed two subclusters within the apparent 0.2-µm bead region (similar SSC/FITC but different FSC), a feature not detected by manual gating.
Bucket-level comparisons showed that detected counts for 0.2- and 0.5-µm beads exceeded theoretical expectations, counts for 0.8-µm beads fell slightly below them, and counts for the virus bucket were much lower than expected. The bead discrepancies reflect the approximate (order-of-magnitude) accuracy of manufacturer-stated concentrations; the virus undercounting reflects φ6’s faint FITC signal and the limit of detection.
For the virus bucket, MG and O:kxi produced similar, generally higher counts than O:ME. Differences arose because viral clusters were diffuse with gradual reachability gradients; opticskxi tended to assign high-reachability points on the same curve to the T4 cluster, while manual extraction often labeled such points as noise.

Clustering vs manual gating (environmental-spike):

Manual gating identified 0.2- and 0.8-µm bead clusters and, in T4-spiked samples, a T4 cluster partly obscured by wastewater background but within the predefined gate. A recurrent low-SSC, high-FITC cluster of unknown identity was observed.
O:ME consistently detected the bead clusters and subclusters (0.2-µm region with distinct FSC) and background clusters, but often neither clearly isolated T4 nor detected the unidentified low-SSC/high-FITC cluster.
O:kxi, constrained by preset cluster number, typically produced a cluster encompassing 0.2-µm beads (with many noise points) and a combined T4/VLP/background cluster, occasionally merging into the 0.2-µm bead region and never isolating T4 as a distinct cluster. OPTICS using only SSC+FITC simplified reachability plots but did not materially improve T4 detection.
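The sensitivity of virus counts to the extraction step, noted above for diffuse clusters with gradual reachability gradients, can be illustrated with a deliberately simplified extraction rule. The sketch below cuts a reachability plot at a fixed threshold: runs of consecutive points below the threshold form a cluster, and everything else is noise. This is neither the manual valley selection nor the opticskxi algorithm used in the study; full extraction methods also use core distances, so the first point of each valley may in fact belong to the cluster. The reachability values are hypothetical.

```python
def cut_reachability(reachability, threshold):
    """Label points in OPTICS order; -1 marks noise, 0, 1, ... mark clusters."""
    labels = []
    cluster_id = -1
    in_valley = False
    for r in reachability:
        if r < threshold:
            if not in_valley:       # a new valley starts a new cluster
                cluster_id += 1
                in_valley = True
            labels.append(cluster_id)
        else:                       # a peak separates valleys
            in_valley = False
            labels.append(-1)
    return labels


# Hypothetical plot: a tight valley (bead-like) followed by a valley whose
# reachability rises gradually (diffuse, virus-like).
reachability = [0.9, 0.1, 0.1, 0.12, 0.8, 0.2, 0.3, 0.45, 0.6, 0.7]

strict = cut_reachability(reachability, 0.5)
lenient = cut_reachability(reachability, 0.75)
print(strict)   # [-1, 0, 0, 0, -1, 1, 1, 1, -1, -1]
print(lenient)  # [-1, 0, 0, 0, -1, 1, 1, 1, 1, 1]
```

The tight valley is labeled identically under both thresholds, while the diffuse valley gains or loses members depending on where the cut is placed. This mirrors the pattern reported here: good agreement for beads and method-dependent counts for viral clusters.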

Discussion

The study demonstrates that a fractional factorial design can efficiently identify critical sample-preparation parameters for FVM, yielding a blended protocol that improves detection of the T4 bacteriophage over prior pipeline-optimized methods. Glutaraldehyde is pivotal for clearly resolving the target population and improving fluorescence signal quality (lower CV). TE buffer and elevated staining temperature (50 °C) substantially tighten the fluorescence distribution, facilitating discrimination from background. The optimized protocol provides a practical standard operating procedure to validate instrument performance and establish expected signal ranges using a widely available viral surrogate. On data analysis, density-based clustering using OPTICS can match or exceed manual gating for well-defined, dense clusters (engineered beads) and can reveal structure (e.g., FSC-based subclusters) not evident via manual gating. For viral targets in clean and complex matrices, performance is mixed: MG reliably separated T4 from φ6/VLPs in mixed-target data, but OPTICS-based extractions struggled to consistently isolate T4 from diffuse background, especially in wastewater matrices. Equal weighting of FSC, SSC, and FITC in OPTICS may not be optimal for viruses where information resides primarily in FITC; tailoring dimensional weighting and parameter selection could improve clustering outcomes. The authors suggest using OPTICS to assist gate definition in clean reference samples and transferring those gates to complex samples where automated clustering falters. Adopting a consistent, optimized protocol centered on T4 as a biological standard alongside non-biological bead standards, combined with objective clustering workflows, can enhance interlaboratory comparability, speed, and reliability of FVM for microbial water-quality monitoring.

Conclusion

This work contributes: (1) a rigorously optimized FVM staining and preparation protocol for the T4 bacteriophage, derived via fractional factorial design, that improves target resolution and signal quality; and (2) an evaluation of density-based clustering (OPTICS) as a faster, objective alternative or complement to manual gating, especially effective for engineered bead standards and capable of uncovering data features missed by manual methods. The authors recommend incorporating the T4-based protocol as a routine validation step for FVM experiments and pairing biological standards with non-biological beads to facilitate cross-lab comparisons. Future research should refine automated clustering for virus-rich, complex matrices by optimizing OPTICS parameters, enhancing automated cluster extraction, and strategically weighting dimensions (e.g., emphasizing FITC). Using clustering to inform gate placement in clean samples for application to complex samples is another promising pathway.

Limitations

The optimized protocol was developed for a single virus (T4) in a clean matrix under controlled laboratory conditions; it is not a universal protocol for all waterborne viruses or environmental matrices. Glutaraldehyde fixation, while improving detectability, precludes certain validation workflows (e.g., sorting and culture verification) and may alter sample properties. The study did not detect significant interaction effects among non-glutaraldehyde factors in target-only analyses, but only a subset of potential interactions was assessed. In complex wastewater matrices, density-based clustering struggled to consistently isolate T4, sometimes merging viral signals with beads or background. OPTICS’ equal weighting of FSC, SSC, and FITC may be suboptimal for virus detection, and optimal parameter settings remain dataset-dependent. Single-particle validation of FVM identifications is inherently challenging, and discrepancies between FVM, qPCR, and culture titers reflect methodological biases. An observed low-SSC, high-FITC cluster in environmental samples had unknown identity.
