Food Science and Technology
Microbiome-based environmental monitoring of a dairy processing facility highlights the challenges associated with low microbial-load samples
A. J. McHugh, M. Yap, et al.
Dairy processing environments harbor microorganisms that can contaminate food before and during processing, including spoilage-associated and pathogenic species. Routine environmental monitoring typically relies on swabbing and agar plating, often using selective, phenotype-based assays that can yield a high rate of false positives and provide limited information about non-target species or overall community composition. DNA sequencing methods have been applied to dairy and environmental samples for taxonomic profiling and source tracking, and can reveal functional potential (including virulence and spoilage traits), but standard high-throughput metagenomics requires expensive instrumentation and specialist expertise, which limits routine use in manufacturing. Portable sequencing devices (e.g., ONT MinION) offer potential for rapid, on-site analysis by less experienced personnel. They have been evaluated clinically against Illumina sequencing and culture-based methods, but had not yet been evaluated for environmental monitoring in food processing. As a proof-of-concept, the authors first assessed MinION rapid sequencing for classifying a simple four-strain mock community of related, dairy-relevant spore-formers. They then compared ONT MinION and Illumina sequencing, alongside culture-based methods, to characterize microbiota from environmental swabs in a dairy processing facility. MinION-based approaches were broadly comparable to Illumina for species-level classification. However, the high-concentration, high-quality DNA input that MinION requires was a significant limitation in this low-biomass setting and necessitated multiple displacement amplification (MDA). Despite these constraints, the potential benefits of routine metagenomic monitoring in food processing environments were evident.
The introduction situates this work within several strands of prior research: (1) Limitations of culture-based monitoring in dairy processing environments, including reliance on selective media, potential for false positives, and lack of comprehensive community information. (2) Application of high-throughput metagenomic sequencing in dairy and environmental contexts for taxonomy, source tracking, and functional potential profiling, though adoption is hampered by cost, platform requirements, and bioinformatics expertise. (3) Emergence of portable long-read devices (ONT MinION) designed for ease-of-use, previously evaluated in clinical metagenomics to rapidly identify pathogens directly from samples and compared with Illumina sequencing outputs or culture-based analyses. Despite these advances, portable sequencing had not been systematically applied for environmental monitoring in food processing facilities. The study addresses this gap by benchmarking MinION against Illumina and culture-based methods in a real facility, and by examining biases introduced by pre-processing (MDA) and culturing.
The study design comprised two phases: (1) a mock community benchmark to evaluate ONT MinION's taxonomic classification accuracy for related spore-formers; (2) environmental monitoring of a dairy processing pilot plant with parallel ONT and Illumina sequencing and culture-based analyses, including assessment of MDA and culture biases and extraction of metagenome-assembled genomes (MAGs). Mock community: Genomic DNA from four spore-forming strains (Bacillus cereus DSM 31/ATCC 14579, Bacillus thuringiensis DSM 2046/ATCC 10792, Bacillus licheniformis DSM 13/ATCC 14580, Geobacillus stearothermophilus DSM 458) was prepared, quantified (Qubit dsDNA HS), quality-checked (agarose gel), diluted, and pooled equimolarly. ONT MinION sequencing used 16S rRNA full-length amplicons (SQK-RAB204; 10 ng input) and rapid whole metagenome sequencing (WMGS; SQK-RAD004; 400 ng input) on FLO-MIN106 R9 flowcells with appropriate MinKNOW versions. 16S reads were re-basecalled with Albacore; WMGS reads were basecalled with Albacore and adaptor-trimmed with Porechop. Bioinformatics for mock: 16S reads were aligned with BLASTn to SILVA 132 and taxonomically classified in MEGAN 6; WMGS reads were aligned with LAST against NR (Mar 2018) and classified with the MEGAN-LR LCA algorithm. De novo assembly of ONT WMGS reads used Canu (1.7; nanopore-raw), with MUMmer/dnadiff comparisons to reference genomes; coverage was visualized with the R packages ggbio and GenomicRanges. Environmental sampling: Eight facility locations were swabbed once per month in October, November, and December 2018, after clean-in-place (CIP) and before processing: table, door, wall, gaskets/flow plate seals, external dryer balance tank, internal dryer balance tank, external evaporator surface, and evaporator-adjacent drain. Five sponges per site were pooled per sample; stomaching released liquid for DNA extraction (20 ml) and culturing (1 ml). Pellets were prepared, washed, and stored at −80°C.
Negative controls included unused swabs processed identically and extraction kit blanks; a positive control was the mock community. Culture-based workflow: From stomacher liquid, spreads on BHI agar (triplicate) were incubated at 30°C for 48 h; serial dilutions plated to 10^−6. Heat-treated aliquots (80°C, 12 min) were plated for mesophilic and thermophilic spore counts at 30°C and 55°C (48 h). For metagenomic DNA from easily culturable microbes, colonies from one plate per sample (neat plating) were washed off, pelleted, and stored for DNA extraction. Distinct colony morphotypes were isolated for 16S Sanger sequencing (PCR with 27F/338R, cleanup, and sequencing). DNA extraction and MDA: DNA from environmental pellets and plate pellets was extracted with Qiagen PowerSoil Pro. For samples with insufficient DNA for ONT, multiple displacement amplification (MDA) was performed with REPLI-g Single Cell kit under manufacturer’s conditions, including matched negative controls (swab, extraction, MDA blank) and a mock positive control. Library preparation and sequencing: ONT MinION libraries (SQK-RBK004 rapid barcoding) were prepared from 36 MDA samples (3 flowcells, 12 samples each: 8 environmental, 3 negatives, 1 positive), quantified (Qubit BR/HS), cleaned with Ampure XP, and sequenced on FLO-MIN106 R9. Illumina NextSeq libraries (Nextera XT; 7 min tagmentation) were prepared from MDA (n=36), non-MDA metagenomic DNA (no pre-processing; NPP; n=33), and plate-derived DNA (n=24), size-checked (Bioanalyzer), quantified (Qubit), pooled equimolarly, and sequenced with a NextSeq High Output 300 cycles v2.5 kit. 16S Sanger sequencing used colony-PCR and standard cycling conditions, with BLASTn against the NCBI 16S rRNA database for genus identification. 
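The library counts given above can be tallied with a few lines of arithmetic. The following is only an illustrative sketch; the variable names are assumptions introduced here, and the numbers are taken directly from the text.

```python
# Sampling design as described: 8 sites swabbed in each of 3 months.
SITES = 8
MONTHS = 3
environmental_swabs = SITES * MONTHS

# Each ONT flowcell carried 12 MDA libraries: 8 environmental, 3 negatives, 1 positive.
FLOWCELLS = 3
PER_FLOWCELL = {"environmental": 8, "negative_control": 3, "positive_control": 1}
mda_libraries = FLOWCELLS * sum(PER_FLOWCELL.values())

# Illumina libraries by sample type, as reported in the text.
illumina_libraries = {
    "MDA": mda_libraries,  # amplified DNA, same set as sequenced on ONT
    "NPP": 33,             # non-MDA metagenomic DNA, no pre-processing
    "Plate": 24,           # DNA from colonies washed off agar plates
}
illumina_total = sum(illumina_libraries.values())

print(environmental_swabs, mda_libraries, illumina_total)
```

The Illumina total from this tally is 93 libraries, with 24 of the 36 MDA libraries being environmental swabs and the remainder controls.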
Bioinformatics for environmental data: ONT reads were basecalled and demultiplexed with Guppy, barcode assignments parsed with guppy_bcsplit, adaptors trimmed (Porechop), QC assessed (FastQC), converted to fasta (IDBA fq2fa), aligned with LAST to NR (Mar 2018), and classified with MEGAN-LR LCA. Illumina reads were converted (bcl2fastq), quality-filtered and trimmed (kneaddata with Trimmomatic), human and bovine reads removed (BMTagger), QC assessed (FastQC), converted to fasta, aligned with DIAMOND to NR, and classified in MEGAN 6. For comparison, Illumina data were also profiled with Kraken2/Bracken and MetaPhlAn2. Relative abundances were computed in R (ggplot2). Hybrid assemblies (Illumina MDA + ONT MDA) were generated with OPERA-MS; Illumina reads were mapped with Bowtie2, sorted (samtools), depths computed, and bins produced with MetaBAT2. MAG quality was assessed with CheckM; annotations generated with Prokka; taxonomic assignments were made with Kaiju on ORFs and with MEGAN-LR on whole bins. Diversity analyses used vegan in R (Shannon, Simpson alpha diversity; Bray–Curtis NMDS beta diversity). Statistical testing used pairwise Wilcoxon rank-sum tests with Benjamini–Hochberg correction to compare processing/sequencing groups (MDA MinION vs MDA NextSeq; MDA NextSeq vs NPP NextSeq; NPP NextSeq vs Plate NextSeq). Data deposition: ENA PRJEB39267.
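For readers unfamiliar with the diversity metrics named above, the following minimal Python sketch shows how Shannon, Simpson, and Bray–Curtis values are computed from abundance vectors. It is not the authors' code (they used vegan in R); the function names and the example counts are hypothetical, the Simpson index is written in the 1 − Σp² form, and the NMDS ordination step is not shown.

```python
import math

def shannon(counts):
    """Shannon index H' = -sum(p * ln p), using natural logarithms."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Simpson diversity in the 1 - sum(p^2) form."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

# Hypothetical counts for four taxa (illustration only, not study data).
plate_like = [90, 10, 0, 0]    # dominated by one taxon
swab_like = [25, 25, 25, 25]   # perfectly even community

print(shannon(plate_like), shannon(swab_like))
print(simpson(plate_like), simpson(swab_like))
print(bray_curtis(plate_like, swab_like))
```

In this toy example the even community scores higher on both alpha-diversity indices, which is the direction of difference the study reports between culture-independent and plate-derived samples.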
Mock community (ONT): 16S rRNA amplicon sequencing (SQK-RAB204) produced 996,441 reads totaling 1,454,835,092 bases (average read length ~1,460 bp; median 1,561 bp). BLASTn to SILVA 132 with MEGAN classified 3 of 4 species to species level; Geobacillus stearothermophilus DSM 458 was identified only to genus level. Rapid WMGS (SQK-RAD004) yielded 97,503 reads totaling 750,359,905 bases (average 7,696 bp; median 5,762 bp). MEGAN-LR LCA after LAST-to-NR classified 74.76% of bases; of the classified bases, 42.63% were assigned to species, 46.28% to species group, and 8.15% to genus. Species-level assignments among classified bases: Bacillus thuringiensis 57.26%, Bacillus licheniformis 14.74%, Bacillus cereus 13.98%, Geobacillus stearothermophilus 13.8%, with 0.21% misassigned to Bacillus paralicheniformis. De novo assembly (Canu) produced 104 contigs; 9 of 10 reference replicons (4 chromosomes + 6 plasmids) were recovered (missing B. cereus plasmid pBClin15, 15 kb). Overall, 99.59% of assembled bases aligned to references and 98.27% of reference genomes aligned to assembled sequences. Environmental sequencing output: ONT MinION on MDA-amplified environmental DNA (36 samples) produced 899,306 reads totaling 1,648,724,928 bases (average read length 1,833 bp; median 926 bp; average ~24,981 reads and ~45.8 Mb per sample). MEGAN-LR classified 62% of bases; of the classified bases, 29.11% were assigned to species and 38.36% to genus (together 67.47% of classified bases). Fifty-nine species exceeded 5% relative abundance in at least one ONT sample. Illumina NextSeq (93 samples) generated 734,909,370 reads (150 bp). DIAMOND/MEGAN classified 78% of reads to some level; of these, 10.8% were assigned at species and 39.6% at genus level (50.3% of classified reads). Alternative classifiers mischaracterized the mock: Kraken2/Bracken classified 61% of reads, 99% of them at species level, but did not correctly resolve the mock composition; MetaPhlAn2 also misclassified at least one species and could not resolve G. stearothermophilus beyond genus.
Species-level agreement and dominant taxa: Using MEGAN (which correctly classified the mock), 108 species exceeded 5% in at least one sample across ONT and Illumina datasets, with broadly consistent patterns between paired samples. Kocuria sp. WRN011 was most abundant across multiple sites and time points, particularly in evaporator drain samples. Kocuria sp. ZOR0020 was abundant on external dryer balance tank surfaces. Other dominant species included Acinetobacter johnsonii (gaskets/flow plate seals), Micrococcus luteus (evaporator drain), Enterococcus faecium (inside dryer balance tank and other samples), Klebsiella pneumoniae (many December samples), and Enterococcus casseliflavus (many October/November samples). Exiguobacterium sibiricum was high in October/November door samples and present at lower abundance elsewhere. Platform-dependent differences: Exiguobacterium sp. S3.2 and Pseudochrobactrum sp. B5 were higher in October/November Illumina MDA than in ONT; Enterobacter sp. HK169 was detected in December ONT but not Illumina. Negative controls: Several taxa were specific to negatives (e.g., Kribbia dieselivorans and Cytophagales bacterium B6 in ONT MDA negatives; Paenibacillus fonticola in ONT and Illumina MDA negatives). Elevated Escherichia coli and Salmonella enterica were seen in MDA negatives (December), and Ralstonia insidiosa exceeded 0.2% only in negatives. Overlap between negatives and low-biomass environmental samples (e.g., internal dryer balance tank) highlighted contamination/cross-over risk and the necessity of controls. MAGs: Hybrid assemblies yielded 162 bins, of which 10 were high-quality MAGs (>80% completeness, <10% contamination).
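The high-quality MAG criterion stated above (>80% completeness, <10% contamination) amounts to a simple filter over per-bin quality estimates. The sketch below is a hypothetical illustration: the `BinQuality` record, the `high_quality` function, and all bin values are assumptions made for this example and are not the study's data or the output format of any particular tool.

```python
from typing import NamedTuple

class BinQuality(NamedTuple):
    name: str
    completeness: float   # percent, e.g. as estimated by a tool such as CheckM
    contamination: float  # percent

def high_quality(bins, min_completeness=80.0, max_contamination=10.0):
    """Keep bins strictly above the completeness and strictly below the contamination cut-off."""
    return [
        b for b in bins
        if b.completeness > min_completeness and b.contamination < max_contamination
    ]

# Hypothetical bins (illustration only).
bins = [
    BinQuality("bin.001", 95.2, 1.3),   # passes both thresholds
    BinQuality("bin.002", 80.0, 0.5),   # exactly 80% complete: fails the strict ">" test
    BinQuality("bin.003", 88.0, 12.4),  # too contaminated
    BinQuality("bin.004", 45.0, 0.0),   # too incomplete
]

kept = high_quality(bins)
print([b.name for b in kept])
```

Whether the boundaries are treated as strict or inclusive is a choice made here for illustration; the text gives only the ">" and "<" thresholds.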
Environmental MAGs included genus-level Planococcus (October evaporator drain), Exiguobacterium (October gasket/flow plate seal), Kocuria (October external dryer balance tank), and species-level Enterococcus casseliflavus (October table), Paracoccus chinensis (November evaporator drain), Macrococcus caseolyticus (November gasket/flow plate seal), and Nesterenkonia massiliensis (November external dryer balance tank). Three MAGs corresponded to mock control species. MDA and culture biases: MDA affected community profiles. Compared to NPP Illumina samples, MDA Illumina showed higher Exiguobacterium sibiricum and generally higher diversity; NPP samples were less diverse. Culture-derived (Plate) metagenomes were significantly less diverse than culture-independent samples but sometimes clustered with them. Culturing enriched some low-abundance taxa (e.g., Planococcus massiliensis, Microbacterium oxydans, Acinetobacter baumannii, Lysinibacillus sp. B2A1). Diversity and statistical differences: Alpha diversity (Shannon, Simpson) was lower in Plate samples; beta diversity (Bray–Curtis NMDS) showed some clustering between cultured and non-cultured samples. Across environmental samples (excluding controls), only 6 of 108 species differed significantly by processing/sequencing: Enterococcus casseliflavus, Acinetobacter lwoffii, and Acinetobacter johnsonii were higher in MDA MinION vs MDA NextSeq; Kocuria sp. WRN011 was higher in MDA NextSeq vs MDA MinION; Pseudochrobactrum sp. B5 was higher in NPP NextSeq vs MDA NextSeq; Exiguobacterium sibiricum was higher in MDA NextSeq vs NPP NextSeq. At genus level, 24 of 46 genera differed significantly among pairwise method comparisons; Pseudochrobactrum differed across all three pairwise comparisons; Exiguobacterium and Planococcus differed between MDA MinION vs MDA NextSeq and MDA NextSeq vs NPP NextSeq; Bacillus, Staphylococcus, and Ochrobactrum differed between MDA NextSeq vs NPP NextSeq and NPP NextSeq vs Plate NextSeq.
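The significance calls above rest on Benjamini–Hochberg (BH) correction of many pairwise p-values. As a minimal sketch of that correction step only (the Wilcoxon rank-sum tests themselves are not shown), the following pure-Python function computes BH-adjusted p-values. The function name and the example p-values are hypothetical and unrelated to the study's results.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four taxa (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

Each raw p-value is scaled by the number of tests divided by its rank, then capped by the adjusted value of the next-larger p-value, which controls the false discovery rate across the taxa tested.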
The ONT MinION demonstrated strong performance for species-level classification in the mock community, with rapid WMGS outperforming ONT 16S for distinguishing closely related Bacillus/Geobacillus species. In real environmental samples, ONT MinION and Illumina NextSeq produced broadly comparable species-level taxonomic profiles when analyzed with consistent MEGAN-based pipelines, underscoring the feasibility of portable sequencing for environmental monitoring. However, low biomass in post-CIP dairy environments imposed practical constraints, notably the need for high-quality, high-quantity DNA for ONT rapid library preparations. To achieve sufficient input, multiple displacement amplification (MDA) was required, which introduced amplification biases (e.g., overestimation of Exiguobacterium sibiricum) and potentially impacted read lengths and yields (including flowcell saturation during multiplexing). Cross-over/contamination indicated by overlap between negative controls and low-biomass samples further complicates interpretation, emphasizing the critical importance of rigorous controls at swabbing, extraction, amplification, and sequencing stages. Culture-dependent approaches showed clear selection biases and reduced diversity relative to culture-independent methods but can enrich specific taxa poorly represented in metagenomes, complementing sequencing-based monitoring. The successful recovery of high-quality MAGs from combined ONT and Illumina MDA data highlights the potential of hybrid assemblies to refine taxonomic resolution and reveal genomes of understudied environmental taxa; difficulties assigning some MAGs to known species suggest the presence of related, possibly unclassified species in these environments. 
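One simple way to act on the negative-control overlap discussed above is to separate taxa detected only in negatives from taxa shared between negatives and samples. The sketch below is a hypothetical illustration, not the authors' procedure: the function name, the library and taxon names, the abundance values, and the use of a 0.2% detection threshold are all assumptions made for this example.

```python
def flag_control_taxa(samples, negatives, threshold=0.2):
    """Split taxa detected in negative controls into control-only and shared sets.

    samples, negatives: dicts mapping a library name to {taxon: percent abundance}.
    threshold: percent abundance above which a taxon counts as detected.
    """
    def detected(libraries):
        return {
            taxon
            for profile in libraries.values()
            for taxon, pct in profile.items()
            if pct > threshold
        }

    in_negatives = detected(negatives)
    in_samples = detected(samples)
    return {
        "negative_only": sorted(in_negatives - in_samples),
        "shared": sorted(in_negatives & in_samples),
    }

# Hypothetical relative abundances in percent (illustration only).
negatives = {"mda_blank": {"taxon_A": 1.5, "taxon_B": 4.0}}
samples = {
    "low_biomass_site": {"taxon_B": 2.0, "taxon_C": 30.0},
    "high_biomass_site": {"taxon_C": 55.0, "taxon_A": 0.1},
}

result = flag_control_taxa(samples, negatives)
print(result)
```

Taxa in the shared set are the difficult cases: they may be genuine residents of the facility, reagent contaminants, or cross-over between libraries, which is why the study stresses controls at every stage.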
Overall, while methodological choices (MDA, sequencing platform, culturing) significantly influence detected community structure at both species and genus levels, the convergent detection of key taxa across methods supports the utility of metagenomics for routine monitoring. Continued improvements in ONT accuracy and simplified workflows, coupled with amplification-free protocols and deeper sequencing, could further enable on-site, real-time monitoring in food processing settings.
This study establishes that portable ONT MinION sequencing can provide species-level classifications comparable to Illumina for both simple mock communities and real-world dairy processing environmental samples, particularly when consistent MEGAN-based taxonomic pipelines are used. However, the low microbial loads typical of post-CIP environments necessitated MDA to meet ONT input requirements, introducing amplification biases and complicating downstream interpretation. Culture-based methods, while valuable for enrichment of certain organisms, exhibited strong selection biases and reduced diversity relative to culture-independent approaches. The generation of high-quality MAGs from hybrid ONT–Illumina data demonstrates the feasibility of reconstructing genomes of environmental taxa, including potentially unclassified species. Future work should focus on developing amplification-free rapid workflows for low-biomass samples, enhancing contamination control and negative control design, expanding longitudinal and multi-site datasets for generalizability, and increasing sequencing depth to improve MAG recovery and strain-level resolution. These advances would accelerate the routine implementation of metagenomic monitoring across the food chain.
Key limitations include: (1) ONT rapid library preparation requires high DNA input, which the low-biomass post-CIP environmental samples could not supply; MDA was therefore required, and it introduced amplification biases (e.g., overrepresentation of Exiguobacterium sibiricum) and affected read characteristics. (2) Evidence of cross-over/contamination, with notable overlap between negative controls and low-biomass environmental samples, complicates interpretation. (3) Culture-based analyses were performed under specific conditions (BHI, 30°C/55°C), leading to selection biases and reduced diversity. (4) ONT platform constraints at the time (input requirements, evolving accuracy and back-compatibility of kits/software) limited straightforward routine deployment. (5) Study scope was limited to a single facility and three monthly time points, potentially affecting generalizability. (6) Some taxa could not be resolved to species level, and some classifier tools (Kraken2/Bracken, MetaPhlAn2) misclassified the mock community, underscoring taxonomy assignment challenges. (7) Multiplexing lower-quality environmental DNA led to flowcell saturation and lower ONT output than in the high-quality mock runs.