Veterinary Science
Genomic analysis of Shiga toxin-producing *Escherichia coli* O157:H7 from cattle and pork-production related environments
P. Zhang, S. Essendoubi, et al.
This study examines the genetic relationships among *E. coli* O157:H7 isolates from pigs, cattle, and pork-production environments in Alberta, Canada. Peipei Zhang and colleagues use whole-genome sequencing to investigate the origins of recent pork-associated outbreaks and identify pigs as a significant source of this pathogen.
Introduction
Shiga toxin-producing Escherichia coli (STEC) cause diseases ranging from mild diarrhea to hemolytic uremic syndrome (HUS). In North America, O157:H7/NM is the predominant STEC serotype associated with severe outcomes. Cattle are recognized reservoirs of STEC O157:H7, while swine are not traditionally considered reservoirs; however, Alberta, Canada, experienced three pork-associated O157:H7 outbreaks (2014, 2016, 2018). Prevalence studies generally report low detection in pigs, though a recent Alberta survey found 1.4% of pigs and 1.8% of pork carcasses positive, and outbreak investigations recovered O157:H7 from pig feces. With whole-genome sequencing (WGS) now standard for subtyping, this study aims to determine the population structure and phylogenetic relatedness of 121 STEC O157:H7 isolates from pigs, cattle, and pork-production environments in Alberta using both conventional and WGS-based methods, and to characterize virulence and antibiotic resistance gene profiles for source attribution.
Literature Review
Prior work shows O157:H7 is frequently linked to beef and cattle reservoirs, while swine carriage is debated. Most studies worldwide did not detect O157:H7 in healthy pigs, though low prevalence (≤2%) has been reported in Japan, Ireland, UK, Sweden, Norway, USA, and Canada. Recent Alberta data reported low but measurable prevalence in pigs and carcasses. Previous typing suggested geographic structuring of O157:H7 lineages and clades, with clade 2 predominant in Canada. WGS-based subtyping (SNP, cg/wgMLST, gene content) offers improved resolution over conventional methods and has been effective in outbreak investigations. The literature also notes variability in stx subtype distributions and the clinical importance of stx2a, as well as documented plasmid incompatibility groups associated with O157:H7.
Methodology
Study design and isolates: The study analyzed 121 STEC O157:H7 isolates from Alberta, Canada: pigs (n=41; fecal), pork-production environments (n=30; retailers, processing facilities, barns/lagoons; including outbreak-related isolates from 2014, 2016, 2018), and cattle (n=50; fecal from feedlots and slaughter plants, 2002–2015). Pig and environmental isolates were collected in 2014–2018 from 11 farms (1–11) and four plants (A–D). Most genomes were sequenced by the authors; additional cattle genomes and raw reads were obtained from NCBI and assembled as needed.
Conventional subtyping in silico: Manning clade assignment via SNPs in ECs2357, ECs2521, ECs3881, ECs4130; LSPA lineage by in silico PCR (Yang et al. primers); MLST by mlst tool using Achtman’s 7-gene and Pasteur’s 8-gene schemes; Clermont phylo-grouping using arpA, chuA, yjaA, and TspE4.C2.
Whole-genome SNP analysis: SNPs were called with Snippy 4.4.0 against the EDL933 reference (chromosome CP008957.1; plasmid CP008958.1). For isolates available only as assemblies, contigs were shredded (-contigs) to call SNPs. Recombinant regions were masked with Gubbins, and pairwise SNP distances were computed with snp-dists. A maximum-likelihood core SNP tree was constructed with RAxML v8 (GTRGAMMA; 100 bootstraps). Trees were visualized with iTOL and ggtree.
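The pairwise SNP distance step can be illustrated with a minimal Python sketch. This is not the snp-dists implementation: it simply counts alignment columns where both sequences carry an unambiguous base (A, C, G, or T) and the bases differ. The toy alignment and the function names are hypothetical.

```python
from itertools import combinations

VALID_BASES = set("ACGT")

def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count columns where both sequences have an unambiguous base and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in VALID_BASES and y in VALID_BASES and x != y
    )

def pairwise_snp_distances(alignment: dict) -> dict:
    """Return {(name_a, name_b): distance} for every pair, names sorted."""
    return {
        (a, b): snp_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }

# Toy core-SNP alignment (hypothetical sequences, not study data).
toy_alignment = {
    "Pig01": "ACGTACGTAC",
    "Env01": "ACGTACGTAT",
    "Cat01": "ACCTANGTTT",  # 'N' is ignored in every comparison
}

distances = pairwise_snp_distances(toy_alignment)
for pair, d in distances.items():
    print(pair, d)
```

Skipping ambiguous positions keeps missing data from inflating distances, which matters when some genomes are assemblies rather than read sets.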
cgMLST/wgMLST: chewBBACA using EnteroBase Escherichia schemes curated to include 2513 cgMLST and 24,952 wgMLST loci meeting coding criteria. Allele calling by chewBBACA; neighbor-joining trees built with GrapeTree ignoring missing alleles in pairwise comparisons.
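As a rough illustration of what "ignoring missing alleles in pairwise comparisons" means, the sketch below counts allele differences between two profiles and skips any locus that is missing in either one. It is an assumption-laden toy, not the GrapeTree or chewBBACA code; the profiles, the use of `None` for a missing call, and the function name are all hypothetical.

```python
def allele_differences(profile_a, profile_b):
    """Count loci whose allele IDs differ, skipping loci missing (None) in either profile."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci in the same order")
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )

# Toy five-locus profiles (hypothetical allele IDs; None marks a missing call).
profiles = {
    "Pig01": [1, 4, 2, 7, 3],
    "Env01": [1, 4, None, 7, 5],
    "Cat01": [2, 4, 2, None, 3],
}

pig_env = allele_differences(profiles["Pig01"], profiles["Env01"])
pig_cat = allele_differences(profiles["Pig01"], profiles["Cat01"])
env_cat = allele_differences(profiles["Env01"], profiles["Cat01"])
print(pig_env, pig_cat, env_cat)
```

Because each pair is compared only over loci called in both genomes, a poorly assembled genome does not appear artificially distant from everything else.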
Pan-genome and gene content: Genomes annotated with Prokka v1.13.7. Roary used to compute pan-genome at 90% protein identity. Presence/absence matrices converted to Jaccard distances; neighbor-joining gene-content tree built using phangorn in R.
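The conversion from a gene presence/absence matrix to Jaccard distances can be sketched as follows. The study performed this step in R; the Python version below is only an illustration, and the gene names, isolate names, and matrix values are hypothetical.

```python
def jaccard_distance(genes_a: set, genes_b: set) -> float:
    """1 - |intersection| / |union|; defined as 0.0 when both sets are empty."""
    union = genes_a | genes_b
    if not union:
        return 0.0
    return 1.0 - len(genes_a & genes_b) / len(union)

def genes_present(row: dict) -> set:
    """Turn one presence/absence row ({gene: 0 or 1}) into the set of genes present."""
    return {gene for gene, present in row.items() if present}

# Toy presence/absence matrix (hypothetical genes and isolates).
matrix = {
    "Pig01": {"geneA": 1, "geneB": 1, "geneC": 1, "geneD": 1, "geneE": 0},
    "Cat01": {"geneA": 1, "geneB": 1, "geneC": 1, "geneD": 0, "geneE": 1},
}

d = jaccard_distance(genes_present(matrix["Pig01"]), genes_present(matrix["Cat01"]))
print(round(d, 3))  # 3 shared genes out of 5 in the union
```

Genes absent from both genomes do not contribute to the distance, so the measure reflects only the gene content that at least one of the two isolates carries.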
Comparative dendrogram analysis: UPGMA dendrograms generated from distance matrices (SNP, cgMLST, wgMLST, gene content) with hclust in R; tanglegrams for visual comparison using dendextend; pairwise cophenetic correlation coefficients (CPCC) computed with permutation tests for significance.
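To make the comparison concrete, the sketch below builds UPGMA (average-linkage) clusterings from two distance matrices in plain Python, extracts each dendrogram's cophenetic distances, and correlates them with Pearson's r. The study used hclust and dendextend in R; this is a simplified stand-in that omits the permutation test, and the four-isolate matrices and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def upgma_cophenetic(names, dist):
    """Average-linkage clustering; returns {frozenset({a, b}): merge height}."""
    clusters = [(n,) for n in names]
    cophenetic = {}

    def average(c1, c2):
        # Mean of all original distances between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        height = average(c1, c2)
        for a in c1:
            for b in c2:
                cophenetic[frozenset((a, b))] = height
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return cophenetic

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def to_dist(pairs):
    return {frozenset(k): v for k, v in pairs.items()}

names = ["A", "B", "C", "D"]
# Hypothetical distances from two typing methods on the same four isolates.
snp = to_dist({("A", "B"): 2, ("A", "C"): 10, ("A", "D"): 12,
               ("B", "C"): 10, ("B", "D"): 12, ("C", "D"): 4})
cgmlst = to_dist({("A", "B"): 1, ("A", "C"): 6, ("A", "D"): 6,
                  ("B", "C"): 5, ("B", "D"): 7, ("C", "D"): 2})

keys = [frozenset(p) for p in combinations(names, 2)]
coph_snp = upgma_cophenetic(names, snp)
coph_cg = upgma_cophenetic(names, cgmlst)
cpcc = pearson([coph_snp[k] for k in keys], [coph_cg[k] for k in keys])
print(round(cpcc, 4))
```

A value near 1 means the two dendrograms place isolates at nearly proportional heights, which is the pattern reported for the SNP, cgMLST, and wgMLST trees.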
Virulence and resistance gene detection: stx subtypes via CGE VirulenceFinder (≥90% identity, ≥80% coverage). Other virulence genes screened using ecoli_vf (PHAC-NML) and CGE virulence database with ABRicate (≥80% coverage, ≥90% identity). Antibiotic resistance genes via ResFinder; plasmid replicons via PlasmidFinder. Thresholds per tool defaults unless stated.
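The coverage and identity cut-offs can be applied as a simple post-filter on tabular hits. The sketch below assumes each hit is a dictionary keyed by ABRicate-style column names (`GENE`, `%COVERAGE`, `%IDENTITY`); the hit values are invented for illustration and the helper name is hypothetical.

```python
def passes_thresholds(hit, min_coverage=80.0, min_identity=90.0):
    """Keep a hit only if it meets both the coverage and identity cut-offs."""
    return (
        float(hit["%COVERAGE"]) >= min_coverage
        and float(hit["%IDENTITY"]) >= min_identity
    )

# Hypothetical hits; the numbers are illustrative, not study results.
hits = [
    {"GENE": "eae", "%COVERAGE": "100.00", "%IDENTITY": "99.50"},
    {"GENE": "ehxA", "%COVERAGE": "75.20", "%IDENTITY": "98.10"},  # fails coverage
    {"GENE": "astA", "%COVERAGE": "95.00", "%IDENTITY": "88.40"},  # fails identity
]

kept = [h["GENE"] for h in hits if passes_thresholds(h)]
print(kept)
```

Requiring both criteria avoids reporting short, high-identity fragments or full-length but divergent homologs as gene presence.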
Data availability: Sequencing data and supplementary materials provided or referenced.
Key Findings
Conventional subtyping: 95% (115/121) of isolates were Manning clade 2, and 115/121 were LSPA lineage I (some cattle isolates showed lineage diversity, including lineage II and I/II; Cat24 lacked rbsB, which prevented LSPA subtyping). Pasteur 8-gene MLST: most isolates were ST 822; Cat02–Cat04, Cat13, and Cat46 were ST 628; Env06 and Cat12 had unique alleles (polB and icdA) not matching known types. Achtman 7-gene MLST: all ST 11. Clermont phylo-typing: all group E. Overall, conventional methods showed limited diversity, with cattle isolates slightly more diverse.
WGS distances and tree concordance: Across 121 isolates, pairwise distances were 0–770 SNPs; 0–297 cgMLST allele differences; 0–669 wgMLST allele differences; Jaccard distances 0.012–0.222. UPGMA dendrograms from SNP, cgMLST, and wgMLST were highly correlated (CPCC 0.995–0.996; P<0.01). Gene content dendrogram was moderately correlated with core-genome methods (CPCC 0.602–0.617), reflecting accessory genome variation.
Clustering and relatedness: Core SNP, cgMLST, and wgMLST trees grouped pig and pork-environment isolates into seven main clusters (groups 1–7). Gene content analysis split groups 1 and 4 into subgroups, suggesting gene gain/loss within those groups. With isolates ≤20 SNPs apart treated as closely related and those ≤100 SNPs apart as possibly related, most isolates within each group were closely related. Environmental isolates in groups 1–5 were closer to pig isolates (closest pig distances: 0–7, 7–8, 9–11, 0–46, and 1–17 SNPs) than to cattle isolates (respective closest cattle distances: ≥25, 55, 23, 45, and 45 SNPs). Environmental isolates in groups 6–7 were ≥84 SNPs from any pig or cattle isolate in this dataset. All pig and environmental isolates were within 100 SNPs of at least one cattle isolate, indicating recent common ancestry across hosts.
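The two relatedness thresholds amount to a simple decision rule, sketched here in Python. The ≤20 and ≤100 SNP cut-offs come from the text above; the label for distances beyond 100 SNPs and the function name are assumptions added for illustration.

```python
def classify_relatedness(snp_distance, close_max=20, possible_max=100):
    """Map a pairwise SNP distance onto the relatedness categories used above."""
    if snp_distance < 0:
        raise ValueError("SNP distance cannot be negative")
    if snp_distance <= close_max:
        return "closely related"
    if snp_distance <= possible_max:
        return "possibly related"
    # Assumed label: the source defines no category above 100 SNPs.
    return "beyond both thresholds"

# Example distances taken from the ranges reported in this summary.
for d in (10, 38, 93, 770):
    print(d, classify_relatedness(d))
```

Under this rule a 10-SNP distance falls in the closely related category, while distances of 38 or 93 SNPs count only as possibly related.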
Outbreak and farm associations: Clonal or near-clonal pig isolates were repeatedly recovered over two years on farms 1 and 4 and frequently on farm 9. The 2018 outbreak environmental isolates (Env02–Env05) were ≤10 SNPs from pig isolates on farm 9, supporting a direct epidemiological link. The 2014 (Env15, Env26–Env30) and 2016 (Env01) outbreak isolates were more distantly related to pig and cattle isolates (2014: ≥93 SNPs to nearest pig, ≥91 to cattle; 2016: ≥38 to pig, ≥49 to cattle).
Pan-genome: 8096 genes identified; 4052 core (50.05%) and 4044 accessory (49.95%). Among classified accessory genes, mobilome (COG category X) constituted 38.9%, indicating strong prophage/transposon contributions to genome plasticity.
Virulence gene profiles: stx subtypes 1a, 2a, and 2c detected with source-specific distributions. Pigs: 61% (25/41) stx2a only; 39% (16/41) stx1a+stx2a. Environmental: 33.3% (10/30) stx2a only; 66.7% (20/30) stx1a+stx2a. Cattle: 70% (35/50) stx1a+stx2a; 10% (5/50) stx1a only; 10% (5/50) stx2a only; 8% (4/50) stx1a+stx2c; 2% (1/50) stx2c only. stx patterns correlated with phylogenetic groups. eae was present in all isolates except Cat48. paa and toxB were absent only in Cat13; ehaA present in all. saa and lpfA absent in all genomes. Potential adhesion genes c3610, cah, and iha were absent in all group 2 isolates but present in most others. ehxA present in all but Cat45. astA present in all; estlA and subA absent in all.
Antibiotic resistance: 18.2% (22/121) predicted resistant to at least one antibiotic class (aminoglycosides, beta-lactams, phenicols, sulfonamides, tetracyclines, and/or trimethoprim). 2014-outbreak (group 6) isolates carried aminoglycoside, phenicol, and sulfonamide resistance; 2016 and 2018 outbreak isolates had no predicted resistance. Groups 2 and 7 exhibited resistance to four and three classes, respectively, while groups 1, 3, 4, and 5 had no predicted resistance. Only 1/41 pig isolates harbored resistance genes; 8/50 cattle isolates did.
Plasmids: Eleven plasmid replicon variants were detected, notably Col(BS512), Col156, IncFIA, IncFIB(AP001918), IncFII, IncFII(pHN7A8), IncFII(pCoo), IncI2(Delta), IncI1-I(Gamma), IncN, and pEC4115. All isolates carried IncFIB(AP001918) and IncFII replicons (pO157-like) with ehxA and etpC–O present. Additional plasmids were more common in cattle (26%, 13/50) than in pigs (4.9%, 2/41). Specific group-associated replicons included Col(BS512), IncI1-I(Gamma), and IncFII(pCoo). Notable resistance-bearing plasmids: IncN in Env18 carrying aph(3")-Ib, aph(6)-Id, blaTEM-1B, sul2, tet(A); IncFII(pCoo) in group 6 carrying aadA2b/aadA1, cmlA1, sul3; IncFII(pHN7A8) in some cattle isolates carrying aminoglycoside genes (aph) with/without sul2.
Discussion
The study addressed whether pigs in Alberta constitute a significant source of STEC O157:H7 contaminating pork-production environments and how pig-derived isolates relate phylogenetically to cattle isolates. Despite low overall diversity by conventional subtyping, WGS revealed seven related groups encompassing most pig and pork-environment isolates. Environmental isolates in groups 1–5 were consistently closer to pig isolates than to cattle isolates, and the 2018 outbreak was tightly linked (≤10 SNPs) to a pig farm (farm 9), supporting source attribution to pigs and indicating persistence of particular strains on farms over time. Nonetheless, all pig and environmental isolates were within ≤100 SNPs of at least one cattle isolate, implying recent common ancestry and potential interspecies transmission or shared upstream sources. Differences between core-genome trees and gene content trees highlight accessory genome dynamics (prophages, transposons, plasmids) as drivers of diversification, possibly supporting adaptation to specific niches (e.g., pig gut). Distinct stx distributions (pigs enriched for stx2a-only versus cattle enriched for stx1a+stx2a) and differing antibiotic resistance and plasmid profiles further distinguish host-associated populations and have implications for virulence and public health risk. Collectively, these findings support that pigs can be important sources of STEC O157:H7 contamination in pork and related environments, with some strains persisting on farms and contributing to outbreaks.
Conclusion
WGS-based subtyping of 121 STEC O157:H7 isolates from Alberta demonstrated seven closely related groups that predominantly comprise pig and pork-environment isolates. Environmental isolates often clustered more closely with pig than cattle isolates, and a direct genomic link was shown between the 2018 pork-associated outbreak and a specific pig farm, indicating on-farm persistence and transmission into the pork chain. While cattle and pig isolates share recent common ancestry, host-associated differences were evident in stx profiles, antibiotic resistance, and plasmid content. The study underscores the value of integrating WGS with conventional subtyping for surveillance and source attribution and highlights pigs as a significant source for STEC O157:H7 in pork-production environments. Future work should include longitudinal farm-level studies to understand persistence and transmission dynamics, enhanced integrated surveillance across animal species and processing environments, and characterization of accessory genome elements (phages, plasmids) that influence virulence and resistance.
Limitations