A highly conserved and globally prevalent cryptic plasmid is among the most numerous mobile genetic elements in the human gut
E. C. Fogarty, M. S. Schechter, et al.
This study reports that pBI143, a cryptic plasmid, is more numerous than crAssphage in the industrialized human gut microbiome. The research team, which includes Emily C Fogarty and Matthew S Schechter, also highlights the potential of pBI143 for detecting human fecal contamination and as a biomarker of colonic inflammation.
Introduction
The study investigates the ecology, evolution, host range, transmission, and functional impact of a small cryptic plasmid, pBI143, in the human gut. Cryptic plasmids typically lack obvious beneficial functions and are thought to be genetic parasites, yet their prevalence and dynamics in natural communities remain poorly understood. Building on metagenomic observations that pBI143 is widespread in human gut microbiomes and especially common in industrialized populations, the authors hypothesize that pBI143 is a highly conserved, abundant mobile genetic element shaped by strong purifying selection and ecological processes (e.g., priority effects), with implications for human-associated environmental monitoring and gut health biomarkers.
Literature Review
Background work establishes plasmids as key mediators of horizontal gene transfer, microbial evolution, and antibiotic resistance spread. Cryptic plasmids are generally small, multi-copy, and lack obvious selectable phenotypes, yet are present across diverse taxa. Prior to this work, pBI143 (a ~2.7 kb plasmid with mobA and repA) was mainly used as a component of E. coli–Bacteroides shuttle vectors, with little ecological data. Advances in shotgun metagenomics and plasmid prediction have expanded catalogs of gut plasmids, including identification of pBI143 as highly prevalent. The study contextualizes pBI143 relative to crAssphage (a benchmark abundant gut phage) and discusses the industrialization-associated distribution of Bacteroidales, potentially explaining global variation in pBI143 prevalence.
Methodology
- Datasets: Screened 2,137 individually assembled human gut metagenomes to characterize pBI143 diversity; expanded prevalence analysis to 4,516 human gut metagenomes from 23 countries. Also analyzed mother–infant cohorts (154 pairs, multiple countries), 717 gut bacterial isolate genomes (104 species, 54 genera), and non-human/environmental metagenomes (marine, pets, primates, sewage), plus qPCR assays on sewage and water samples.
- Bioinformatics: Performed assembly and read recruitment with the anvi’o v7 workflow: quality filtering (illumina-utils), assembly (IDBA_UD), mapping (Bowtie2), and summarization (samtools, anvi’o). Presence was defined as detection ≥0.5, where detection is the fraction of the reference sequence covered by at least one read. Compared pBI143 prevalence and relative abundance to crAssphage using read recruitment, accounting for genome size.
- Variant analysis: Calculated dN/dS for mobA and repA; focused SNV analyses on mobA (conserved across versions). Identified variable positions across metagenomes and compared to three manually assembled reference versions (versions 1–3). Assessed monoclonality via ‘departure from consensus’. Predicted MobA structure (AlphaFold2/ColabFold), mapped single-amino-acid variants (SAAVs) with anvi’o structure, and interpreted sites relative to oriT-binding via alignment to MobM (PDB 4LVI).
- Host range: Searched 717 isolate genomes for pBI143; phylogenies for plasmid (mobA, repA) and hosts (38 ribosomal proteins) were constructed (MUSCLE, trimAl, IQ-TREE). Demonstrated conjugative transfer by inserting tetQ into pBI143 (Phocaeicola vulgatus donor) and selecting transconjugants in Parabacteroides johnsonii and Bacteroides ovatus; confirmed by multiplex PCR.
- Vertical transmission and priority effects: Read recruitment to pBI143 in mother–infant metagenomes; constructed SNV-sharing networks (anvi’o variability profiles, Gephi). Quantified maintenance/replacement patterns over time.
- Fitness and maintenance: Constructed isogenic B. fragilis strains with and without pBI143; assessed plasmid maintenance over serial passages in vitro; competed the strains in gnotobiotic mice over 14 days (PCR genotyping of colonies).
- Copy number under stress: Quantified pBI143 copy number per cell by multiplex TaqMan qPCR (pBI143 and single-copy B. fragilis hsp gene) after graded oxygen exposure; assessed reversibility upon return to anaerobiosis.
- Clinical association (IBD): Computed an approximate copy number ratio (ACNR) of pBI143 coverage to the coverage of its unambiguously assigned host, using single-copy core gene (SCG)-based taxonomy and coverage, in 3,070 healthy and 1,350 IBD metagenomes; modeled differences in geometric means (rigr R package).
- Environmental specificity and marker benchmarking: Designed pBI143 qPCR assay; compared amplification in sewage and water to established human fecal markers (HF183 and Lachno3). Surveyed non-human gut and human skin/oral metagenomes for specificity.
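The two coverage-based quantities used in the methodology, detection and ACNR, can be illustrated with a minimal sketch. It assumes that detection means the fraction of reference positions with at least 1× coverage and that ACNR can be approximated as the ratio of mean plasmid coverage to mean host coverage. The function names and the per-base coverage input are hypothetical and are not part of anvi’o or of the authors' pipeline.

```python
from typing import Sequence

# Presence cutoff stated in the summary (detection >= 0.5).
DETECTION_THRESHOLD = 0.5


def detection(per_base_coverage: Sequence[int]) -> float:
    """Fraction of reference positions covered by at least one read."""
    if not per_base_coverage:
        raise ValueError("coverage vector is empty")
    covered = sum(1 for depth in per_base_coverage if depth > 0)
    return covered / len(per_base_coverage)


def is_present(per_base_coverage: Sequence[int],
               threshold: float = DETECTION_THRESHOLD) -> bool:
    """Call the reference present when detection meets the threshold."""
    return detection(per_base_coverage) >= threshold


def approximate_copy_number_ratio(plasmid_mean_cov: float,
                                  host_mean_cov: float) -> float:
    """Assumed simplification of ACNR: plasmid coverage / host coverage."""
    if host_mean_cov <= 0:
        raise ValueError("host coverage must be positive")
    return plasmid_mean_cov / host_mean_cov
```

For example, a four-position toy reference with depths `[0, 0, 3, 5]` has a detection of 0.5 and would be called present, while `[0, 0, 0, 5]` would not. Real analyses would derive both coverage values from mapped reads rather than hand-written lists.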
Key Findings
- Global prevalence and abundance: pBI143 detected in 3,295/4,516 (73%) human gut metagenomes overall, predominantly in industrialized populations (e.g., Japan 92% of 636 individuals; USA 86% of 154). Low prevalence in non-industrialized cohorts (Madagascar 0.8% of 112; Fiji 8.7% of 172).
- Relative abundance: On average, pBI143 and crAssphage recruited 0.05% and 0.13% of reads, respectively. Because the crAssphage genome is ~36× larger, adjusting for size makes pBI143 on average 14× more numerous than crAssphage. In an extreme infant sample, pBI143 comprised 7.5% of reads (>54,000× coverage).
- Versions and diversity: Three highly similar versions of pBI143 differ mainly at repA (as low as 75% identity), while mobA is more conserved. Geographic skew: Version 1 dominates North America/Europe; Version 2 dominates parts of Asia; Version 3 is rare (~7.4%).
- Selection and mutational landscape: Strong purifying selection (mobA dN/dS=0.11; repA dN/dS=0.04). Across gut metagenomes, 83.2% of SNVs occur at positions variable among the three reference versions; 84.8% of metagenomes are within 2 nt of a reference version. Most SNVs within individuals are fixed (near-zero departure from consensus), indicating monoclonality. Sewage samples are polyclonal (on average 35 non-consensus SNVs; many novel SNVs).
- Structural insights: Only 21 prevalent SAAVs (>5% samples) in MobA, enriched near DNA-binding residues (e.g., L56, E49, A64), suggesting coevolution of oriT–MobA specificity; other prevalent variants cluster at residues potentially interacting with host conjugation machinery.
- Host range and transfer: pBI143 found in 82 isolates across 11 species in Bacteroides, Phocaeicola, and Parabacteroides; identical pBI143 sequences occurred in multiple species from the same individual. Engineered pBI143-tetQ transferred to recipients at 5×10^-7 and 3×10^-6 transconjugants/recipient.
- Specificity: Absent from non-human-associated metagenomes (marine, pets, primates), and rarely detected in human skin/oral microbiomes; present in sewage.
- Environmental marker performance: In water/sewage, pBI143 qPCR amplified in all 41 samples where HF183 and Lachno3 were detected and in 6 additional samples missed by those markers, indicating higher sensitivity for human fecal contamination detection.
- Vertical transmission and priority effects: Mother–infant SNV patterns cluster within families, supporting vertical transfer. Over the first year, 69% of infants maintained the maternal version; 21% acquired two versions; 7% showed ‘wilt’ (loss). No complete replacement of a maternal version observed.
- Fitness impact: In metagenomes, pBI143 and host coverages positively correlate (R^2=0.5; p<0.001). In gnotobiotic mouse competitions, pBI143 conferred no consistent long-term fitness advantage or detriment; in vitro, the plasmid was maintained over passages.
- Cargo acquisition: Ten metagenomes showed larger pBI143 variants carrying cargo genes (e.g., toxin–antitoxin, galacturonosidase, pentapeptide transferase, phosphatase, histidine kinase), suggesting a dynamic plasmid system.
- Stress response and disease association: Oxygen exposure increased pBI143 copy number per cell in culture; levels returned to baseline in anaerobiosis. In metagenomes, ACNR of pBI143 was 3.72× higher in IBD vs healthy (95% CI 2.66–5.20; p<10^-13), consistent across hosts.
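The size adjustment behind the crAssphage comparison can be checked with simple arithmetic. The sketch below assumes that the number of copies of an element is proportional to its share of recruited reads divided by its genome length. It uses the ~2.7 kb plasmid length and the ~36× size difference quoted in this summary; the function and constant names are hypothetical.

```python
def relative_copy_ratio(frac_a: float, len_a: float,
                        frac_b: float, len_b: float) -> float:
    """Ratio of per-base read density of element A to element B."""
    if min(frac_a, len_a, frac_b, len_b) <= 0:
        raise ValueError("all inputs must be positive")
    return (frac_a / len_a) / (frac_b / len_b)


# Values taken from the summary; lengths in kb, read shares in percent.
PBI143_LENGTH_KB = 2.7
CRASSPHAGE_LENGTH_KB = PBI143_LENGTH_KB * 36  # "~36x larger" (assumed exact)
PBI143_READ_PERCENT = 0.05
CRASSPHAGE_READ_PERCENT = 0.13

ratio = relative_copy_ratio(PBI143_READ_PERCENT, PBI143_LENGTH_KB,
                            CRASSPHAGE_READ_PERCENT, CRASSPHAGE_LENGTH_KB)
print(f"pBI143 is about {ratio:.1f}x more numerous than crAssphage")
```

Under these assumptions the ratio comes out at roughly 13.8, consistent with the ~14× figure reported above.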
Discussion
The findings establish pBI143 as a highly abundant, human gut-specific cryptic plasmid with global prevalence patterns tied to industrialization-linked Bacteroidales distributions. Its sequence is under strong purifying selection with a restricted mutational landscape, yet a limited set of prevalent variants localizes to functional residues in MobA, indicating fine-tuned oriT binding and potential interactions with host machinery. Within individuals, pBI143 is typically monoclonal, which is best explained by priority effects reinforced by frequent vertical transmission from mothers to infants. The plasmid has a broad Bacteroidales host range and can transfer across species, but its native two-gene form does not confer a consistent fitness benefit in vivo; instead, it behaves as an efficient, largely parasitic element that can occasionally acquire potentially beneficial cargo, blurring the line between parasitism and mutualism. Practically, pBI143 was more sensitive than the established markers HF183 and Lachno3 for detecting human fecal contamination in the tested sewage and water samples, and its copy number increases under stress, suggesting utility as a biomarker for gut stress states such as IBD. Collectively, the results broaden the concept of the human core microbiome to include prevalent mobile genetic elements.
Conclusion
This work reveals pBI143 as one of the most numerous, highly conserved, and human-specific mobile genetic elements in the gut. It is shaped by strong purifying selection, shows constrained and functionally localized diversity, is generally monoclonal per individual due to priority effects and vertical transmission, spans multiple Bacteroidales hosts, and can occasionally carry adaptive cargo. Key applications include: (1) a sensitive, specific marker for human fecal contamination; (2) a potential, low-cost biomarker for gut stress/inflammation; and (3) a natural shuttle backbone for microbial therapeutics in the gut. Future research should dissect the molecular mechanisms underlying priority effects and monoclonality, determine how and when cargo is acquired and maintained, refine in vivo copy-number estimation, examine causal links with inflammatory states, and survey other cryptic mobile elements that may constitute a core component of the human microbiome.
Limitations
- Prevalence and abundance rely on read recruitment thresholds that, while conservative, may miss low-level occurrences or misassign closely related elements; monoclonality estimates depend on coverage and variant-calling parameters.
- ACNR is an indirect, coverage-based proxy for plasmid copy number in metagenomes and depends on accurate host assignment using SCG taxonomy; analyses were restricted to metagenomes with unambiguous host signals and sufficient coverage.
- Experimental fitness assessments were limited in duration and host strain diversity; subtle or context-specific fitness effects could be missed.
- Environmental qPCR benchmarking used archived samples and may be affected by storage/degradation, despite controls.
- The study is a preprint and some cohorts/country-level comparisons may be confounded by differences in sampling, sequencing depth, and host population structure.