Biology
Binary vector copy number engineering improves *Agrobacterium*-mediated transformation
M. J. Szarzanowicz, L. M. Waldburger, et al.
Agrobacterium-mediated transformation (AMT) is foundational for plant and fungal biotechnology, enabling transfer DNA (T-DNA) delivery into diverse eukaryotic cells via engineered binary vectors. Despite extensive optimization of virulence gene expression, induction conditions, and strain engineering, transformation efficiency remains a bottleneck for many species and genotypes. Binary vector backbones contain an origin of replication (ORI) that determines host range and copy number; prior inter-ORI comparisons suggested copy number influences AMT efficiency, but intrinsic ORI differences confounded interpretation and no systematic effort had altered copy number within the same ORI across broad-host-range plasmids. This study addresses the gap by creating and screening large libraries of copy-number variants for four commonly used ORIs (pVS1, RK2, pSa and BBR1) to dissect how ORI-specific copy number and bacterial physiology impact transient and stable AMT outcomes in plants and fungi.
Previous work highlighted that ORI identity and copy number impact AMT efficiency, with studies in maize and Arabidopsis reporting varied relationships depending on Agrobacterium strain, plant background and ORI (Zhi et al., Oltmanns et al.). Attempts to directly manipulate binary vector copy number have been limited; a pRi-derived higher-copy mutant did not improve stable transformation (Vaghchhipawala et al.). In bacterial synthetic biology, plasmid copy number strongly affects pathway flux and circuit behavior, leading to engineering of copy number in narrow-host-range origins like pMB1 and pSC101 and a few mutants in broad-host-range ORIs (e.g., RK2, BBR1, pVS1). Mechanistically, replication initiation proteins (RepA-like) bind oriV to regulate replication and copy number, with models such as RepA dimerization (“handcuffing”) negatively controlling replication. However, a general, high-throughput method to diversify copy number across multiple broad-host-range ORIs in Agrobacterium and to evaluate impacts on AMT had not been available.
- Library generation: The RepA (or RepA-like) open reading frame from four broad-host-range ORIs (pVS1, pSa, RK2, BBR1) was diversified by error-prone PCR, yielding ~100,000 independent mutants per ORI. Mutant ORFs were cloned into a selection vector containing a spectinomycin marker and a gentamicin resistance cassette driven by a salicylic-acid-inducible promoter.
- Growth-coupled selection: In Agrobacterium tumefaciens C58C1, a checkerboard assay varied gentamicin and salicylic acid concentrations to find conditions that were lethal to the wild type (WT) but still permitted growth of the mutant library. Because survival was coupled to increased antibiotic tolerance, these conditions enriched higher-copy-number variants. Selected populations and unselected controls were propagated in triplicate.
- Population sequencing: RepA loci (including flanks) from selected and unselected pools were tagmented and sequenced on Illumina MiSeq (2×300 bp). Read QC, trimming, alignment and mutation calling identified enriched nucleotide positions/residues conferring survival. Enriched residues were mapped to AlphaFold RepA models, revealing clustering at predicted dimerization interfaces.
- Constructing test vectors: Approximately 20 top-enriched amino acid substitutions per ORI were installed as single-nucleotide polymorphisms into otherwise identical binary vectors carrying GFP under a constitutive promoter (PCM2 for RK2 and pSa; the weaker PCL2 for pVS1 and BBR1, to avoid detector saturation). Whole-plasmid sequencing verified the constructs.
- Transient plant assays: Agrobacterium EHA105 (and validation in GV3101) carrying each mutant or WT binary vector was infiltrated into Nicotiana benthamiana leaves; GFP fluorescence from leaf discs (n=48 per construct) quantified transient T-DNA delivery. Select constructs were also tested in Lactuca sativa with P19 co-infiltration.
- Copy number and growth measurements: Digital PCR quantified plasmid copy number as the KanR:rpoB ratio (plasmid marker to chromosomal reference gene). Growth rates were measured in microplates in LB and in MOPS minimal medium with glucose, and relationships among copy number, growth rate and GFP output were analyzed by regression.
- Stable transformation: For Arabidopsis thaliana (GV3101), floral dip was performed with binary vectors carrying 35S::KanR (and a separate validation with hygromycin resistance + Ruby reporter). Transformation efficiency was calculated as recovered plants per seeds plated (~26,500–25,000 seeds). For Rhodosporidium toruloides (EHA105-mediated AMT), nourseothricin selection plates were used; colony counts per plate quantified transformation efficiency.
- Statistics: Transient assays used Tukey’s HSD to assess differences versus WT; regressions reported adjusted R^2 and F-test P values; Arabidopsis counts analyzed by Fisher’s exact test; yeast colony counts by Welch’s t-test.
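The copy number and stable transformation calculations described in the list above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' analysis code: the function names, the assumption that both dPCR targets are read from the same number of partitions, and all counts in the usage lines are hypothetical. The Poisson correction, -ln(1 - fraction of positive partitions), is the standard way to convert digital PCR partition counts to mean copies per partition, and the two-sided Fisher's exact test sums the probabilities of all tables no more likely than the observed one.

```python
import math


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean target copies per dPCR partition."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total partitions")
    return -math.log(1.0 - positive / total)


def plasmid_copy_number(kan_pos: int, rpob_pos: int, total: int) -> float:
    """Plasmid copies per chromosome as the KanR:rpoB ratio.

    Assumes (hypothetically) that both targets were measured in the
    same `total` partitions, e.g. in a duplex reaction.
    """
    reference = copies_per_partition(rpob_pos, total)
    if reference == 0.0:
        raise ValueError("no rpoB-positive partitions; ratio undefined")
    return copies_per_partition(kan_pos, total) / reference


def transformation_efficiency(recovered: int, seeds_plated: int) -> float:
    """Recovered transformants per seed plated."""
    return recovered / seeds_plated


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are the two vectors; columns are transformed / not transformed.
    Hypergeometric weights are kept as exact integers until the final division.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x: int) -> int:
        return math.comb(row1, x) * math.comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return extreme / math.comb(row1 + row2, col1)


# Hypothetical counts, for illustration only (not data from the study).
print(plasmid_copy_number(kan_pos=5000, rpob_pos=1000, total=20000))
wt_hits, wt_seeds = 100, 25000
mut_hits, mut_seeds = 160, 25000
print(transformation_efficiency(mut_hits, mut_seeds))
print(fisher_exact_two_sided(mut_hits, mut_seeds - mut_hits, wt_hits, wt_seeds - wt_hits))
```

Keeping the hypergeometric weights as integers avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.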
- Directed evolution and selection enriched mutations across all four ORIs; enriched residues mapped predominantly to predicted RepA dimerization interfaces, supporting a handcuffing-based negative regulation model.
- Transient Nicotiana benthamiana expression (n=48 discs/construct) identified mutants that increased GFP output for all ORIs:
  - pSa: largest gains; E90K showed 6.9-fold increase over WT; 13/19 mutants significantly outperformed WT; dynamic range spanned 28-fold.
  - RK2: S20F showed 5.4-fold increase; six mutants exhibited very low GFP with slow growth, forming high/low groups.
  - pVS1: R106H showed 2.1-fold increase (screened with weaker promoter to avoid saturation); WT already high.
  - BBR1: most mutants reduced GFP; top mutant E182V had 1.7-fold increase.
- Copy number and growth relationships were ORI-specific (dPCR and growth assays):
  - pSa: WT had 4.5 copies/cell and mutants reached up to 49; copy number correlated positively with GFP without a plateau; no significant relationship with growth rate.
  - RK2 (mini-RK2 oriV + trfA): WT had ~1.2 copies/cell (verified across conditions) and mutants reached up to 18; GFP was optimal at an intermediate 5–15 copies, and excessive copy number reduced GFP. Growth rate rose with copy number up to ~7 copies and then declined; GFP correlated with growth rate.
  - pVS1: WT had 9.5 copies/cell and mutants reached up to 66; GFP was optimal at 30–40 copies; copy number correlated negatively with growth rate; GFP did not correlate with growth rate (slow-growing high-copy mutants could outperform WT).
  - BBR1: WT had ~52 copies/cell; no relationship between copy number and GFP; faster bacterial growth was associated with higher GFP; copy number did not correlate with growth rate. The top mutant, E182V, had only ~8 copies yet outperformed WT.
- Enrichment scores from the selection were generally poor predictors of downstream GFP, growth or copy number, except weak significant correlations in pSa.
- Stable transformation improvements:
  - Arabidopsis thaliana: pVS1 R106H increased efficiency by 60% over WT (Fisher’s exact test P=2.97×10^-12). An RK2 mutant improved efficiency by 2,800% and a pSa mutant by 280% relative to their respective WT vectors (both significant; RK2 P<2.2×10^-16), although pVS1 retained the highest absolute efficiency. Additional pVS1 mutants improved efficiency by 35% and 24%. Validation with a hygromycin resistance + Ruby reporter vector showed a 103% improvement for R106H over WT.
  - Rhodosporidium toruloides: pVS1 R106H and RK2 S20F improved transformation by 390% and 510%, respectively (Welch’s t-test P=6.49×10^-7 for pVS1; P=0.0034 for RK2).
- Overall, increasing binary vector copy number improved AMT efficiency for three of the four ORIs (pSa, RK2 and pVS1), but the optimal copy number and the role of bacterial growth differed by ORI. Single RepA SNPs provided up to 28-fold tuning of transient expression.
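The findings above report transient gains as fold increases and stable transformation gains as percent improvements, which makes them easy to misread side by side. The short sketch below converts between the two, assuming that "percent improvement" means (mutant - WT) / WT × 100; that reading is an assumption, as are the helper names. Under it, a 2,800% improvement corresponds to a 29-fold efficiency relative to WT, and the pSa copy number range of 4.5 to 49 is roughly an 11-fold span.

```python
def percent_to_fold(percent_improvement: float) -> float:
    """Convert a percent improvement over WT into a fold value relative to WT."""
    return 1.0 + percent_improvement / 100.0


def fold_to_percent(fold: float) -> float:
    """Convert a fold value relative to WT into a percent improvement over WT."""
    return (fold - 1.0) * 100.0


def fold_change(mutant_value: float, wt_value: float) -> float:
    """Ratio of a mutant measurement to the WT measurement."""
    return mutant_value / wt_value


# Percent improvements reported in the summary for stable transformation.
reported_percent = {
    "Arabidopsis, pVS1 R106H": 60,
    "Arabidopsis, RK2 mutant": 2800,
    "Arabidopsis, pSa mutant": 280,
    "R. toruloides, pVS1 R106H": 390,
    "R. toruloides, RK2 S20F": 510,
}

for label, percent in reported_percent.items():
    print(f"{label}: {percent}% improvement = {percent_to_fold(percent):.1f}-fold of WT")

# Copy number span for pSa reported in the summary (WT 4.5, up to 49 copies/cell).
print(f"pSa copy number span: {fold_change(49, 4.5):.1f}-fold")
```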
The study directly links engineered binary vector copy number variants to AMT performance and reveals that the relationship is origin-specific and influenced by bacterial physiology. By enriching RepA mutations—often at dimerization interfaces—the work supports mechanistic models where reduced RepA self-association elevates replication and copy number. Transient screening in N. benthamiana effectively predicted improvements in stable transformation across kingdoms, including Arabidopsis and the yeast Rhodosporidium, demonstrating broad utility. However, excessive copy numbers can impose metabolic burden and reduce performance (observed for RK2 and pVS1), while for BBR1 copy number was not predictive and growth dominated outcomes. These findings emphasize that there is no universal rule for copy number optimization across ORIs; empirical testing is required to identify optimal variants for a given origin, strain and host context. The pipeline provides a generalizable framework for RepA-based ORIs, expanding a toolkit for precision control of gene delivery and expression in prokaryotes and for improving AMT efficiencies.
This work establishes a high-throughput, growth-coupled directed evolution and screening pipeline to generate and functionally evaluate copy number variants of broad-host-range binary vector ORIs. Single RepA SNPs can dramatically tune transient T-DNA delivery (up to 28-fold) and significantly enhance stable transformation in plants and fungi. Mechanistically, enriched mutations cluster at RepA dimerization interfaces, aligning with handcuffing models of copy number control. Practically, the resulting variant libraries enable fine control of T-DNA dosage to balance transformation efficiency and insertion quality, and may benefit gene editing by increasing repair template delivery for HDR. Future research should (i) stack mutations and explore alternative substitutions at enriched sites to expand the dynamic range, (ii) systematically map how copy number influences stable transformation efficiency and event quality (e.g., insertion copy number), and (iii) extend the pipeline to additional ORIs and host backgrounds to generalize optimization rules.
- ORI-specific outcomes limit generalizability; optimal copy number differs by origin and host context, and no single principle predicts AMT performance across ORIs.
- BBR1 deviated from the copy number–performance hypothesis, indicating other factors (e.g., growth, host interactions) can dominate.
- Selection enrichment scores only weakly predicted downstream traits (notably only in pSa), highlighting the need for empirical validation.
- Very high copy numbers can impose metabolic burden, reducing growth and AMT performance (observed in RK2 and pVS1).
- Copy number determinations for RK2 pertain to a mini-RK2 variant and differed from some literature values; results may depend on vector architecture and assay conditions.
- Work was conducted in selected Agrobacterium strains (EHA105, GV3101) and host systems; broader validation across species and genotypes is needed.