Selection of a promiscuous minimalist cAMP phosphodiesterase from a library of de novo designed proteins
J. D. Schnettler, M. S. Wang, et al.
This study by J. David Schnettler and colleagues examines how new enzyme functions can emerge from unevolved sequences. Using ultrahigh-throughput droplet microfluidics, the authors screened a library of more than one million de novo designed proteins and found that a major sequence change, truncation of the parent scaffold, gave rise to a newly characterized manganese-dependent metalloenzyme.
Introduction
Modern enzymes are large, highly ordered proteins that provide extraordinary rate enhancements, but how minimally active catalytic functions first emerged from unevolved sequences remains unclear. Random sequence libraries rarely yield stably folded, soluble proteins or small-molecule binders, and catalysis is even rarer. The authors hypothesize that functional enzymes can emerge from short, dynamic peptides as envisioned by Dayhoff, and that starting from a stable de novo scaffold combined with ultrahigh-throughput screening and metal cofactors could bridge the gap between inactive sequences and catalysis. The study aims to test whether a library of de novo designed 4-helix bundles can yield biologically relevant phosphoesterases, identify the sequence/structural features enabling activity, and characterize the resulting catalysts’ metal dependence, specificity, kinetics, and structural properties.
Literature Review
Prior work shows: (1) random-sequence proteins infrequently fold or bind ligands; ATP binders occur ~1 in 10^11 (Keefe & Szostak). (2) Catalytic promiscuity and recruitment facilitate evolution of new activities after gene duplication (Jensen; O'Brien & Herschlag). (3) Functional metagenomics can reveal catalysts for non-natural reactions (~1 in 10^10). (4) Selections from non-catalytic scaffolds are rare and usually require very large libraries (~10^12 via mRNA display). (5) Computational design can succeed but rarely matches natural enzyme efficiencies. Dayhoff hypothesized early proteins arose from short peptides assembled into larger folds; symmetry and duplication may have promoted functional emergence. Previous de novo 4-helix bundle catalysts (e.g., ATPase; enterobactin esterase) and designed metalloproteins exist, but often rely on predesigned metal-binding sites and achieve lower catalytic proficiencies than natural enzymes. The present work extends these by isolating a minimalist, metal-dependent phosphodiesterase from a de novo scaffold without a predesigned metal site.
Methodology
- Scaffold and library design: Started from S-824, a 102-residue, stably folded de novo 4-helix bundle (PDB 1P68) designed by binary patterning. Randomized apical loops and helix termini (degenerate codons NDT/VRC/RRC) to create an active-site cavity; library size ~1.7 × 10^6 variants.
- Microfluidic droplet screening (FADS): Expressed library in E. coli; encapsulated single cells with a fluorogenic substrate mixture of phosphodiesters and phosphotriesters plus divalent metal mix (MnCl2, ZnCl2, CaCl2) in 3 pL droplets. Cells lysed in-droplet. Incubated and sorted droplets dielectrophoretically by fluorescence at ~0.8 kHz. Round 1: 10.3 million droplets (~4.4 million clones) screened; selected top 0.2–0.5%. Round 2: 4.4 million droplets; selected top ~0.2% under stricter conditions. Hits recovered by de-emulsification, plasmid extraction, retransformation.
- Secondary screening: ~250 clones picked and assayed in microtiter plates for lysate activity with the substrate/metal mix.
- NGS analysis: Sequenced input and post-sort libraries (~1.4 million unique variants). Quantified frameshifts, premature stops, positional truncations (enrichment near positions 40 and 60). Custom Python analysis (DiMSum pipeline).
- Hit selection and controls: Chose one high-activity clone (mini-cAMPase) for detailed study. Controlled for endogenous E. coli contaminants by comparing metal/cofactor requirements, co-purification of activity, and activity changes upon sequence mutations; purified with and without His6 tags and via multiple methods.
- Enzyme purification: Expressed in BL21; His-tag affinity purification, SEC; maintained reducing conditions (DTT/TCEP); prepared apo-protein via EDTA.
- Activity assays and kinetics: For nitrophenyl substrates, monitored pNP release at 405 nm; for cyclic nucleotides and dA-P-dA, quantified products (AMP/GMP, adenine-containing products) by RP-HPLC (UV 254/260 nm) using external standards. Michaelis–Menten parameters determined at 25 °C in buffered conditions with Mn2+. Competition assays (bis-pNPP with varying cAMP) to test common active site and inhibition (Lineweaver–Burk, Ki determination). pH-rate profile measured to estimate apparent pKa.
- Metal dependence: Compared activity with Mn2+, Zn2+, Ca2+ by time-resolved HPLC of AMP formation; ITC attempted for Mn2+ binding (no measurable binding signal).
- Structural characterization: SEC for oligomeric state (dimeric under reducing conditions). LC/MS to confirm reduced Cys; oxidation with H2O2 to promote disulfide dimer formation and test activity. CD spectroscopy for secondary structure and thermal stability (melting curves with/without Mn2+); 1H NMR to assess structural order/dynamics.
- Mutagenesis: Alanine scanning (His and Cys candidates) to probe roles in catalysis/metal interaction; kinetic effects quantified for cAMP and bis-pNPP. Constructed variants to separate effects of substitutions vs truncation (Substituted-824 vs mini-cAMPase; Short-824 expression check).
- Computation: MD simulations (Amber18, 10 × 100 ns) for S-824 and mini-cAMPase dimer to assess dynamics (RMSF/RMSD). Structure prediction via AlphaFold2, ESMfold, and MultiSFold to explore possible topoisomers and conformational heterogeneity.
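The degenerate codons named in the library design (NDT, VRC, RRC) can be expanded to see which residues each randomized position could sample. The following is a minimal sketch, assuming the standard genetic code and standard IUPAC nucleotide codes; the helper names `expand_codon` and `encoded_residues` are illustrative and are not taken from the study's own analysis code.

```python
from itertools import product

# IUPAC nucleotide codes needed for the three degenerate codons.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "D": "AGT",   # not C
    "V": "ACG",   # not T
    "R": "AG",    # purine
}

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand_codon(degenerate):
    """Return every concrete codon matched by a degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate))]


def encoded_residues(degenerate):
    """Return the sorted one-letter residues ('*' = stop) a codon can encode."""
    return sorted({CODON_TABLE[c] for c in expand_codon(degenerate)})


for codon in ("NDT", "VRC", "RRC"):
    residues = "".join(encoded_residues(codon))
    print(codon, len(expand_codon(codon)), "codons ->", residues)
```

Under these assumptions none of the three codons can encode a stop, so premature stop codons in the sequenced libraries would have to come from elsewhere, for example the frameshifts quantified in the NGS analysis.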
Key Findings
- High-throughput discovery: Screened 10.3 million droplets in round 1 (~4.4 million clones) and 4.4 million in round 2; progressive enrichment of active phosphoesterases.
- Truncation enrichment: NGS showed premature stop codons increased from 17% (input) to 27% (after sort 2), a 1.6-fold enrichment. Truncations enriched at positions ~40 and 60 (1.2- and 2.8-fold, respectively); 18% of sequences truncated at one of these positions after sort 2.
- Minimal enzyme identified: Selected a 59-residue helix–turn–helix protein (mini-cAMPase) arising from frameshift/truncation, forming a noncovalent dimer consistent with a 2×2-helix bundle; increased structural dynamics relative to S-824.
- Metal dependence: Activity strictly requires Mn2+; Zn2+ and Ca2+ did not support cAMP hydrolysis under assay conditions. Mn2+ did not notably alter thermal stability (CD), consistent with a catalytic role. Apparent pKa of activity ~7.8.
- Substrate scope and specificity: Catalyzes hydrolysis of phosphodiesters and phosphonates with a single negative charge and trigonal-bipyramidal transition states. Promiscuous activity was observed with cAMP (preferred) and, to a lesser extent, cGMP, bis-pNPP, p-nitrophenyl methylphosphonate, p-nitrophenyl ethylphosphate, and dA–P–dA (a DNase model substrate).
- Kinetics (Table 1, per dimer, with Mn2+):
• cAMP: kcat/KM ≈ 2.2 M^-1 s^-1; rate acceleration kcat/kuncat ≈ 7 × 10^9; catalytic proficiency (kcat/KM)/kuncat ≈ 7 × 10^13 M^-1.
• cGMP: kcat/KM ≈ 0.087 M^-1 s^-1; proficiency ≈ 3 × 10^13 M^-1.
• bis-pNPP: kcat/KM ≈ 0.31 M^-1 s^-1; proficiency ≈ 5 × 10^14 M^-1.
• pNP-methylphosphonate: kcat/KM ≈ 0.68 M^-1 s^-1; proficiency ≈ 4 × 10^10 M^-1.
• pNP-ethylphosphate: kcat/KM ≈ 0.039 M^-1 s^-1; proficiency ≈ 6 × 10^10 M^-1.
• dA–P–dA: kcat/KM ≈ 0.045 M^-1 s^-1; proficiency ≈ 6 × 10^13 M^-1.
- Competition and common active site: cAMP competitively inhibits bis-pNPP hydrolysis with Ki = 70 ± 8 µM, indicating a shared active site/metal center for different substrates.
- Structure–function insights:
• Dimerization: SEC indicates a dimer under reducing conditions; disulfide-linked dimer forms under oxidizing conditions but is inactive. Reduced vs oxidized states display different stability (oxidized Tm > 90 °C; reduced Tm ≈ 72 °C).
• Dynamics: CD and 1H NMR show mini-cAMPase is helical but less ordered than S-824; dynamic ensemble likely contributes to function.
• Mutagenesis: Alanine substitutions of candidate metal-interacting residues reduce activity; C57A lowers kcat ~6.5-fold for cAMP and ~3.7-fold for bis-pNPP (involved but not essential). A variant retaining substitutions without truncation (Substituted-824) is ~6-fold less active than mini-cAMPase, highlighting the contribution of truncation/dimerization.
- Evolutionary significance: A de novo, unevolved sequence gained substantial catalytic proficiency and rate acceleration for a difficult biological reaction (cAMP hydrolysis), approaching values of some natural enzymes and surpassing many designed catalysts, discovered from a relatively modest library via ultrahigh-throughput screening.
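The three cAMP figures listed above (kcat/KM, rate acceleration, and catalytic proficiency) are linked by simple ratios, so the individual constants they imply can be back-calculated. The sketch below does this arithmetic using the rounded values from this summary; the function names are illustrative, and the derived kuncat, kcat, and KM are estimates implied by those rounded numbers, not values quoted from the paper. It also shows the standard competitive-inhibition relation, apparent KM = KM (1 + [I]/Ki), using the reported Ki for cAMP.

```python
# Reported values for cAMP (per dimer, with Mn2+), rounded as in this summary.
KCAT_OVER_KM = 2.2        # M^-1 s^-1
RATE_ACCELERATION = 7e9   # kcat / kuncat (dimensionless)
PROFICIENCY = 7e13        # (kcat/KM) / kuncat, in M^-1
KI_CAMP = 70e-6           # M; cAMP as competitive inhibitor of bis-pNPP hydrolysis


def implied_constants(kcat_over_km, rate_acceleration, proficiency):
    """Back-calculate kuncat (s^-1), kcat (s^-1) and KM (M) from the ratios."""
    kuncat = kcat_over_km / proficiency   # from proficiency = (kcat/KM)/kuncat
    kcat = rate_acceleration * kuncat     # from acceleration = kcat/kuncat
    km = kcat / kcat_over_km              # from the definition of kcat/KM
    return kuncat, kcat, km


def apparent_km_factor(inhibitor_conc, ki):
    """Factor by which a competitive inhibitor raises the apparent KM."""
    return 1.0 + inhibitor_conc / ki


kuncat, kcat, km = implied_constants(KCAT_OVER_KM, RATE_ACCELERATION, PROFICIENCY)
print(f"kuncat ~ {kuncat:.1e} s^-1, kcat ~ {kcat:.1e} s^-1, KM ~ {km * 1e6:.0f} uM")
print("apparent KM factor at [cAMP] = Ki:", apparent_km_factor(KI_CAMP, KI_CAMP))
```

With these inputs the implied KM for cAMP comes out near 100 µM, the same order of magnitude as the reported Ki of 70 µM, which is what one would expect if both substrates compete for one site.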
Discussion
The study directly addresses how catalytic function can emerge from unevolved sequences by demonstrating that a de novo 4-helix bundle scaffold, when diversified and screened under ultrahigh-throughput conditions with metal cofactors and multiple substrates, can yield an efficient minimalist phosphodiesterase. Enrichment of truncations indicates that functionality may arise through major structural departures from the parental fold—here, loss of two helices enabling a dynamic helix–turn–helix dimer with a shared active site. The requirement for Mn2+ and competitive inhibition between substrates support a metallo-catalytic mechanism akin to natural phosphodiesterases, albeit with a distinct metal preference. The catalyst’s promiscuity and dynamic structure align with the concept that early enzymes were small, flexible generalists, consistent with Dayhoff’s hypothesis that short peptide assemblies can evolve function. The throughput enabled sampling sufficient sequence diversity to discover rare solutions (including truncations) that escape the stable but functionally inert folding basin of the parent. Overall, these findings demonstrate a viable path from inactive de novo sequences to catalysts capable of processing biologically relevant, thermodynamically challenging substrates, highlighting the role of structural dynamism and oligomerization in early functional emergence.
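The claim that throughput was sufficient to sample the library can be checked with back-of-the-envelope arithmetic from the numbers given in the Methodology (library of ~1.7 × 10^6 variants; 10.3 million droplets and ~4.4 million clones in round 1; sorting at ~0.8 kHz). The sketch below is only an estimate: it assumes every variant is equally represented and sampled independently (a Poisson approximation) and that sorting ran continuously at the nominal rate, neither of which is stated in the source.

```python
import math

# Figures taken from the Methodology section of this summary.
LIBRARY_SIZE = 1.7e6      # designed variants
CLONES_ROUND1 = 4.4e6     # clones screened in round 1
DROPLETS_ROUND1 = 10.3e6  # droplets screened in round 1
SORT_RATE_HZ = 800        # ~0.8 kHz sorting rate


def coverage_probability(n_sampled, library_size):
    """Chance a given variant is seen at least once (uniform Poisson sampling)."""
    return 1.0 - math.exp(-n_sampled / library_size)


def sorting_hours(n_droplets, rate_hz):
    """Time to sort n_droplets at a constant rate, in hours."""
    return n_droplets / rate_hz / 3600.0


oversampling = CLONES_ROUND1 / LIBRARY_SIZE
occupancy = CLONES_ROUND1 / DROPLETS_ROUND1
p_seen = coverage_probability(CLONES_ROUND1, LIBRARY_SIZE)
hours = sorting_hours(DROPLETS_ROUND1, SORT_RATE_HZ)

print(f"oversampling: {oversampling:.1f}x")
print(f"clones per droplet: {occupancy:.2f}")
print(f"P(variant sampled at least once): {p_seen:.2f}")
print(f"round 1 sorting time at nominal rate: {hours:.1f} h")
```

On these assumptions round 1 oversamples the library roughly 2.6-fold, so most but not all variants would be expected to appear at least once. Uneven representation in a real library would lower that coverage.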
Conclusion
This work identifies a 59-residue, manganese-dependent, dimeric de novo phosphodiesterase (mini-cAMPase) from a ~1.7-million-member library using microfluidic droplet FADS. Truncation-driven emergence of activity, Mn2+ dependence, promiscuous hydrolysis of phosphodiesters (notably cAMP), and substantial rate accelerations/catalytic proficiencies comparable to some natural enzymes underscore that catalysis can arise in small, dynamic proteins without evolutionary history. The approach—randomizing a stable de novo scaffold and screening with multiple substrates and cofactors—reveals unexpected solutions (e.g., truncation/dimerization) and supports Dayhoff’s model of early protein evolution. Future directions include: directed evolution to enhance kcat and specificity; structural elucidation of the active site, metal coordination, and topological isomers; exploration of metal preferences and pH dependence; generalization of the strategy to other reaction classes and scaffolds; and probing evolutionary trajectories from dynamic generalists to specialized enzymes.
Limitations
- Metal binding not quantifiable by ITC; the affinity/stoichiometry of Mn2+ binding remains undetermined.
- The active-site architecture and precise catalytic mechanism are not resolved at atomic detail; structural models suggest multiple topoisomers and significant conformational heterogeneity.
- Activity depends on reducing conditions; disulfide-linked dimers are inactive, complicating physiological relevance and mechanistic interpretation under oxidizing conditions.
- Although extensive controls argue against contamination, assays rely on recombinant expression in E. coli; absolute exclusion of trace endogenous activities is challenging.
- Catalytic efficiency (kcat and kcat/KM) remains far below specialized natural phosphodiesterases, indicating substantial headroom for improvement.
- MD simulations (10 × 100 ns) did not capture the slower timescale dynamics implied by NMR/CD; dynamic contributions to catalysis remain indirectly inferred.
- Initial screening used fluorogenic substrates and a metal mixture, which may bias discovery toward certain mechanisms or metal dependencies.