Introduction
Precise single-nucleotide alteration is crucial for gene editing in biological research and therapeutics. Cytosine base editors (CBEs) and adenine base editors (ABEs) enable C-to-T and A-to-G conversions, respectively. However, they cannot perform C-to-G or A-to-T transversions, which are needed to correct approximately 40% of human pathogenic point mutations. Recent studies demonstrated that replacing the uracil-DNA glycosylase inhibitor (UGI) in a CBE with uracil-DNA glycosylase (UNG) allows C-to-G transversion, but high efficiency was achieved only at specific sites, and no general rules were available for predicting which sites would edit well. This study aimed to improve C-to-G transversion efficiency by optimizing CBE design. By changing the species origin of UNG, changing the relative position of UNG and the deaminase, and optimizing codons, the researchers sought to create highly efficient and precise C-to-G base editors.
Literature Review
Existing cytosine base editors (CBEs) and adenine base editors (ABEs) efficiently convert C-to-T and A-to-G, respectively, but cannot perform the C-to-G or A-to-T transversions needed to correct a substantial portion (approximately 40%) of human pathogenic point mutations. Recent research has explored using uracil-DNA glycosylase (UNG) in place of the uracil-DNA glycosylase inhibitor (UGI) within CBEs to enable C-to-G conversions. However, these initial C-to-G base editors (CGBEs) showed high efficiency at only a limited number of target sites, and there were no comprehensive rules for predicting where editing would be efficient. This study builds upon these findings, aiming to enhance the efficiency and predictability of C-to-G base editing.
Methodology
The researchers started by comparing the efficiency of C-to-G base editing using UNGs from various species (human, *E. coli*, mouse, *C. elegans*) to replace UGI in BE3. They identified the *E. coli* and *C. elegans* UNGs as superior. To reduce off-target effects, the mutations W90Y and R126E (YE1) were introduced into the rAPOBEC1 module of the CGBEs. Further optimization included adding a nuclear localization signal peptide and codon optimization for human cells. The position of the UNG domain was also modified, fusing it to the N-terminus rather than the C-terminus, resulting in improved variants: FNLS-eUNG-YE1-CGBE (eOPTI-CGBE) and FNLS-cUNG-YE1-CGBE (cOPTI-CGBE). The editing efficiency and purity of these optimized CGBEs were evaluated at 34 endogenous sites in HEK293T cells, and off-target effects were assessed using GOTI and RNA-seq. An sgRNA library (41,388 sequences) was used to determine motif preferences and to build a deep-learning model (CGBE-SMART) that predicts editing outcomes. The model's accuracy was tested using an independent test dataset. Finally, the efficiency of the optimized CGBEs was evaluated in mouse embryos, with phenotypic analysis of Tyr-edited offspring. Supporting methods included plasmid construction and cloning; cell culture, transfection, and FACS; lentivirus production and transduction; in vitro transcription of mRNA and sgRNA; zygote or two-cell injection and embryo transplantation; targeted sequencing of endogenous sites; whole-genome sequencing (WGS) and RNA-seq; and statistical analysis. The CGBE-SMART model was implemented in Python with PyTorch, using deep-learning techniques to predict base editing efficiency.
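To make the two evaluation metrics concrete, the following is a minimal sketch of how editing efficiency and purity could be computed from sequencing reads at one target cytosine. It assumes that efficiency is the fraction of all reads carrying C-to-G, and that purity is the fraction of C-to-G among all reads in which the target C was changed; the summary does not give the study's exact definitions. The function name `c_edit_summary`, the input format, and the read counts are hypothetical and purely illustrative.

```python
from collections import Counter


def c_edit_summary(reads, position):
    """Summarize base outcomes at one target cytosine.

    reads:    aligned read sequences (hypothetical input format)
    position: 0-based index of the target C within each read
    """
    counts = Counter(read[position] for read in reads)
    total = sum(counts.values())
    edited = total - counts.get("C", 0)  # reads where the C was changed
    c_to_g = counts.get("G", 0)

    # Assumed definitions (not taken from the study):
    #   efficiency = C-to-G reads / all reads
    #   purity     = C-to-G reads / all edited reads
    efficiency = c_to_g / total if total else 0.0
    purity = c_to_g / edited if edited else 0.0
    return {"efficiency": efficiency, "purity": purity, "counts": dict(counts)}


# Made-up reads: 50 unedited, 40 C-to-G, 8 C-to-T, 2 C-to-A
reads = ["ACA"] * 50 + ["AGA"] * 40 + ["ATA"] * 8 + ["AAA"] * 2
summary = c_edit_summary(reads, 1)
print(summary["efficiency"], summary["purity"])
```

Under these assumed definitions, a site can have modest efficiency but high purity if most of the edits that do occur are the intended C-to-G product.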
Key Findings
The optimized CGBEs (eOPTI-CGBE and cOPTI-CGBE) demonstrated significantly higher C-to-G transversion efficiency than previous CGBEs and prime editors. Off-target effects at both the DNA and RNA levels were substantially reduced compared to BE3. Motif analysis revealed a preference for "WCW" (W = A or T) motifs for eOPTI-CGBE and cOPTI-CGBE. Editors built on other deaminases (mutated human APOBEC3A and APOBEC3G variants) showed preferences for "TCW" and "CCN" motifs, respectively. A deep-learning model, CGBE-SMART, was developed to predict C-to-G editing efficiency and outcomes from sequence context, and it outperformed existing models (BE-Hive and DeepCBE) in most cases. In mouse embryo experiments, OPTI-CGBEs achieved high C-to-G editing efficiency, especially when injected into two-cell stage embryos. Tyr gene editing in mouse embryos resulted in observable changes in coat color, confirming the effectiveness of OPTI-CGBEs in vivo. CGBE-SMART's predictions correlated with measured outcomes at both exogenous (artificial library) and endogenous (natural genomic) target sites, with correlation coefficients ranging from 0.47 to 0.60, suggesting that the model generalizes beyond the library data.
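As an illustration of the motif notation, the sketch below checks which of the reported motifs ("WCW", "TCW", "CCN") match the trinucleotide context around each cytosine in a sequence, using the standard IUPAC codes W = A or T and N = any base. It is a simple pattern matcher only, not the CGBE-SMART model; the function names and the example sequence are hypothetical.

```python
# Subset of IUPAC nucleotide codes needed for the motifs in the text
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "N": "ACGT"}

MOTIFS = ("WCW", "TCW", "CCN")


def matches_motif(context, motif):
    """Return True if a trinucleotide context fits an IUPAC motif."""
    if len(context) != len(motif):
        return False
    return all(base in IUPAC[code] for base, code in zip(context.upper(), motif))


def target_contexts(sequence):
    """Yield (index, trinucleotide) for every C that has both neighbors."""
    seq = sequence.upper()
    for i in range(1, len(seq) - 1):
        if seq[i] == "C":
            yield i, seq[i - 1:i + 2]


def classify(sequence):
    """Map each internal C position to the list of motifs it matches."""
    return {
        i: [m for m in MOTIFS if matches_motif(ctx, m)]
        for i, ctx in target_contexts(sequence)
    }


# Made-up example sequence
result = classify("GACATCCTG")
print(result)
```

Note that the motifs overlap: a context such as "TCA" satisfies both "WCW" and "TCW", so a single site can fall within the preferred context of more than one editor.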
Discussion
This study successfully engineered highly efficient and precise C-to-G base editors with significantly reduced off-target effects. The identification of sequence context preferences for different deaminases provides valuable insights into the mechanisms of C-to-G base editing. The development of CGBE-SMART, a highly accurate deep-learning model, enables precise prediction of editing outcomes based on sequence context, improving the efficiency and design of future gene editing experiments. The successful application of OPTI-CGBEs in mouse embryos demonstrates their potential for generating genome-edited animal models. However, the study acknowledges that factors beyond sequence context, such as epigenetic regulation and chromatin accessibility, might also influence editing efficiency. Future research should investigate these factors to further enhance the predictability and efficiency of C-to-G base editing.
Conclusion
This research presents optimized C-to-G base editors (OPTI-CGBEs) with high efficiency, low off-target effects, and predictable editing outcomes based on sequence context. The development of the CGBE-SMART prediction model significantly aids in the design of efficient gene-editing experiments. The successful application in mouse embryos highlights the potential of OPTI-CGBEs for generating genome-edited animal models. Future research could focus on further refining the predictive model by incorporating additional factors, such as epigenetic modifications, to enhance its accuracy and broaden its applicability.
Limitations
Although CGBE-SMART demonstrated good predictive accuracy, *in vivo* factors such as epigenetic regulation, chromatin accessibility, and DNA repair pathways can also influence editing efficiency, and the model does not account for them. The model's accuracy might also be affected by variations in experimental conditions. The study focused on specific deaminases and might not fully capture the complexity of all possible sequence context effects on C-to-G base editing. The mouse embryo studies primarily targeted the Tyr gene; broader application to other genes requires further investigation.