Introduction
Messenger RNA (mRNA) is a versatile tool with applications in vaccines, protein replacement therapy, and gene editing. However, mRNA's instability necessitates effective delivery systems, primarily ionizable lipid nanoparticles (LNPs). The success of LNP-based mRNA vaccines against SARS-CoV-2 underscores the importance of this technology. LNPs consist of ionizable lipids, cholesterol, helper lipids, and PEGylated lipids; the ionizable lipid plays a pivotal role in mRNA encapsulation and intracellular release. The need for diverse LNPs tailored to various cell types and tissues is growing as mRNA therapy expands beyond vaccines. While combinatorial chemistry allows high-throughput synthesis of lipid libraries, screening these libraries remains challenging and time-consuming. Deep learning offers a promising solution to explore the vast chemical space and predict the efficacy of different lipid structures. This study introduces the AI-Guided Ionizable Lipid Engineering (AGILE) platform, which leverages deep learning to significantly accelerate the development of ionizable lipids for mRNA delivery.
Literature Review
Previous research has explored the rational design of ionizable lipids for improved mRNA delivery. However, these studies often cover limited structural space. Combinatorial chemistry using multi-component reactions, such as the Ugi reaction, has been employed to synthesize diverse lipid libraries for high-throughput screening (HTS). For example, Ugi-based three-component reactions (3-CR) have been used to create libraries leading to the identification of lipids suitable for mRNA vaccine delivery and for efficient mRNA delivery to the lung epithelium. Despite the success of combinatorial chemistry, creating and testing extensive libraries (hundreds of thousands of compounds) remains a major hurdle. Deep learning, a subset of artificial intelligence (AI), has emerged as a powerful tool for exploring molecular search spaces. It can extract insights from molecular structures and properties to predict the efficacy of novel compounds, transforming chemical discovery from a trial-and-error process into an intelligent, data-driven strategy. Many studies have demonstrated the potential of deep learning in various aspects of drug discovery, including predicting the properties and activities of novel compounds and guiding the design of novel molecules with desired properties.
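To make the scale problem concrete, the sketch below enumerates a combinatorial library from three building-block pools, as in a three-component Ugi reaction (amine, carbonyl, and isocyanide components). The pool names and sizes are hypothetical placeholders, not values from the study. The point is only that library size is the product of the pool sizes, so it grows quickly.

```python
from itertools import product

def enumerate_library(amines, carbonyls, isocyanides):
    """Return every (amine, carbonyl, isocyanide) combination as a tuple.

    Each tuple stands for one hypothetical three-component product;
    no chemistry is simulated here, only the combinatorics.
    """
    return list(product(amines, carbonyls, isocyanides))

def library_size(n_amines, n_carbonyls, n_isocyanides):
    """Library size is the product of the three pool sizes."""
    return n_amines * n_carbonyls * n_isocyanides

# Hypothetical building-block labels (assumptions for illustration only).
amines = [f"A{i}" for i in range(1, 4)]
carbonyls = [f"C{i}" for i in range(1, 5)]
isocyanides = [f"I{i}" for i in range(1, 3)]

library = enumerate_library(amines, carbonyls, isocyanides)
print(len(library), "combinations, first:", library[0])
```

With pools of a few dozen components each, the same product reaches tens or hundreds of thousands of structures, which is why exhaustive synthesis and testing become impractical.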
Methodology
The AGILE platform integrates deep learning with high-throughput combinatorial lipid synthesis. It comprises three stages: (1) virtual library development and self-supervised model training; (2) model refinement through supervised fine-tuning with experimental data; and (3) in silico analysis of a candidate library. Stage 1 involves creating a virtual library of 60,000 chemically diverse lipids and pre-training a graph neural network (GNN) using contrastive learning. The GNN, initialized with parameters from a pre-trained model (MolCLR), learns to represent lipid structures effectively. Stage 2 involves synthesizing an experimental library of 1,200 lipids using a one-pot Ugi 3-CR method and evaluating their mRNA transfection potency (mTP) in HeLa cells. The mTP data are used to fine-tune the GNN through supervised learning, with an added molecular descriptor encoder that processes descriptors computed by the Mordred toolbox. The integrated model learns the relationship between molecular structure and mTP. Stage 3 involves creating a candidate library of 12,000 lipids and using the fine-tuned model to predict the mTP of each lipid. A head- and tail-wise ranking method is then used to prioritize structurally diverse candidates, and the top-ranked lipids are synthesized and experimentally validated.
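The summary does not spell out how the head- and tail-wise ranking works, so the following is a minimal sketch of one plausible reading, not the authors' procedure. It assumes each candidate is described by a head group, two tails, and a predicted mTP, and it keeps the best-scoring candidates within each head group and within each tail pair, so that no single structural motif dominates the shortlist. The `Candidate` type, the function names, and all example values are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, NamedTuple

class Candidate(NamedTuple):
    head: str             # hypothetical head-group label
    tail1: str            # hypothetical label for the first tail
    tail2: str            # hypothetical label for the second tail
    predicted_mtp: float  # model-predicted transfection potency

def top_per_group(cands: List[Candidate],
                  key: Callable[[Candidate], Hashable],
                  top_k: int = 1) -> List[Candidate]:
    """Keep the top_k highest-scoring candidates within each group."""
    groups = defaultdict(list)
    for c in cands:
        groups[key(c)].append(c)
    picked = []
    for members in groups.values():
        members.sort(key=lambda c: c.predicted_mtp, reverse=True)
        picked.extend(members[:top_k])
    return picked

def head_and_tail_wise_selection(cands: List[Candidate],
                                 top_k: int = 1) -> List[Candidate]:
    """Union of per-head and per-tail-pair winners, best score first."""
    by_head = top_per_group(cands, lambda c: c.head, top_k)
    by_tail = top_per_group(cands, lambda c: (c.tail1, c.tail2), top_k)
    unique = set(by_head) | set(by_tail)  # drop candidates picked twice
    return sorted(unique, key=lambda c: c.predicted_mtp, reverse=True)

# Made-up predictions, for illustration only.
demo = [
    Candidate("H1", "T1", "T1", 0.90),
    Candidate("H1", "T2", "T1", 0.80),
    Candidate("H1", "T3", "T1", 0.85),
    Candidate("H2", "T1", "T1", 0.50),
    Candidate("H2", "T2", "T1", 0.70),
]
shortlist = head_and_tail_wise_selection(demo)
for c in shortlist:
    print(c)
```

Compared with taking a global top-N by predicted score, grouping first guarantees that every head group and every tail pair in the candidate library contributes at least one entry, at the cost of admitting some lower-scoring structures.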
Key Findings
AGILE identified H9, an ionizable lipid that significantly outperformed industry-standard benchmarks (ALC-0315 and MC3) in mRNA delivery efficiency in both in vitro and in vivo (intramuscular injection) experiments. H9 demonstrated remarkable tissue specificity, exhibiting significantly lower mRNA expression in the liver and spleen compared to the benchmarks. In vivo experiments using Cre recombinase mRNA and mTmG reporter mice confirmed the muscle-selective transfection of H9 LNPs. Furthermore, vaccination studies showed comparable anti-OVA IgG titers for H9 and ALC-0315 LNPs, but H9 resulted in lower ALT/AST serum levels, suggesting reduced hepatotoxicity. The model's predictive power was supported by the weaker experimental performance of lower-ranked candidates. AGILE's adaptability was shown by its application to macrophage cells (RAW 264.7). Fine-tuning with macrophage mTP data allowed AGILE to identify R6, a lipid optimized for macrophage transfection. R6 LNPs showed a fivefold increase in transfection efficiency compared to H9 and MC3 LNPs in RAW 264.7 cells. Comparative studies highlighted the cell-type specificity of H9 and R6 LNPs. Model interpretation revealed that descriptors such as VSA_EState3 (electronic and steric properties) and SssNH (tertiary amine) were influential for HeLa cells, while SpDiam_Dzi and VR3_D were important for RAW 264.7 cells. The analysis also highlighted the importance of Tail 1 length for macrophage transfection.
Discussion
AGILE successfully addressed the limitations of traditional approaches to ionizable lipid development by combining deep learning with combinatorial chemistry and high-throughput screening. The self-supervised pre-training phase enhanced the model's ability to learn generalizable features of lipid structures. The high-throughput synthesis and screening pipeline enabled the generation of large datasets for model training and validation. AGILE's ability to predict cell-specific preferences for ionizable lipids is crucial for developing tailored LNPs for various applications. The identification of H9 as a superior lipid for muscle-specific mRNA delivery, along with its reduced hepatotoxicity compared to ALC-0315, has significant implications for vaccine development and other therapeutic applications. The discovery of R6 as a highly effective lipid for macrophage transfection opens new avenues for mRNA delivery to immune cells. The model interpretation provided valuable insights into the structure-activity relationships of ionizable lipids, guiding future design efforts.
Conclusion
AGILE represents a significant advancement in accelerating ionizable lipid development for mRNA delivery. The platform's success in identifying high-performing and cell-specific lipids highlights the power of integrating deep learning with combinatorial chemistry. Future work should focus on expanding the training datasets to include in vivo data and a wider range of combinatorial chemistry methods, and on incorporating advanced generative models to design novel lipids with specific functionalities.
Limitations
The current AGILE model is not generative and relies on pre-existing libraries for prediction. Its accuracy depends on the quality and diversity of the training data, which currently consist of crude ionizable lipids from Ugi reactions. Because high-throughput screening uses a standard formulation ratio before any individual lipid optimization, effective lipid candidates may be overlooked. Further research is needed to bridge the gap between in vitro and in vivo performance. Although results in HeLa cells correlated with those in muscle cells during initial screening, further investigation with different cell types is needed. The model's generalizability to lipids outside the training set should also be investigated further.