De novo generation of multi-target compounds using deep generative chemistry

B. P. Munson, M. Chen, et al.

POLYGON, developed by Brenton P. Munson and colleagues, uses generative reinforcement learning to design polypharmacology compounds that inhibit multiple protein targets. Of 32 synthesized MEK1/mTOR candidates, most reduced the activity of both kinases by more than 50% at 10 μM in A549 cells.

Introduction
The study addresses the challenge of designing single small molecules that potently modulate multiple protein targets (polypharmacology). Traditional drug discovery typically follows a one disease–one target–one drug paradigm, which is often insufficient for complex, polygenic diseases such as cancer and psychiatric disorders. Polypharmacology can offer advantages over combination therapies, including improved pharmacokinetics and safety, reduced resistance, and simpler formulations that improve patient compliance. However, rational design of multi-target compounds has been difficult, and most such compounds have been discovered serendipitously. Recent machine learning advances in target interaction prediction and de novo single-target molecule generation suggest an opportunity for systematic design of multi-target inhibitors. The central question is whether a generative reinforcement learning framework can design small molecules de novo that simultaneously inhibit two desired protein targets, and whether such compounds show the predicted binding and functional activity in cellular contexts.
Literature Review
Prior work demonstrates the promise and challenges of polypharmacology in oncology and other diseases. Examples include successful polypharmacology strategies in KRAS-mutant non-small cell lung cancer and the discovery of dual RET/VEGFR2 inhibitors, though identifying suitable scaffolds often required substantial resources. Computational methods have advanced de novo molecular design (e.g., VAEs and reinforcement learning approaches, evaluated with benchmarking frameworks such as MOSES and GuacaMol) and target interaction prediction (supported by data resources such as BindingDB and Pharos and by the IDG-DREAM drug–kinase binding challenge). There is growing evidence that multi-target drugs can outperform combinations because of pharmacokinetic and resistance benefits, but systematic, algorithmic discovery of dual inhibitors remains underdeveloped. Synthetic lethality mapping and resources like the Cancer Dependency Map motivate pairing targets for dual inhibition.
Methodology
POLYGON is a deep generative and reinforcement learning pipeline for de novo polypharmacology compound design.
Chemical embedding and VAE: A variational autoencoder (VAE) based on PyTorch (1.4.0) with GRU-RNNs encodes SMILES strings into a continuous chemical embedding and decodes embeddings back to valid molecules. SMILES are padded to length 100 and embedded into 128-d features (torch.nn.Embedding). Encoder: GRU (1 layer, 256-d output, dropout 0.2) followed by Linear (128-d) producing μz and σz. Latent sampling uses reparameterization z = μz + ε exp(σz/2), ε ~ N(0,1). Decoder: Linear (512-d) → GRU (3 layers, 512-d output) → Linear (55-d) → softmax to autoregressively generate SMILES characters. Training used ~1.27M ChEMBL molecules (ChEMBL 24) with 238,706 for validation, 200 epochs, Adam optimizer, batch size 1024, gradient clipping 50. The learned embedding was visualized by PCA. The VAE reconstructs held-out molecules and decodes random latent points into valid SMILES with high validity. Using 18,763 compound–target affinities across 24 kinase targets (BindingDB/Pharos), compounds sharing targets are significantly closer in embedding space; multiclass target prediction AUCs ranged 0.76–0.95 with mean accuracy 0.85 ± 0.05.
Reinforcement learning (RL) for dual-target optimization: POLYGON iteratively samples chemical space and refocuses on high-scoring regions. Each of 200 RL cycles samples 8192 latent coordinates, decodes them to molecules, and scores each molecule by six rewards: (r1, r2) predicted ligand efficiency/pIC50 against the two targets; (r3, r4) Euclidean distance in embedding to the 20 nearest known ligands for each target (from BindingDB/Pharos); (r5) synthetic accessibility (SA) score; (r6) drug-likeness (QED).
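Two of the numerical steps described above, the latent sampling z = μz + ε exp(σz/2) and the raw embedding-distance reward (r3, r4), can be illustrated with a minimal standard-library sketch. It is not the authors' PyTorch implementation. It assumes that σz is read as a log-variance (so exp(σz/2) is a standard deviation) and that the distances to the 20 nearest known ligands are aggregated by their mean; neither detail is stated in this summary. The function names, the toy 4-d latent space, and the random "known ligand" embeddings are hypothetical.

```python
import math
import random


def sample_latent(mu, sigma, rng):
    """Draw z = mu + eps * exp(sigma / 2) element-wise, with eps ~ N(0, 1).

    Assumption: sigma is treated as a log-variance, so exp(sigma / 2)
    acts as the standard deviation of each latent coordinate.
    """
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same length")
    return [m + rng.gauss(0.0, 1.0) * math.exp(s / 2.0) for m, s in zip(mu, sigma)]


def knn_distance(z, known_ligands, k=20):
    """Mean Euclidean distance from z to its k nearest known-ligand embeddings.

    Assumption: the k distances are combined by their mean.
    """
    if not known_ligands:
        raise ValueError("at least one known ligand embedding is required")
    distances = sorted(math.dist(z, ligand) for ligand in known_ligands)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Toy example in a 4-d latent space (hypothetical values throughout).
rng = random.Random(0)
mu = [0.0, 0.0, 0.0, 0.0]
sigma = [0.0, 0.0, 0.0, 0.0]  # log-variance 0 means unit variance
z = sample_latent(mu, sigma, rng)

known = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
raw_distance_reward = knn_distance(z, known, k=20)
print(len(z), raw_distance_reward >= 0.0)
```

A smaller distance means the sampled point lies closer to known ligands of the target, so this raw value would be treated as lower-is-better before any normalization.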
Rewards are normalized to [0,1] via half-Gaussian scaling (thresholds in Supplementary Table 1) and combined by averaging into a single score. The top 4096 molecules are then used to further fine-tune the VAE (2 additional epochs, batch size 512) and to define a refocused sampling subspace for the next RL cycle.
Compound–target scoring module: For target activity prediction, random forest regressors (scikit-learn, 1000 trees) predict ligand efficiency from 2048-bit Morgan fingerprints (radius 2). Training data comprised experimentally measured IC50 values from Pharos/BindingDB, with 1146 ligands for MEK1 and 5315 for mTOR. Ligand efficiency was computed from IC50 and heavy-atom count. Cross-validation assessed performance. The module was also evaluated via the IDG-DREAM drug–kinase challenge by retraining to predict Kd, achieving Spearman correlations of 0.46 and 0.45 in Rounds 1 and 2 (top ~10% of models).
Benchmarking polypharmacology prediction: To test recognition of dual activity in existing data, 109,811 BindingDB compounds assayed against exactly two targets were scored; predicted and observed dual activity (IC50 < 1 μM for both targets) were compared to compute accuracy and odds ratio across thresholds.
De novo generation and docking: POLYGON generated top-scoring compounds for ten synthetic-lethal protein pairs spanning serine/threonine kinases, tyrosine kinases, histone-binding proteins, and DNA-binding proteins. For each pair, the top 100 compounds were docked using AutoDock Vina (v1.1.2; exhaustiveness 8; max energy difference 8 kcal/mol) and UCSF Chimera (v1.16), using PDB crystal structures (MEK1-trametinib: 7M0Y; mTOR-FRB/FKBP12-rapamycin: 3FAP; PARP1-olaparib: 7KK4; BRD4-JQ1: 3MXF; CDK7: 6XD3; CDK9: 6Z45; CDK12: 7NXK; PRMT5: 6RLQ; ERBB2: 7PCD; FGFR3: 6LVM; TOP1: 1TL8). Search volumes encompassed the entire structure.
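The reward normalization and averaging step described in this section can be sketched as follows. This is an illustration, not the authors' code. "Half-Gaussian scaling" is interpreted here as scoring 1.0 at or beyond a threshold with a Gaussian fall-off on the unfavorable side, which is an assumption. The thresholds and widths in `REWARD_SPECS` are hypothetical placeholders (the actual values are in Supplementary Table 1, not reproduced here), as are the reward names and the toy candidates.

```python
import math


def half_gaussian(value, threshold, sigma, higher_is_better=True):
    """Scale a raw property to [0, 1].

    Returns 1.0 when the value meets the threshold, otherwise decays as a
    Gaussian of the gap between the value and the threshold.
    """
    gap = (threshold - value) if higher_is_better else (value - threshold)
    if gap <= 0:
        return 1.0
    return math.exp(-0.5 * (gap / sigma) ** 2)


# Hypothetical (threshold, sigma, higher_is_better) settings for the six rewards.
REWARD_SPECS = {
    "le_target_1": (0.4, 0.1, True),     # predicted ligand efficiency, target 1
    "le_target_2": (0.4, 0.1, True),     # predicted ligand efficiency, target 2
    "dist_target_1": (5.0, 2.0, False),  # embedding distance to known ligands
    "dist_target_2": (5.0, 2.0, False),
    "sa_score": (3.0, 1.0, False),       # lower SA score = easier synthesis
    "qed": (0.6, 0.2, True),             # drug-likeness
}


def combined_score(raw_rewards):
    """Average the six scaled rewards into a single score in [0, 1]."""
    scaled = [half_gaussian(raw_rewards[name], *spec) for name, spec in REWARD_SPECS.items()]
    return sum(scaled) / len(scaled)


def select_top(molecules, k):
    """Keep the k highest-scoring molecules (4096 per cycle in the text)."""
    return sorted(molecules, key=lambda m: combined_score(m["rewards"]), reverse=True)[:k]


# Toy candidates with made-up raw reward values.
candidates = [
    {"id": "poor", "rewards": {"le_target_1": 0.1, "le_target_2": 0.1, "dist_target_1": 10.0,
                               "dist_target_2": 10.0, "sa_score": 6.0, "qed": 0.2}},
    {"id": "good", "rewards": {"le_target_1": 0.45, "le_target_2": 0.42, "dist_target_1": 4.0,
                               "dist_target_2": 4.5, "sa_score": 2.5, "qed": 0.7}},
    {"id": "mixed", "rewards": {"le_target_1": 0.45, "le_target_2": 0.2, "dist_target_1": 4.0,
                                "dist_target_2": 4.5, "sa_score": 2.5, "qed": 0.7}},
]
top = select_top(candidates, k=2)
print([m["id"] for m in top])
```

Averaging gives every reward equal weight, so a molecule that is strong on one target but weak on the other (the "mixed" candidate) is penalized relative to one that satisfies all six criteria.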
Synthesis and experimental validation: From the top 100 MEK1/mTOR candidates, 32 were synthesized (Bioblocks Inc.), prioritizing minimal synthetic steps (anilines enriched). A549 lung cancer cells (ATCC) were used for viability assays (CellTiter-Glo, 72 h), dose–response IC50 estimation, and polypharmacology validation via capillary westerns measuring phospho-P70 S6 kinase (mTOR activity) and phospho-ERK (MEK1 activity) after 3 h exposure at 1 μM and 10 μM. Positive controls: trametinib (MEK inhibitor) and MK-8669 (mTOR inhibitor); negative controls: two ChemBridge compounds. Synergy of single-target inhibitors was confirmed using the Loewe model. Cell-free kinase inhibition was measured with HotSpot assays for mTOR/FRB12 and MEK1. Off-target profiling included westerns for PDK1, ATR, RAF and a 371-kinase panel screen for IDK12038 (1 μM ATP, 10 μM compound).
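The dose–response IC50 estimation mentioned above is commonly done by fitting a Hill-type curve to viability measurements. The sketch below shows the idea under simplifying assumptions: the summary does not say which curve model or fitting routine the authors used, the top and bottom plateaus are fixed at 1 and 0, the Hill slope is fixed, and the fit is a coarse log-spaced grid search rather than a proper nonlinear regression. All function names and data values are hypothetical.

```python
def hill_viability(conc, ic50, hill=1.0, top=1.0, bottom=0.0):
    """Fractional viability at a given concentration (same units as ic50)."""
    if conc < 0 or ic50 <= 0:
        raise ValueError("concentration must be >= 0 and ic50 must be > 0")
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def estimate_ic50(concs, responses, hill=1.0):
    """Grid-search the IC50 (0.001 to 100, log-spaced) minimizing squared error."""
    grid = [10 ** (-3 + 0.01 * i) for i in range(501)]

    def squared_error(ic50):
        return sum((hill_viability(c, ic50, hill) - r) ** 2 for c, r in zip(concs, responses))

    return min(grid, key=squared_error)


# Synthetic, noise-free example: viabilities generated with a true IC50 of 2.0 (in μM).
doses_um = [0.01, 0.1, 1.0, 10.0, 100.0]
viabilities = [hill_viability(c, ic50=2.0) for c in doses_um]
fitted_ic50 = estimate_ic50(doses_um, viabilities)
print(round(fitted_ic50, 2))
```

With real assay data, a least-squares fit that also estimates the plateaus and slope would normally replace the grid search.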
Key Findings
- POLYGON’s compound–dual-target prediction recognized polypharmacology with high accuracy in BindingDB: accuracy 81.9% at IC50 < 1 μM with odds ratio 21.3 (p = 2.2×10^-16); the abstract reports 82.5% accuracy.
- De novo generation for ten synthetic-lethal protein pairs produced top-100 candidate sets per pair whose docking ΔG values were significantly more favorable than those of random BindingDB ligands (p < 1×10^-5 for each pair). The mean ΔG shift across pairs was −1.09 kcal/mol (one-sided t test, t = −4.285; DOF = 7146; p = 9.25×10^-5; 95% CI −1.21 to −0.98).
- For MEK1/mTOR, docking recapitulated canonical inhibitor poses: trametinib docked to MEK1 with ΔG −9.2 kcal/mol versus −7.4 in mTOR; rapamycin docked to mTOR with ΔG −8.6 versus −3.7 in MEK1. POLYGON compound IDK12008 docked favorably to both MEK1 (ΔG −8.4) and the mTOR complex (ΔG −9.3), with orientations similar to those of the canonical ligands.
- Of 32 synthesized MEK1/mTOR candidates, most reduced both kinase activities by >50% at 10 μM and decreased A549 cell viability by >50% in the 1–10 μM range. Reductions in mTOR and MEK1 activity correlated with growth inhibition (Pearson correlation 0.47 for mTOR, p = 0.0032; 0.45 for MEK1, p = 0.0049; n = 32).
- Four lead compounds (IDK12008, IDK12038, IDK12058, IDK12065) reduced phosphorylation of both targets by >50% at 1 μM in cells (3 replicates; p < 0.05).
- Cell-free assays confirmed inhibitory activity of IDK12008 against MEK1 and mTOR at concentrations consistent with cell-based observations.
- Off-target assessments showed minimal inhibition: >50% activity was retained for 330/371 kinases for IDK12038, and the representative kinases PDK1, ATR, and RAF were largely unaffected, with one instance of a 38% ATR reduction by IDK12065. Promiscuity was comparable to that of approved kinase inhibitors.
- The POLYGON scoring component ranked in the top ~10% of models in the IDG-DREAM drug–kinase binding challenge (Spearman 0.46/0.45). An external top-performing model (AIWIC) predicted Kd < 1 μM for one target for most IDK compounds and for both targets in ~20% of IDK compounds.
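The accuracy and odds ratio reported for the BindingDB benchmark are standard summaries of a 2×2 table of predicted versus observed dual activity. The sketch below shows how such values are computed, using the IC50 < 1 μM cutoff for both targets from the text. The function names and the six toy compounds are hypothetical and do not reproduce the study's data or its significance test.

```python
def is_dual_active(ic50_a_um, ic50_b_um, cutoff_um=1.0):
    """A compound counts as dual-active when both IC50 values are below the cutoff."""
    return ic50_a_um < cutoff_um and ic50_b_um < cutoff_um


def dual_activity_metrics(predicted, observed):
    """Accuracy and odds ratio from paired boolean predictions and observations."""
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    tn = sum(1 for p, o in zip(predicted, observed) if not p and not o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # The odds ratio is undefined when an off-diagonal cell is empty; report infinity.
    odds_ratio = (tp * tn) / (fp * fn) if fp and fn else float("inf")
    return {"accuracy": accuracy, "odds_ratio": odds_ratio}


# Toy measured IC50 pairs in μM (hypothetical) and hypothetical model calls.
measured = [(0.2, 0.5), (0.05, 3.0), (2.0, 4.0), (0.8, 0.9), (10.0, 0.1), (5.0, 7.0)]
observed = [is_dual_active(a, b) for a, b in measured]
predicted = [True, False, False, False, True, False]

metrics = dual_activity_metrics(predicted, observed)
print(metrics)
```

An odds ratio above 1 means that compounds predicted to be dual-active are more likely to be observed as dual-active than compounds predicted otherwise.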
Discussion
The work demonstrates that a generative reinforcement learning framework can systematically produce small molecules predicted to inhibit two targets simultaneously and that selected outputs display favorable docking and experimental activity. By embedding chemical space with a VAE and iteratively refocusing sampling using multi-objective rewards (dual-target potency, proximity to known ligands, synthesizability, and drug-likeness), POLYGON efficiently explores chemical regions likely to yield dual inhibitors. Validations across ten synthetic-lethal target pairs show consistently favorable docking energies compared to random ligands, supporting target engagement predictions. Experimental synthesis and testing of MEK1/mTOR candidates confirmed dual pathway inhibition and associated reductions in cell viability, with correlation between molecular (phosphorylation) and cellular outcomes. These findings support the feasibility and potential of generative AI to accelerate early-stage polypharmacology design, providing a starting point for medicinal chemistry optimization and suggesting a complementary approach to combination therapies with potential benefits in pharmacokinetics, resistance, and patient adherence.
Conclusion
POLYGON provides an end-to-end pipeline for de novo design, selection, synthesis, and validation of dual-target small molecules. It achieves high accuracy in recognizing dual activity in existing datasets, generates candidates with favorable docking profiles across diverse synthetic-lethal target pairs, and yields experimentally validated MEK1/mTOR inhibitors with dual pathway inhibition and growth suppression in cells. Future work should integrate ADMET optimization, include negative rewards for predicted off-target activities, incorporate structural information of both intended and unintended targets to further improve potency and selectivity, and adopt iterative SAR-driven retraining to enhance efficacy. The approach can be extended to additional genetic dependencies revealed by large-scale functional genomic and chemogenomic screens.
Limitations
POLYGON, as presented, focuses on early-stage design without explicit optimization for ADMET properties. Off-target effects cannot be fully excluded: kinome profiling indicates some promiscuity, although comparable to approved kinase inhibitors, and limited representative kinase testing revealed a modest ATR effect for one compound. Differences between cell-based and biochemical potencies (e.g., for mTOR) suggest potential cellular context effects or off-target contributions. Docking simulations searched entire protein structures and provide indirect evidence of binding; experimental structural confirmation was not performed. Training data and model performance may be biased toward well-characterized kinases with abundant ligand data. The synthesis set (32 compounds) was constrained by synthetic feasibility, potentially limiting scaffold diversity.