Discovery of senolytics using machine learning
V. Smer-Barreto, A. Quintanilla, et al.
This study, by a team including Vanessa Smer-Barreto and Andrea Quintanilla, uses machine learning to identify three compounds with senolytic activity (ginkgetin, periplocin, and oleandrin), of which oleandrin was the most potent.
Introduction
The study addresses the challenge of discovering senolytics: agents that selectively eliminate senescent cells, which are implicated in ageing, cancer, and other diseases. Although senescence can have beneficial roles, the accumulation of senescent cells contributes to pathology via the senescence-associated secretory phenotype (SASP) and other mechanisms. Few senolytics are known, and many show cell-type specificity or toxicity, limiting their therapeutic utility. Traditional target-based approaches are constrained by limited knowledge of senescence pathways. The authors propose a target-agnostic, machine learning-driven discovery approach that leverages heterogeneous published data to identify novel senolytics cost-effectively. They aim to train models on known senolytics versus diverse assumed negatives, screen large libraries in silico, and validate top hits experimentally across multiple senescence modalities.
Literature Review
Prior senolytics include Bcl-2 family inhibitors (navitoclax, ABT-737), cardiac glycosides (ouabain, digoxin), and BET inhibitors (ARV825, JQ1), often discovered via target-based or phenotypic screens. Many display cell-type specificity and toxicity. AI/ML has impacted drug discovery in bioactivity prediction, virtual screening, target identification, and generative design, with models using fingerprints, learned molecular representations, and phenotypic readouts. In ageing research, ML has been used to find geroprotectors and anti-senescence compounds (e.g., morphology-based CNNs), and bioinformatics has aided target identification. The authors position their work as a phenotypic, target-agnostic ML approach using classic physicochemical descriptors to exploit small, heterogeneous published datasets for senolytic discovery.
Methodology
Data assembly: The training set comprised 2523 compounds: 58 positives (known senolytics from 15 literature sources and a patent) and 2465 assumed negatives from two diverse libraries (LOPAC-1280 and Prestwick FDA-approved-1280). Positives were curated to ensure at least ~60% viability in normal cells with selective elimination of senescent cells in at least one context. Chemical structures were represented as SMILES; 200 RDKit physicochemical descriptors were computed per compound and z-score normalised. Feature selection via random forest Gini importance reduced features to 165. Training data diversity was assessed with k-means (no clear elbow; low silhouette), Tanimoto distance graph (median distance 0.77), and Louvain community detection with adjusted Rand index showing clusters do not reflect source labels, supporting diversity.
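The two numerical steps in this paragraph, z-score normalisation of a descriptor and Tanimoto distance between compounds, can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the descriptor values and fingerprints below are hypothetical toy data, and representing each compound as a set of "on" fingerprint bits is an assumption made only for illustration.

```python
from itertools import combinations
from statistics import mean, median, pstdev


def zscore(values):
    """Standardise one descriptor column to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def tanimoto_distance(bits_a, bits_b):
    """Return 1 - |A & B| / |A | B| for two sets of 'on' fingerprint bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return 1.0 - len(bits_a & bits_b) / len(union)


# Hypothetical toy data: one descriptor column and three tiny fingerprints.
mol_weight = [180.2, 250.3, 410.5, 576.7]
fingerprints = {
    "cpd_a": {1, 4, 7, 9},
    "cpd_b": {1, 4, 8},
    "cpd_c": {2, 3, 5, 6},
}

scaled = zscore(mol_weight)
distances = [
    tanimoto_distance(fingerprints[a], fingerprints[b])
    for a, b in combinations(sorted(fingerprints), 2)
]

print([round(z, 2) for z in scaled])
print("median pairwise Tanimoto distance:", round(median(distances), 2))
```

A median pairwise distance close to 1 means most compound pairs share few features, which is the sense in which the reported median of 0.77 supports the diversity of the training set.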
Model training and selection: Binary classifiers (SVM, Random Forest) were trained and evaluated with 5-fold cross-validation on the full dataset (165 features) using precision, recall, and F1, mindful that accuracy is unsuitable for imbalanced data. Due to the cost of false positives, models with higher precision were preferred. An ensemble gradient boosting model (XGBoost) was tuned (max_depth via CV) to improve performance, outperforming SVM/RF and a message-passing neural network baseline. Class imbalance handling included class weights and attempts with SMOTE (no improvement). Confusion matrices were generated using a 70/30 stratified split for SVM, RF, and XGBoost. The authors note potential optimism in metrics because feature selection and CV used the full dataset, but prioritised robust model selection for downstream screening.
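The remark that accuracy is unsuitable for imbalanced data can be made concrete with the class counts given in this summary (58 positives, 2465 negatives). The sketch below is illustrative only: the "all negative" classifier is a thought experiment, and the second set of confusion-matrix counts is hypothetical rather than taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


N_POS, N_NEG = 58, 2465  # class counts reported for the training set

# Thought experiment: a classifier that labels every compound as negative.
accuracy_all_negative = N_NEG / (N_POS + N_NEG)
metrics_all_negative = precision_recall_f1(tp=0, fp=0, fn=N_POS)

# Hypothetical counts for a more useful classifier (not from the paper).
metrics_example = precision_recall_f1(tp=20, fp=8, fn=38)

print(f"all-negative accuracy: {accuracy_all_negative:.3f}")
print("all-negative precision/recall/F1:", metrics_all_negative)
print("example precision/recall/F1:",
      tuple(round(m, 3) for m in metrics_example))
```

The trivial classifier reaches about 97.7% accuracy while finding no senolytics at all, which is why precision, recall, and F1 are the more informative metrics here.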
Computational screening: The final XGBoost model (trained on 70% of the data) screened two external libraries totalling 4340 compounds (TargetMol L2100 Anticancer and Selleck L3800 FDA-approved & Passed Phase). Descriptors were computed for the same features used in training. The score distribution was highly selective with a long tail: 21 compounds (≈0.5%) exceeded a predicted probability of P > 44% (≥8 SD above the bulk) and were selected for experimental validation. t-SNE visualisation indicated strong overlap between the training and screening chemical spaces, suggesting that screening stayed within the model’s applicability domain.
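The hit-selection step can be sketched as a simple threshold on predicted probabilities, followed by a check of how far each hit sits above the bulk of the scores. This is a minimal sketch under stated assumptions: the scores are synthetic, the compound names are hypothetical, and treating "the bulk" as all scores at or below the threshold is an assumption, since the summary does not define it.

```python
from statistics import mean, pstdev


def select_hits(scores, threshold):
    """Return compound names scoring above the threshold, best first."""
    hits = [name for name, p in scores.items() if p > threshold]
    return sorted(hits, key=scores.get, reverse=True)


def sd_above_bulk(score, bulk_scores):
    """Express a score as a number of standard deviations above the bulk mean."""
    return (score - mean(bulk_scores)) / pstdev(bulk_scores)


# Synthetic predicted probabilities for hypothetical compounds.
toy_scores = {
    "cpd_1": 0.01, "cpd_2": 0.02, "cpd_3": 0.03,
    "cpd_4": 0.02, "cpd_5": 0.01, "cpd_6": 0.03,
    "cpd_7": 0.47, "cpd_8": 0.62,
}
THRESHOLD = 0.44  # the P > 44% cut-off quoted in the text

hits = select_hits(toy_scores, THRESHOLD)
# Assumption: the bulk is every score at or below the threshold.
bulk = [p for p in toy_scores.values() if p <= THRESHOLD]
z = {name: sd_above_bulk(toy_scores[name], bulk) for name in hits}

for name in hits:
    print(f"{name}: P = {toy_scores[name]:.2f}, {z[name]:.1f} SD above bulk")
```

With a long-tailed score distribution like the one described, a high threshold keeps only the few compounds that are clearly separated from the bulk, which is what keeps the experimental follow-up small.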
Experimental validation: Two senescence models were used. Oncogene-induced senescence (OIS): IMR90 ER:RAS human fibroblasts induced with 100 nM 4-hydroxytamoxifen; controls were IMR90 ER:STOP. Therapy-induced senescence (TIS): A549 epithelial cancer cells treated with 100 μM etoposide for 48 h, then 3 days recovery. Compounds were tested in 384-well plates with DMSO controls and 10-point half-log dose ranges from 10 μM to 10 nM, in triplicate; after 72 h, cells were fixed and Hoechst-stained; automated imaging counted nuclei as a proxy for survival. Positive controls: ouabain (OIS) and navitoclax (TIS). IC50 values were fit via nonlinear regression; senolytic index defined as control IC50/senescent IC50. Additional assays: crystal violet survival assays, caspase-3/7 activity imaging, intracellular K+ quantification (Asante staining), RT-qPCR for NOXA and senescence/SASP markers. Chemical similarity (Tanimoto) of validated hits to training senolytics was computed using RDKit descriptors.
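The senolytic index defined above (control IC50 divided by senescent IC50) can be computed directly from the IC50 values reported under Key Findings. The sketch below only applies that ratio; the function and variable names are illustrative, and all values are converted to nM.

```python
def senolytic_index(ic50_control_nm, ic50_senescent_nm):
    """Control IC50 / senescent IC50; values above 1 indicate selectivity
    for senescent cells."""
    if ic50_senescent_nm <= 0:
        raise ValueError("senescent IC50 must be positive")
    return ic50_control_nm / ic50_senescent_nm


# IC50 values in nM as (control, senescent), taken from the Key Findings.
REPORTED_IC50_NM = {
    "OIS": {
        "ouabain": (231, 28),
        "oleandrin": (85, 14),
        "periplocin": (300, 24),
        "ginkgetin": (26_000, 2_600),
    },
    "TIS": {
        "navitoclax": (10_200, 440),
        "oleandrin": (19.5, 5.4),
        "periplocin": (267, 72.2),
        "ginkgetin": (10_400, 5_700),
    },
}

indices = {
    model: {cpd: senolytic_index(*pair) for cpd, pair in compounds.items()}
    for model, compounds in REPORTED_IC50_NM.items()
}

for model, compounds in indices.items():
    for cpd, si in compounds.items():
        print(f"{model:3s} {cpd:11s} senolytic index = {si:.2f}")
```

Note that the index measures selectivity rather than potency: on these figures periplocin has the highest index in the OIS model, whereas oleandrin has the lowest senescent-cell IC50.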
Key Findings
- The machine learning screen identified 21 candidates (≈0.5% of 4340) for testing, of which 3 were validated as senolytics: ginkgetin (a biflavone), and oleandrin and periplocin (cardiac glycoside steroid saponins). Hit confirmation rate: 14.29% (3/21).
- OIS model (IMR90 ER:RAS): The positive control ouabain showed its optimal effect at 46.4 nM (IC50 control = 231 nM; IC50 senescent = 28 nM). The newly identified compounds showed selective senolysis:
• Oleandrin: IC50 control = 85 nM; IC50 senescent = 14 nM.
• Periplocin: IC50 control = 300 nM; IC50 senescent = 24 nM.
• Ginkgetin: IC50 control = 26 μM; IC50 senescent = 2.6 μM.
- TIS model (A549 + etoposide): Positive control navitoclax (IC50 control = 10.2 μM; IC50 senescent = 440 nM). Newly identified compounds:
• Oleandrin: IC50 control = 19.5 nM; IC50 senescent = 5.4 nM.
• Periplocin: IC50 control = 267 nM; IC50 senescent = 72.2 nM.
• Ginkgetin: IC50 control = 10.4 μM; IC50 senescent = 5.7 μM.
- Oleandrin exhibited superior low-nanomolar senolytic potency compared to ouabain across OIS, replicative senescence, and TIS, with selective apoptosis induction in senescent cells (increased caspase-3/7 activity) and no proliferation defects in normal cells at 10 nM.
- Mechanistic readouts: Senescent cells showed increased intracellular K+; only oleandrin at 10 nM significantly reduced intracellular K+ in OIS and replicative senescence, indicating stronger Na+/K+ ATPase inhibition at low dose. Oleandrin uniquely induced NOXA mRNA at 10 nM, consistent with activation of a pro-apoptotic pathway implicated in cardiac glycoside senolysis. Surviving cells after oleandrin showed reduced p16/p21 and SASP cytokine (IL1A, IL1B, IL8) mRNA.
- Chemical diversity: More than half of training senolytics were maximally distant in Tanimoto descriptor space from the three new hits, indicating the model identified structurally diverse senolytics.
- Model performance: XGBoost achieved average precision ≈0.70 ± 0.16 (5-fold CV). Screening score distribution was highly selective; top hits were ≥8 SD above bulk. The pipeline reduced experimental screening burden by >200-fold and achieved substantial cost savings.
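The screening figures quoted in these findings follow from three counts given in the text (4340 compounds screened, 21 tested, 3 validated). A short sketch of the arithmetic, with illustrative variable names:

```python
N_SCREENED = 4340   # compounds scored in silico
N_TESTED = 21       # compounds taken forward to experiments
N_VALIDATED = 3     # compounds confirmed as senolytics

fraction_tested = N_TESTED / N_SCREENED        # share of library tested
confirmation_rate = N_VALIDATED / N_TESTED     # share of tested hits confirmed
fold_reduction = N_SCREENED / N_TESTED         # reduction in wet-lab screening

print(f"fraction tested:   {fraction_tested:.2%}")
print(f"confirmation rate: {confirmation_rate:.2%}")
print(f"fold reduction:    {fold_reduction:.1f}x")
```

The fold reduction of roughly 207 is consistent with the ">200-fold" figure stated above.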
Discussion
The ML-driven, target-agnostic screen effectively addressed the challenge of discovering senolytics despite limited mechanistic targets and heterogeneous data. By leveraging curated published positives and diverse assumed negatives, the approach identified three selective senolytics validated across distinct senescence modalities, with oleandrin outperforming the benchmark cardiac glycoside ouabain at low nanomolar concentrations. The findings validate that even modest-performing classifiers on imbalanced, heterogeneous data can be valuable when paired with careful feature selection, emphasis on precision to limit false positives, applicability domain checks, and stringent hit thresholds. Mechanistic assays confirmed oleandrin’s superior modulation of Na+/K+ ATPase-related pathways (intracellular K+ reduction) and pro-apoptotic NOXA induction, linking predictions to biological effectors of senolysis. The work underscores the viability of phenotypic, data-driven discovery from published datasets, offering large efficiency gains and expanding chemical starting points beyond well-trodden targets, with potential translational routes for local senolytic therapies.
Conclusion
This work presents a simple, cost-effective ML pipeline trained solely on published data to discover senolytics, identifying ginkgetin, periplocin, and particularly oleandrin as potent, selective agents across OIS, TIS, and replicative senescence. Oleandrin exhibited improved potency and mechanistic activity over ouabain at low nanomolar doses. The approach reduced screening by >200-fold, demonstrating that AI can extract value from small, heterogeneous datasets in a target-agnostic manner and support open science drug discovery. Future directions include in vivo validation, exploration of local administration strategies to mitigate cardiotoxicity risk, medicinal chemistry optimisation of the oleandrin scaffold toward safer senolytics, and extension of the pipeline to broader chemical spaces and other phenotypes.
Limitations
- Training negatives were assumed from LOPAC and Prestwick libraries due to sparse reporting of negative assays; mislabeling is possible.
- Severe class imbalance (58 positives, 2465 negatives) led to generally modest classifier metrics; emphasis was placed on precision to control false positives.
- Feature selection and cross-validation were performed on the full dataset, potentially yielding over-optimistic held-out performance estimates.
- Deep learning featurisation offered limited benefit, likely due to small positive sample size; generalisability may be constrained.
- Validation was in vitro across selected cell models; in vivo efficacy, pharmacokinetics, and safety remain to be established.
- Cardiac glycosides carry cardiotoxicity and narrow therapeutic windows; systemic use is limited, suggesting preference for local delivery in clinical applications.
- Senolytic effects can be cell-type and context specific, which may affect generalisability across tissues and disease settings.