Medicine and Health

DrugnomeAI is an ensemble machine-learning framework for predicting druggability of candidate drug targets

A. Raies, E. Tulodziecka, et al.

DrugnomeAI is a machine-learning framework that predicts druggability for every protein-coding gene in the human exome. Developed by a team including Arwa Raies and Ewa Tulodziecka from AstraZeneca, it integrates gene-level data from 15 sources to support drug target selection, with reported AUCs of up to 0.99 on its reference label sets.

Introduction

The study addresses how to predict the druggability of human genes to improve target selection in drug discovery. Druggability is defined as the likelihood that modulation of a protein by a therapeutic modality will elicit a desired clinical effect; it is distinct from ligandability, which concerns only binding. Traditional approaches are hampered by few known positives, severe class imbalance, and a lack of reliable negatives. The authors propose DrugnomeAI, a stochastic semi-supervised learning framework that integrates diverse gene- and systems-level features to rank every human protein-coding gene by druggability. The goal is to provide disease-agnostic and domain-specific predictions (by disease area and modality) to guide target prioritization and broaden understanding of the druggable genome, including emerging modalities like PROTACs.

Literature Review

Multiple resources exist for assessing target tractability/druggability. Curated databases include Open Targets and TractaViewer. Computational prediction tools span gene-level annotation models (e.g., TargetDB using random forests), sequence-based models (DrugMiner; hybrid deep learning by Yu et al.; SVM bagging by Lin et al.), systems-level network models (decision-tree meta-classifier by Costa et al.), and binding site-focused tools (TRAPP, BiteNet, eFindSite, TACTICS). Some methods are disease- or modality-specific (e.g., oncology-focused predictors, kinase-inhibitor tractability). This work expands prior efforts by integrating a comprehensive multi-source feature set and providing disease-agnostic and domain-specific (oncology vs non-oncology; small molecule, antibody, PROTAC) models, including what the authors note as the first ML model for gene druggability in the context of PROTAC therapeutics.

Methodology

Framework: DrugnomeAI builds on the mantis-ml stochastic semi-supervised learning approach for positive–unlabelled data. The human exome (19,846 genes) is split into many balanced sets comprising known positives (seed genes) and unlabelled genes. For each stochastic iteration, an ensemble is trained with stratified 10-fold cross-validation. Final per-gene scores are the average of out-of-bag prediction probabilities across models and iterations.

Labels: Positives were drawn from Pharos (Tclin: 610 targets of approved drugs with known mechanism of action; Tchem: 1,592 targets with activities in ChEMBL/DrugCentral; the Tbio and Tdark categories are also noted) and from the Triage scheme (Tier1: 1,411 approved/clinical-phase targets; Tier2: 658; Tier3A: 845; Tier3B: 1,437). Disease- and modality-specific models used Open Targets gene sets for oncology (CPD) vs non-oncology and for small molecules and antibodies, plus PROTAC targets from Schneider et al.

Features: 324 features from 15 sources remained after preprocessing. Druggability-specific sources include Pharos (antibodies, interactions, tissue specificity, sequence size, interaction types, target claims), DGIdb (number of interaction types), CTD (counts for 65 chemical–gene interaction types, number of unique interactions, pathway associations as 238 boolean features plus counts of other pathways), InterPro (97 domain/family/superfamily flags), STRING and InWeb (PPI networks), and Reactome (network associations). Generic sources from mantis-ml include gnomAD, ExAC, genic intolerance metrics, GWAS, MGI essentiality, and OMIM disease counts, among others. Network feature engineering computed, for each gene, ratios of interactions with positively labelled (known druggable) genes among 1-hop and 2-hop neighbors in the STRING, InWeb, and Reactome networks.

Preprocessing: Highly correlated features (Pearson r>0.8) were reduced; features with >99% missingness were dropped. Remaining missing values were imputed with zero for binary/sparse biological signals and with the median for continuous variables; network non-links were set to zero. Features were standardized to zero mean and unit variance.

Model selection and feature selection: Four classifiers were evaluated (Random Forest, Extra Trees, SVM, Gradient Boosting) across feature-set variants: InterPro only; Pharos+InterPro (druggability-specific); All (druggability) + minimal generic; and All+Mantis. AUCs using Pharos+InterPro matched or exceeded those of the larger sets, so it was adopted as the default to reduce redundancy. Gradient Boosting (GB) consistently outperformed the other classifiers across label and feature-set combinations (significant by DeLong tests). GB hyperparameters were tuned: max_depth=5, n_estimators=200, learning_rate=0.1 (others inherited from mantis-ml). Boruta feature selection (on Tclin and Tier1) identified predictive features, with PPI-derived features (DGIdb interaction types, InWeb/STRING/Reactome overlap metrics), CTD unique interactions/pathways, monoclonal count, sequence length, and tissue specificity ranking highly.

Specialized models: Modality-specific (small molecule, antibody, PROTAC) and oncology vs non-oncology models were built: CPD-sm (699), CPD-ab (175), non-CPD-sm (186), non-CPD-ab (76), cancer-sm (322). Four classifiers were tested per specialized model; GB was chosen for best performance (AUCs ≥0.94). Predictions were analyzed for novelty and biological plausibility (e.g., cellular localization for antibody targets).

Validation:
- Clinical evidence from Open Targets (known_drug and clinical trial phase, molecule type) was used to assess enrichment across ranked bins.
- Non-clinical evidence (genetic, pathways, animal models, RNA expression, somatic mutation, literature) was used to test enrichment of top predictions lacking clinical data via Fisher’s exact test against random gene sets.
- PheWAS: overlap/enrichment of top 5% DrugnomeAI genes (with clinical evidence) with genome-wide significant genes (p<5×10^-8) for binary and quantitative traits from the UK Biobank exome PheWAS (~450K).
- OMIM: enrichment of OMIM-associated genes among top-ranked predictions.
- Benchmarking: overlap/enrichment of top 5% predictions was compared against TargetDB (RF), Yu et al. (hybrid deep learning), and Costa et al. (decision-tree meta-classifier) using validation sets: Open Targets tractability buckets for small molecules and antibodies and King et al. approved drug targets. Stepwise hypergeometric enrichment and AUC under the enriched region quantified performance.

Compute: Runs used 200 GB RAM and 4 cores and took 2 h 40 min (Tclin) and 3 h 12 min (Tier1). A web application hosts predictions and feature context.
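The stochastic positive–unlabelled procedure described under Framework can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation: the function names (`pu_scores`, `stratified_folds`, `centroid_scorer`) are hypothetical, a toy nearest-centroid scorer stands in for Gradient Boosting, and the way unlabelled genes are dealt into balanced sets is an assumption. What it does show is the overall shape: pair the seed genes with a similarly sized batch of unlabelled genes, score each gene only when it is held out, and average those held-out scores over sets and iterations.

```python
import random
from collections import defaultdict
from statistics import mean


def centroid_scorer(train_x, train_y):
    """Toy stand-in for the real classifier (NOT gradient boosting).

    Returns a function mapping a feature vector to a score in [0, 1] that is
    higher the closer the vector lies to the positive-class centroid.
    """
    dims = len(train_x[0])

    def centroid(label):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        return [mean(row[d] for row in rows) for d in range(dims)]

    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    pos_c, unl_c = centroid(1), centroid(0)

    def predict(x):
        d_pos, d_unl = dist(x, pos_c), dist(x, unl_c)
        return 0.5 if d_pos + d_unl == 0 else d_unl / (d_pos + d_unl)

    return predict


def stratified_folds(genes, labels, k, rng):
    """Deal each class round-robin into k folds (some folds may be empty)."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [g for g in genes if labels[g] == label]
        rng.shuffle(members)
        for i, gene in enumerate(members):
            folds[i % k].append(gene)
    return folds


def pu_scores(features, positives, n_iterations=5, k=10, seed=0):
    """Average held-out scores per gene over balanced sets and iterations."""
    rng = random.Random(seed)
    pos = sorted(positives)
    pos_set = set(pos)
    unlabelled = sorted(g for g in features if g not in pos_set)
    if len(pos) < 2 or len(unlabelled) < 2 or k < 2:
        raise ValueError("need k >= 2 and at least two genes per class")
    # Assumption: one balanced set per batch of roughly len(pos) unlabelled genes.
    n_sets = max(1, len(unlabelled) // len(pos))
    collected = defaultdict(list)
    for _ in range(n_iterations):
        rng.shuffle(unlabelled)  # a new random partition each iteration
        for s in range(n_sets):
            chunk = unlabelled[s::n_sets]
            labels = {g: 1 for g in pos}
            labels.update({g: 0 for g in chunk})
            folds = stratified_folds(pos + chunk, labels, k, rng)
            for i, held_out in enumerate(folds):
                train = [g for j, fold in enumerate(folds) if j != i for g in fold]
                predict = centroid_scorer(
                    [features[g] for g in train], [labels[g] for g in train]
                )
                for gene in held_out:  # score only genes unseen in training
                    collected[gene].append(predict(features[gene]))
    return {gene: mean(values) for gene, values in collected.items()}
```

Calling `pu_scores(features, positives)` with a dict of gene name to feature vector and a set of seed genes returns one averaged score per gene. Because unlabelled genes are treated as provisional negatives only within a single balanced set, an unlabelled gene that resembles the seed genes can still end up with a high average score.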

Key Findings

- Performance and features: Gradient Boosting significantly outperformed RF, Extra Trees, SVM and DNN across label sets (e.g., DeLong p values vs RF as low as 4.34×10^-18 for Tclin; vs Extra Trees 1.01×10^-18; vs SVC 2.44×10^-10; vs DNN 6.58×10^-12). Top features by Boruta were dominated by protein–protein interaction/network-derived metrics (DGIdb interaction types; overlaps in InWeb, STRING, Reactome), CTD interactions/pathways, monoclonal count, and sequence length.
- Reference models: The highest AUCs were achieved with Tclin and Tier1 labels (AUC≈0.99 and 0.97, respectively).
- Clinical enrichment: The top 5% of genes (n=992) by DrugnomeAI-Tclin are highly enriched among genes selected for clinical development (Odds Ratio=132.78; Fisher’s p<1×10^-308). 753 genes (76%) in the top 5% had prior clinical programs, with an additional 268 in the 5–10% bin. DrugnomeAI-Tier1 showed similar enrichment. CDF analysis: the top 25% of ranked genes account for 95% of genes with clinical evidence; 80% of clinically supported genes are in the top 10% of rankings. Among the top 5%: 76% (Tclin) and 61% (Tier1) had been selected for clinical development; 627 (63%) and 475 (48%) were targeted by small molecules; 501 (51%) and 346 (35%) had progressed to phase IV (Tclin and Tier1 respectively).
- Non-clinical support for novel targets: Among top 5% predictions without clinical data, 239 (Tclin) and 386/387 (Tier1) had ≥2 types of non-clinical evidence (Tclin p=1.5×10^-8; Tier1 p=5.2×10^-10). Many had 3–5 support types; 24 (Tclin, p=2.4×10^-3) and 26 (Tier1, p=1.3×10^-2) had all six evidence types. Across all ranks, top predictions were enriched for genetic evidence (Tier1 OR=5.8, p=9.35×10^-38; Tclin OR=4.6, p=4.64×10^-32) and other evidence types. Feature distributions differed between genes with vs without clinical evidence for top features (e.g., monoclonal count, antibody count, sequence length, DGIdb interaction types; p<1×10^-308).
- PheWAS enrichment: The top 5% of DrugnomeAI-Tclin genes (with clinical evidence) were enriched among UKB exome PheWAS hits: binary traits OR=2.9, p=1.69×10^-5; quantitative traits OR=2.5, p=1.56×10^-7. Tier1: binary OR=3.0, p=4.63×10^-5; quantitative OR=3.0, p=9.53×10^-10.
- OMIM enrichment: OMIM-associated genes were enriched in the top 5% (Tclin OR=4.6, p=6.05×10^-110; Tier1 OR=3.6, p=6.55×10^-77); 506 (51%) and 452 (45%) top-5% genes (Tclin, Tier1) had OMIM disease associations.
- Benchmarking: DrugnomeAI-Tclin’s top 5% overlapped validation datasets more than TargetDB (+35%), Costa et al. (+29%), and Yu et al. (+149%). Overlap with King et al. approved targets was significantly higher than for TargetDB (OR=2.3, p=3.9×10^-15), Yu et al. (OR=17.2, p=3.6×10^-79), and Costa et al. (OR=1.9, p=2.3×10^-10). Hypergeometric AUC analyses showed the strongest enrichment for DrugnomeAI-Tclin in several cases (e.g., small-molecule Bucket 1 AUC 23× higher than TargetDB/Costa; Buckets 1–3 AUC 10–13× higher).
- Modality-specific models: Gradient Boosting achieved AUCs of 0.99 (small molecule), 0.98 (antibody), and 0.97 (PROTAC). Antibody-model predictions were under-represented for exclusively intracellular proteins (182/1181, OR=0.17, p=3.1×10^-139), consistent with modality constraints. The PROTAC model was strongly enriched for external PROTACtable genes (287/1067 in top 5%, OR=9.5, p=6.7×10^-138).
- Oncology vs non-oncology models: GB AUCs were 0.99 (CPD-sm), 0.98 (CPD-ab), 0.98 (non-CPD-sm), 0.98 (non-CPD-ab), 0.96 (cancer-sm). Protein–protein interaction and CTD pathway features remained top contributors.
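Most of the findings above are reported as an odds ratio with a Fisher's exact p-value. As a rough illustration of how such numbers arise from a 2×2 table (top-ranked vs remaining genes, with vs without a given type of evidence), here is a minimal sketch using only the standard library. The function names are hypothetical, the one-sided (enrichment) alternative is an assumption since the summary does not state which alternative the authors used, and the counts in the usage note are invented for illustration rather than taken from the study.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table

                     evidence   no evidence
        top-ranked       a           b
        remaining        c           d

    Raises ZeroDivisionError when b * c == 0 (no correction is applied).
    """
    return (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null with all margins fixed.

    X is the number of evidence-bearing genes that land in the top-ranked
    group if group membership were unrelated to evidence.
    """
    n_top = a + b          # size of the top-ranked group
    n_evidence = a + c     # genes carrying the evidence
    total = a + b + c + d
    upper = min(n_top, n_evidence)
    favourable = sum(
        comb(n_evidence, x) * comb(total - n_evidence, n_top - x)
        for x in range(a, upper + 1)
    )
    return favourable / comb(total, n_top)
```

For a made-up table with a=8, b=2, c=1, d=5, `odds_ratio(8, 2, 1, 5)` gives 20.0 and `fisher_one_sided(8, 2, 1, 5)` gives 196/8008, roughly 0.024. For exome-scale counts a library routine such as SciPy's `fisher_exact` would be the more practical choice; the exact-integer version here is only meant to make the calculation visible.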

Discussion

DrugnomeAI demonstrates that integrating systems-level features, particularly protein–protein interaction network context, enables highly accurate prediction of gene druggability despite limited positive labels and no true negatives. The strong enrichment of top-ranked genes among clinically pursued targets, OMIM-associated genes, and UKB PheWAS hits supports biological and translational relevance. Superior benchmarking performance versus existing tools indicates improved capture of tractability signals. Modality- and disease area–specific models reflect known biological constraints (e.g., extracellular preference for antibodies) and highlight novel, non-clinically pursued genes with substantial orthogonal evidence. These findings address the core question by providing robust, exome-wide prioritization scores and interpretable feature contributions, aiding target selection and hypothesis generation across therapeutic areas and modalities. The web application further enhances interpretability and practical use.

Conclusion

The study introduces DrugnomeAI, a stochastic semi-supervised ensemble framework that produces exome-wide druggability likelihoods using 324 features from 15 data sources. Gradient Boosting models trained on curated label sets (Pharos Tclin; Triage Tier1) achieve state-of-the-art performance, with predictions strongly enriched for clinically pursued targets, OMIM disease genes, and UKB PheWAS hits, and outperform comparable methods. Specialized models for small molecules, antibodies, and PROTACs, and for oncology vs non-oncology, capture modality- and domain-specific determinants of druggability and suggest novel high-priority targets. Future work could incorporate graph neural networks and pocket-level features to capture higher-resolution interaction and binding information, expand modality coverage, and better prioritize under-studied genes by learning directly from sequences and structures.

Limitations

- Potential bias toward well-annotated and historically targeted genes due to reliance on annotation-rich features and training labels derived from approved/clinical-phase targets.
- Under-representation of under-studied genes may limit discovery of unconventional mechanisms.
- Druggability is context-dependent (disease, modality, pharmacodynamics/kinetics, safety, regulatory, and commercial factors) and not entirely captured by current features.
- Limited ability to capture pocket-level or epitope-specific determinants; kinase/domain and ligand-binding features were not top-ranked, possibly due to label set size and feature granularity.
- Directionality of modulation (activation vs inhibition) is not explicitly modeled.
