Single-cell transcriptomes identify patient-tailored therapies for selective co-inhibition of cancer clones

A. Ianevski, K. Nader, et al.

scTherapy, developed by Aleksandr Ianevski and colleagues, uses machine learning on single-cell transcriptomic profiles to nominate patient-tailored, multi-targeting cancer therapies. In ex vivo validation, the predicted treatments combined efficacy with low toxicity in advanced malignancies such as acute myeloid leukemia (AML) and high-grade serous carcinoma (HGSC).

Introduction

The study addresses the clinical challenge posed by intratumoral cellular heterogeneity and clonal evolution, which drive therapy resistance in hematologic malignancies and solid tumors. Single-cell RNA sequencing (scRNA-seq) enables high-resolution characterization of malignant and non-malignant populations and can inform precision oncology, but translational strategies that use single-cell data to nominate selective multi-targeting therapies are scarce. Existing computational approaches link scRNA-seq with diagnosis or prognosis and sometimes with drug response, yet they do not identify multi-targeting drugs or combinations at single-cell and patient-specific levels while accounting for both patient and disease heterogeneity. Large-scale ex vivo drug testing is impractical for many tumors. The authors propose scTherapy, a machine learning framework that, using only a patient’s scRNA-seq count matrix, predicts dose-specific treatments that selectively co-inhibit genetically distinct cancer clones while sparing normal cells.

Literature Review

The paper situates its work within evidence of tumor heterogeneity affecting outcomes across cancers (e.g., AML clonal hierarchies and TME features in solid tumors). It notes the maturation of scRNA-seq and its clinical potential but also practical barriers (robustness, scalability, cost). Prior computational methods (e.g., BeyondCell, scDrug, scDR, DEGAS) have linked scRNA-seq to drug response or defined therapeutic clusters but lack capabilities to predict selective multi-targeting combinations, dose-specific effects, or account for toxicity using matched normal cells. Traditional HTS and in vivo approaches are limited in capturing patient-specific, dose-dependent, and subclone-selective responses. The authors leverage large perturbational resources (LINCS 2020, PharmacoDB) to overcome these gaps.

Methodology

Overview: scTherapy is a pre-trained LightGBM model that integrates large-scale perturbational transcriptomics (LINCS 2020) with viability data (PharmacoDB) and compound structures (ECFP4 fingerprints) to predict, for a given patient’s scRNA-seq-derived differential gene expression, dose-specific inhibition for thousands of compounds. It suggests individualized monotherapies or two-drug combinations to co-inhibit major malignant subclones while minimizing toxicity to matched normal cells.

Training data and model pre-training: The model leverages 394,303 transcriptomic profiles corresponding to 19,646 single-agent responses across multiple doses in 167 cell lines (LINCS 2020). These were matched to dose-response viability from PharmacoDB in the same 167 lines (10,303 overlapping compound–cell line pairs; qc-pass profiles). Compound structures were encoded as ECFP4 fingerprints (from PubChem/SMILES via RDKit). The final training set covers 3,695 unique compounds tested at 1–35 doses in 167 lines. Predictors include drug-induced transcriptomic fold-changes (DEGs), ECFP4 fingerprints, and dose; the target is percent inhibition interpolated from PharmacoDB dose-response curves. LightGBM was tuned via Bayesian optimization with repeated cross-validation (three repetitions) and ten-fold inner CV to ensure robustness and generalizability.

Patient-level pipeline (AML case, Steps 1–5):

Step 1: Longitudinal sampling of AML bone marrow aspirates (12 samples spanning diagnosis, relapse, refractory stages) under IRB approvals.
Step 2: Single-cell processing with Seurat (v4.3.0), Louvain clustering, QC (mitochondrial UMI fraction thresholds: AML 10%; HGSC organoids 20%), normalization via LogNormalize.
Step 3: Identification of malignant vs normal cells via ensemble of ScType (marker-based annotation with confidence scores), CopyKAT (aneuploidy/CNA calling), and SCEVAN (ploidy state modeling). Majority voting yields malignant/normal clusters. Stability against downsampling was assessed.
Step 4: Subclonal architecture via inferCNV (i6 HMM model, cutoff 0.1 for 10x data), subclustering of HMM CNV states, and phylogeny visualization using Uphyloplot2. Two broad subclones are chosen based on the clonal tree (branch length proportional to subclone fraction). Differential expression (Wilcoxon rank-sum in Seurat) between each subclone and normal cells yields subclone-specific DEGs.
Step 5: Predictive modeling: Subclone-specific DEGs feed the pre-trained LightGBM to predict percent inhibition for each compound–dose pair. Predictions can use predefined dose grids or custom compounds/doses (via ECFP4). Conformal prediction filters out low-confidence outputs (non-conformity score < 0.8 excluded). A 1 µM maximum dose constraint is applied to enhance clinical feasibility and limit toxicity. For combinations, drugs are chosen to uniquely target each of the two major subclones while minimizing predicted inhibition of normal cells, generating dose-specific drug–drug suggestions.

Validation experiments:
AML single-agent retrospective validation: Matched model-predicted effective/ineffective drugs and doses against existing whole-well CellTiter-Glo viability data (not used in model training). Statistical comparisons via Wilcoxon tests.
AML combination prospective validation: Four AML patient samples with sufficient cells were tested using bulk 4×4 dose matrices; synergy quantified by ZIP score around predicted effective doses. Top six combinations per patient were further assessed by high-throughput flow cytometry to quantify selective inhibition in malignant versus normal subpopulations (markers including CD14, CD56, CD45, CD19, CD3, CD34, CD38, CD117, Annexin V, DRAQ7).
HGSC organoids: For three HGSC patients (organoids PAX8+ tumor cells; PAX8− stromal cultures), scRNA-seq guided monotherapy predictions (subclones not resolved due to low malignant cell proportion). DEGs between PAX8+ and PAX8− cells served as input. Eighteen top agents (from 372 overlapping compounds) were validated in 3D organoids (7-day CTG 3D) and parallel PAX8− cells (CTG 2.0). For one case, PAX8− controls were integrated from additional samples via Seurat integration to ensure sufficient normal cells.

Performance benchmarking: scTherapy was compared against BeyondCell and scDrug on 12 AML patients using drug sensitivity score (DSS) across doses for top/bottom 15 predictions per method. ROC/AUC analysis with DeLong tests quantified discrimination between effective and ineffective monotherapies.

Pan-cancer application: Applied to AML, HGSC, LUAD, PDAC, and TNBC scRNA-seq cohorts (curated by Gavish et al.) to map recurrence of predicted therapies within and across tumor types, reporting fractions of patient-specific, disease-specific, and pan-cancer predictions.

Implementation and availability: R code, Docker images, and processed objects are publicly available; inputs limited to a patient scRNA-seq count matrix, with automated to semi-automated downstream steps.
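
The combination-selection logic of Step 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation (the published code is in R): the `Prediction` container, the `selectivity` score (subclone inhibition minus normal-cell inhibition), the reading of the 0.8 threshold as "keep outputs scoring at least 0.8", and all toy values are assumptions made for illustration. Only the 1 µM dose cap, the 0.8 threshold, and the goal of pairing one drug per subclone while sparing normal cells come from the text.

```python
from dataclasses import dataclass
from itertools import permutations

MAX_DOSE_UM = 1.0   # 1 uM dose cap stated in the text
MIN_SCORE = 0.8     # conformal threshold stated in the text; lower-scoring outputs are dropped


@dataclass
class Prediction:
    """Hypothetical container for one predicted compound-dose response."""
    drug: str
    dose_um: float
    inhibition: dict        # predicted % inhibition for "A", "B" and "normal"
    conformal_score: float  # assumed to lie in [0, 1]


def selectivity(pred, subclone):
    """Assumed score: predicted inhibition of the subclone minus that of normal cells."""
    return pred.inhibition[subclone] - pred.inhibition["normal"]


def suggest_combination(predictions):
    """Return (score, pred_for_A, pred_for_B) for the best pair of distinct drugs, or None."""
    usable = [p for p in predictions
              if p.dose_um <= MAX_DOSE_UM and p.conformal_score >= MIN_SCORE]
    best = None
    for pred_a, pred_b in permutations(usable, 2):
        if pred_a.drug == pred_b.drug:
            continue  # a combination needs two different drugs
        score = selectivity(pred_a, "A") + selectivity(pred_b, "B")
        if best is None or score > best[0]:
            best = (score, pred_a, pred_b)
    return best


# Toy values invented for illustration only.
example_predictions = [
    Prediction("drug_x", 0.1, {"A": 80, "B": 30, "normal": 10}, 0.90),
    Prediction("drug_x", 10.0, {"A": 95, "B": 90, "normal": 60}, 0.95),  # above dose cap
    Prediction("drug_y", 1.0, {"A": 20, "B": 75, "normal": 15}, 0.85),
    Prediction("drug_z", 0.5, {"A": 90, "B": 90, "normal": 5}, 0.50),    # low confidence
]

best = suggest_combination(example_predictions)
if best is not None:
    score, for_a, for_b = best
    print(f"{for_a.drug} at {for_a.dose_um} uM + {for_b.drug} at {for_b.dose_um} uM")
```

In this toy example the high-dose and low-confidence entries are filtered out first, so the pair is chosen only among predictions that satisfy both constraints. A real scoring rule could weight normal-cell toxicity differently; the simple difference used here is only one plausible choice.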

Key Findings

Validation of efficacy and selectivity:
- AML single-agent assays: Treatments predicted as effective showed significantly greater inhibition ex vivo than those predicted ineffective (p < 0.001, two-sided Wilcoxon test).
- AML combinations: All predicted two-drug combinations exhibited either synergistic (ZIP > 10) or additive (0 < ZIP < 10) effects in bulk viability matrices; measured ZIP scores were significantly above zero (p < 0.001, Wilcoxon). Example ZIP values: 13.6 and 13.5.
- Flow cytometry in AML (top combinations): Significant differential inhibition favoring malignant over normal cells in three patients (p < 0.01, < 0.05, < 0.05; patient 12 n=2 replicates, not tested for significance). 83.4% of predictions showed low toxicity (<50% inhibition) in normal cells; some classes (e.g., proteasome, topoisomerase inhibitors) were non-selective.
- HGSC organoids: Across three patients and 54 evaluated treatments, 31/54 (57.4%) achieved >50% inhibition in PAX8+ tumor cells, while only 11/54 (20.4%) had similar inhibition in PAX8− normal cells. Predicted treatments consistently showed higher efficacy in tumor vs normal cells (p ≤ 0.01, Wilcoxon). Predicted effective vs ineffective sets differed significantly in PAX8+ cells (p < 0.001, Wilcoxon).
Pan-cancer landscape: Among predicted treatments across five cancer types, 19% were patient-specific, 25% disease-specific (2% LUAD, 1% TNBC, 2% PDAC, 10% AML, 10% HGSC), and 22% common across all five cancer types. 17% (22/131) of predicted treatments are in phase 3 or 4 clinical trials.
Model performance vs state-of-the-art:
- scTherapy outperformed BeyondCell and scDrug in predicting most/least effective monotherapies (DSS-based comparisons across 12 AML patients).
- ROC/AUC: scTherapy AUC = 0.715 (95% CI 0.661–0.768), scDrug AUC = 0.573 (0.512–0.634), BeyondCell AUC = 0.527 (0.468–0.587). DeLong tests: scTherapy significantly better than scDrug and BeyondCell (p < 0.01); scDrug vs BeyondCell not significantly different (p = 0.29).
Practical outputs: scTherapy produced dose-specific, clone-selective predictions using only patient scRNA-seq. Conformal prediction filtering increased confidence; a 1 µM dose cap enhanced clinical relevance. In AML, MEK and PLK inhibitor–based options frequently emerged; in HGSC, fewer targeted signal transduction inhibitors were predicted, reflecting disease biology.
Overall: 96% of multi-targeting treatments showed selective efficacy or synergy; 83% displayed low toxicity to normal cells, supporting translational potential.
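
The AUC values reported in the benchmarking above measure how well a method's scores separate effective from ineffective monotherapies. As a minimal sketch of that quantity, the function below computes AUC as the probability that a randomly chosen effective treatment receives a higher score than a randomly chosen ineffective one (ties count one half). The function name and the toy scores are hypothetical, and the confidence intervals and DeLong tests mentioned in the text are not reproduced here.

```python
def roc_auc(scores_effective, scores_ineffective):
    """AUC via pairwise comparison of scores; ties contribute 0.5."""
    if not scores_effective or not scores_ineffective:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in scores_effective:
        for neg in scores_ineffective:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_effective) * len(scores_ineffective))


# Toy prediction scores invented for illustration only.
effective = [0.9, 0.7, 0.6]     # treatments measured as effective ex vivo
ineffective = [0.8, 0.4, 0.6]   # treatments measured as ineffective ex vivo

auc = roc_auc(effective, ineffective)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values for scTherapy, scDrug, and BeyondCell should be read.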

Discussion

The findings demonstrate that integrating scRNA-seq–defined malignant subclones with a large perturbational reference enables selective, dose-aware prediction of patient-tailored therapies. scTherapy addresses key translational gaps by (i) predicting combinations that co-inhibit multiple subclones while sparing matched normal cells, (ii) providing dose-specific outputs to manage toxicity, and (iii) functioning with only an scRNA-seq count matrix, making it applicable to tumor types where ex vivo testing is constrained. Experimental validations in AML confirmed synergistic or additive combination efficacy and selective inhibition of malignant populations. In HGSC organoids, a majority of predictions caused strong tumor inhibition with substantially fewer effects on stromal cells, underscoring selectivity. Compared to methods like BeyondCell and scDrug, scTherapy showed superior accuracy against measured ex vivo responses, likely due to its supervised training on matched transcriptomic and viability responses and explicit modeling of dose. The pan-cancer analysis revealed both recurrent therapy clusters within tumor types and a significant fraction of patient-specific options, highlighting heterogeneity and the need for individualized regimens. Clinically, many predicted agents are already in late-stage trials, improving feasibility. The approach aligns with the precision oncology goal of maximizing efficacy while minimizing toxicity at the individual level by leveraging clone-specific DEGs and dose constraints. It also provides interpretable biomarkers via gene-to-target networks. Nonetheless, ex vivo confirmation remains important, as some multi-target classes can be non-selective, and dose translation to clinic must consider patient-specific factors.

Conclusion

This work introduces scTherapy, a scalable framework that uses single-cell transcriptomes to predict dose-specific, patient-tailored monotherapies and combinations that selectively co-inhibit malignant cell subclones while minimizing effects on normal cells. Validations in AML and HGSC demonstrate high rates of selective efficacy, synergy, and low toxicity, and benchmarking shows improved predictive performance over existing single-cell drug response methods. The pan-cancer survey reveals a balance of disease-recurrent and patient-specific therapies, supporting both precision and broader applicability. Future directions include incorporating multi-omics (e.g., point mutations, measured CNVs), expanding to multi-drug (beyond two-drug) combinations and additional subclones, integrating morphological and proteomic response profiles, refining drug class–specific modeling and time points, and improving validation models with better-matched healthy controls. These enhancements aim to further improve accuracy, interpretability, and clinical translatability.

Limitations

Input modality limitation: Current predictions rely solely on scRNA-seq; cases where actionable mutations (e.g., BRAF V600E) are primary predictors may not be optimally addressed until multi-omics integration is added.
CNV inference: Subclonal architecture is derived from inferred CNVs (inferCNV) rather than measured genomic CNVs, which may miss or misrepresent certain alterations.
Subclone resolution constraints: When malignant cells are few (e.g., some HGSC samples), reliable subclone delineation is difficult; analysis reverts to monotherapy predictions using pooled malignant cells.
Model domain shift: Training used 2D cell line perturbational data; validation includes ex vivo 2D/3D systems with different growth dynamics and assay durations, potentially contributing to discrepancies.
Toxicity and dose translation: Some predicted treatments (e.g., proteasome/topoisomerase inhibitors) can be non-selective; clinical dosing requires adjustment for patient-specific factors (age, stage, comorbidities).
Drug class variability: Different target classes (e.g., HDAC inhibitors) induce broad transcriptomic changes, which may affect prediction consistency; future class-specific calibrations may be needed.
Current combination scope: Combinations target two major subclones; extension to more subclones and multi-drug regimens is planned but not yet implemented.
