Proteomic profiling based classification of CLL provides prognostication for modern therapy and identifies novel therapeutic targets

T. L. Griffen, F. W. Hoff, et al.

In the largest CLL proteomics study to date, Ti'ara L. Griffen and colleagues profiled protein expression in 871 CLL and MSBL patients. They report six prognostic proteomic signature groups, including a novel group with hairy cell leukemia-like proteomics, that can guide personalized therapy and point to new treatment targets.

Introduction
Chronic lymphocytic leukemia (CLL) is the most common adult leukemia and poses diagnostic and prognostic challenges. Traditional prognosticators include IGHV mutation status, cytogenetic aberrations (del(17p), del(11q), del(13q), trisomy 12), and expression of ZAP70 and CD38. Treatment has evolved from alkylators and purine analogs to rituximab-based chemoimmunotherapy (e.g., FCR) and, since 2014, targeted agents including BTK inhibitors (ibrutinib, acalabrutinib), PI3K inhibitors (idelalisib, duvelisib), and the BCL2 inhibitor venetoclax. Many patients are initially managed with Watch and Wait (WaW). The availability of highly effective targeted therapies and improved molecular understanding necessitates re-evaluating when to treat and how to select therapy. Traditional markers may be less informative in the modern era, and multiple co-occurring molecular events complicate therapy selection. The authors hypothesize that because genetic, epigenetic, and environmental influences converge at the protein level, proteomic analysis can integrate these effects to predict time to first/second treatment and survival, and to inform therapy selection.
Literature Review
The paper notes established prognostic markers in CLL (IGHV mutation status, cytogenetic abnormalities including del(17p), del(11q), del(13q), trisomy 12, ZAP70, CD38) and the historical progression of therapies from cytotoxics to rituximab-based regimens and modern targeted agents (BTK, PI3K, BCL2 inhibitors). Prior proteomic studies using mass spectrometry in small cohorts (N ~14–16) lacked sufficient size for classification and prognosis. The authors reference low correlation between mRNA expression and protein/PTM levels across cancers, underscoring the need for proteomics to capture activation states and pathway utilization.
Methodology
Study design and cohort: 871 samples from patients with CLL (n=795) and related mature small B-cell neoplasms/leukemias (MSBL; including HCL, HCL-V, MCL, MZL, PLL/PL, Richter’s, T-cell LGL) were collected at MD Anderson Cancer Center (IRB-approved protocols; informed consent) between 2005 and 2019. Samples were peripheral blood and bone marrow, fresh (n=127) or frozen (n=744). Clinical annotations included demographics, stage (Binet, Rai), laboratory indices (e.g., B2M), immunophenotype (CD19, CD20, CD22, CD23, CD38, CD79b), IGHV status, ZAP70, cytogenetics (del(11q), del(13q), trisomy 12, del(17p)), and outcomes (overall survival [OS], time to first treatment [TTFT], time to second treatment [TTST]).
Proteomic profiling (RPPA): Reverse-phase protein arrays were generated from CD19+ B cells (normal controls from five donors) and patient cells. Five serial two-fold dilutions per sample were printed. Slides were probed with 384 singly validated antibodies (including 82 against post-translationally modified targets; total and phospho/cleaved forms). Signal was quantified using the SuperCurve R package on log2 scale; slides with technical issues were excluded. Loading control and spatial/geographical normalization were applied across slides. Expression was normalized to the median of control proteins and to normal CD19+ B-cell controls to gauge over/under-expression.
Quality control and analytic framework: Data were analyzed at three levels: (1) individual proteins, (2) protein functional groups (PFGs), and (3) systems-level signatures. Antibodies were grouped into PFGs based on literature-defined function and expression correlation. For each PFG, unsupervised k-means clustering with a gap statistic-based algorithm (ProgenyClust) identified distinct patient expression patterns (clusters). Linear discriminant analysis and principal component analysis compared patient clusters to normal CD19+ controls. Patients were assigned to one cluster per PFG; the set of all PFG cluster memberships (150 expression patterns) underwent hierarchical block clustering to identify co-occurring PFG patterns (constellations) and recurrent patient signatures. Optimal numbers were determined algorithmically, yielding 13 constellations and 16 signatures, further grouped into six signature groups (SG A–F) based on constellation similarity and clinical outcomes.
Statistics: Associations between proteomic classifications and clinical variables used chi-square/Fisher exact tests (categorical) and Kruskal–Wallis (continuous), with normality and variance assessed by Shapiro–Wilk and Levene tests. Outcomes were analyzed by Kaplan–Meier with log-rank tests; multiple testing was FDR-corrected. Cox proportional hazards models (with iterative train/test splits; relative risk, elastic net, ensemble learning) evaluated prognostic impact and validated findings. Multivariate models included SG membership and standard prognostic factors (IGHV, cytogenetics, stage, ZAP70, B2M, etc.). IGHV, CLL-IPI, and mutation subgroup analyses assessed additivity/complementarity of proteomic prognostication. Differential expression versus CD19 controls used one-way ANOVA with Tukey HSD. Random forests identified a minimal protein classifier set to predict SG membership.
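Because hundreds of proteins are each tested against several outcomes, the FDR correction mentioned above determines which associations count as significant. The summary does not say which FDR procedure the authors used; the sketch below assumes the common Benjamini–Hochberg step-up method, and the function name and p-values are hypothetical illustrations, not values from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only as a representative FDR procedure;
    the summarized study does not name its method.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value can be
    # capped by the adjusted values of all larger p-values (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical log-rank p-values for five proteins (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(raw)
significant = [p for p, q_value in zip(raw, q) if q_value < 0.05]
print(q)
print(significant)
```

In this toy input, two raw p-values below 0.05 no longer pass after adjustment, which is why counts of "significant" proteins depend on the correction as well as on the stratification used.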
Key Findings
- Proteome landscape: Of 384 proteins (including 82 PTMs) measured across 871 samples, 16% showed universally absent/very low expression; only six proteins were universally high (LEF1, PXN, ZAP70, CD24, S100A4, PDCD4). Matched blood vs marrow samples showed no significant differences.
- Individual protein prognosticators: Numerous proteins were prognostic beyond FDR thresholds. For OS: 59 (median split), 78 (tertile), 79 (sextile), 130 (continuous) significant; for TTFT: 52 (median), 56 (tertile), 45 (sextile), 61 (continuous); for TTST: 11 (median), 18 (tertile), 1 (continuous). Validation via repeated Cox modeling found many proteins prognostic in >70% of test sets (e.g., 42 for OS). Novel prognostic proteins included SOD1 (implicating ROS scavenging). In BTK inhibitor-treated patients (n=108), multiple proteins were OS-prognostic (e.g., MAFZ, NUMB across all stratifications).
- PFG-level prognostication: 78% of PFGs were prognostic for OS, 45% for TTFT, and 48% for TTST (often at FDR p<0.05). Apoptosis, apoptosis-regulation, heat shock, histone marks/modifiers, and signal transduction pathway (STP) regulation PFGs were prognostic for all endpoints. Example: the MAPK PFG cluster with lower activating phosphorylation had an adverse prognosis versus clusters with high phosphorylation.
- Systems-level signatures: Unbiased clustering identified 16 signatures aggregated into six signature groups (SG A–F). SG membership was associated with diagnosis mix, clinical indices, staging, IGHV status, ZAP70, and cytogenetics. SG-A contained a high proportion of non-CLL MSBL (52%); other SGs were predominantly CLL (86–99%). Unmutated IGHV was overrepresented in SG-C (68%). Historically adverse del(11q)/del(17p) were less common in SG-A/B/D/E and overrepresented in SG-C.
- Clinical outcomes by SG: SG-A and SG-C had inferior OS and earlier TTFT compared with other SGs (overall p<0.0001). Multivariate analysis showed SG membership was an independent predictor of OS, outperforming Rai stage, IGHV, and cytogenetics. Within Rai 0–1, SG membership further stratified prognosis; conversely, Rai stage did not add prognostic value within SGs. SG prognostication was additive within IGHV, CLL-IPI, and mutation subgroups (e.g., del(17p), del(11q)). In multivariate OS models (n≈773, 85 events), SG B–F had significantly lower hazards vs SG-A (e.g., SG-F HR 0.11, 95% CI 0.05–0.25; p≈1.0E-08). ZAP70 positivity and B2M were also independently associated with OS.
- Therapy-class stratification: PFGs remained prognostic within therapy classes. Number of PFGs prognostic for OS: BTK inhibitor regimens (25), chemotherapy/chemoimmunotherapy (18), antibody-only (3). Metabolic Glucose and BCR PFGs showed survival disparities under BTK inhibitors and chemoimmunotherapy. Unfavorable BTK clusters in the Metabolic Glucose PFG shared low PRKAA1/2 phosphorylation, suggesting therapeutic relevance.
- Unique subset (SG-A/HCPLC): Identified a small CLL subset (≈5% of CLL) with HCL-like proteomics but poor responses to chemotherapy and BTK inhibitors. SG-A showed low B2M, near-normal hemoglobin/platelets, fewer adverse cytogenetics, and lower BCL2/CD19 expression; defined by specific constellations [1–3]. Histologically/phenotypically CLL, but clinically therapy-refractory.
- Minimal classifier: A 30-protein random forest classifier distinguished SGs with an overall error rate of ~18.6%; accuracy was highest for critical calls (e.g., SG-A 77%, SG-C 87.5% in current prototypes). Misclassification patterns identified an extremely poor prognosis cohort among SG-A/SG-D.
- Watch-and-wait markers: PFGs such as GPCR, SMAD, and STP-regulation were associated with TTFT. Individual proteins ANXA1 (low), TFRC (low), and SMAD2-p245 (high) were individually prognostic and in combination markedly stratified TTFT (median 1.59 years if ≤1 unfavorable marker vs 5.67 years if 2–3; p<0.00001).
- Therapeutic targets: Differential expression versus normal B cells highlighted SG- and pan-SG targets. Universally overexpressed/activated proteins proposed as targets include CHK1 (pS345), WEE1 (pS642), GAB2, IGFBP2, S100A4, VCL (pS464), ZAP70. SG-specific overexpressed proteins (e.g., AKT1, ANXA1, SMAD2 in select SGs) suggest SG-tailored therapeutic strategies.
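The median-split comparisons and median time-to-event figures in the findings above rest on Kaplan–Meier estimates. The following is a minimal sketch of that idea, not the authors' pipeline: the function names, the single unnamed protein, and every data value are hypothetical, and the log-rank test used in the study is omitted.

```python
from statistics import median


def kaplan_meier(observations):
    """Kaplan-Meier curve from (time, event) pairs.

    event is True if the endpoint occurred and False if the patient was
    censored. Returns [(time, survival_probability)] at each event time.
    """
    data = sorted(observations)
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events_here = 0
        leaving = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events_here += 1
            leaving += 1
            i += 1
        if events_here:
            survival *= 1 - events_here / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_time(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical patients: (normalized expression, years of follow-up, event).
patients = [
    (0.2, 6.0, False),
    (0.5, 5.0, True),
    (0.9, 7.0, False),
    (1.4, 2.0, True),
    (1.8, 1.0, True),
    (2.1, 3.0, True),
]

# Median split on expression, as in the "median" stratification.
cutoff = median([level for level, _, _ in patients])
low_curve = kaplan_meier([(t, e) for level, t, e in patients if level <= cutoff])
high_curve = kaplan_meier([(t, e) for level, t, e in patients if level > cutoff])

print("low expression median time:", median_time(low_curve))
print("high expression median time:", median_time(high_curve))
```

A median of `None` means the curve never fell to 50% during follow-up, the usual "median not reached" case. Tertile or sextile stratification would change only how the cutoffs are chosen.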
Discussion
The study demonstrates that proteomic profiling integrates the cumulative effects of genetic, epigenetic, and environmental factors on CLL biology and provides robust prognostic information at multiple levels: individual proteins, functional groups, and systems-level signatures. Proteomic signature groups (SGs) stratify OS, TTFT, and TTST beyond traditional markers (Rai/Binet stage, IGHV, cytogenetics) and retain prognostic value within therapy classes, including modern BTK inhibitor regimens. Identification of SG-A (HCPLC) reveals a previously unrecognized, small CLL subset with HCL-like proteomics but resistance to standard modalities, highlighting the potential of proteomics to uncover clinically meaningful subtypes not captured by conventional diagnostics. PFG- and protein-level analyses nominate actionable targets and suggest that protein activation states (e.g., phosphorylation) may guide targeted therapy choices. Proteomic classifiers could refine watch-and-wait decisions by predicting shorter TTFT, enabling early intervention for high-risk patients and avoiding overtreatment for indolent disease. Overall, proteomic classification outperformed traditional staging for survival prediction in this cohort and complements established molecular classifiers for treatment timing.
Conclusion
In the largest CLL proteomics study to date, RPPA-based profiling of 384 total/PTM proteins across 871 patients identified six recurrent proteomic signature groups that robustly prognosticate OS, TTFT, and TTST and inform therapy selection across historical and modern regimens. A novel, small subset (SG-A/HCPLC) exhibits HCL-like proteomics with poor responses to chemotherapy and BTK inhibitors, underscoring the clinical utility of proteomic classification. A minimal 30-protein classifier enables prospective SG assignment, and differential expression analyses nominate pan- and SG-specific therapeutic targets (e.g., CHK1, WEE1, GAB2, IGFBP2, S100A4, ZAP70). Future work should validate these findings in contemporaneous cohorts uniformly treated with modern therapies, refine and clinically deploy the classifier (e.g., as a diagnostic kit), and prospectively test SG- and PFG-informed treatment strategies, including combinations targeting activated proteins and metabolism, and evaluate early intervention in groups with predicted short TTFT.
Limitations
Protein selection was biased to targets available/validated for RPPA; some biologically relevant proteins/pathways may be underrepresented. The number of patients treated with modern targeted therapies (e.g., BTK inhibitors) was relatively small, limiting power for therapy-response analyses. Experimental mechanistic validation of proteomic targets and resistance biology was not performed. The study included heterogeneous MSBL diagnoses to define patterns, which may introduce complexity, and some analyses were constrained by small subgroup sizes.