The evolution of lung cancer and impact of subclonal selection in TRACERx
A. M. Frankell, M. Degasperi, et al.
This study by A. M. Frankell and colleagues analyses the evolution of non-small cell lung cancer (NSCLC) in the TRACERx cohort, characterising clonal expansion and subclonal selection and their associations with disease-free survival and relapse. It also identifies evolutionary dependencies among cancer genes and highlights the roles of whole-genome doubling and copy-number instability in NSCLC progression.
Introduction
Lung cancer is the leading cause of cancer-related death worldwide, accounting for 18% of cancer mortality and 11% of cancer incidence. The mechanisms underlying its aggressive behaviour remain incompletely understood. Multiregion sequencing enables inference of tumour phylogeny from intratumour heterogeneity (ITH), but prior studies often included fewer than 100 patients, limiting statistical power, and the functional relevance of ITH to clinical outcomes has been debated. The TRACERx study is a prospective, multicentre cohort designed to track the evolution of non-small cell lung cancer (NSCLC) from diagnosis to recurrence, with co-primary endpoints assessing associations between ITH and clinical outcome, and the effects of adjuvant chemotherapy on ITH. Earlier analysis of the first 100 TRACERx patients showed pervasive genomic ITH and an association of somatic copy number alteration (SCNA) heterogeneity with poor prognosis, without a clear link between mutational ITH and outcome. This study extends to the first 421 patients, leveraging multiregion exome data to interrogate tumour evolutionary dynamics, selection, smoking mutagenesis, the timing of whole-genome doubling (WGD), subclonal expansions, and their relationships to disease-free survival (DFS) and relapse patterns.
Literature Review
Prior multiregion sequencing studies across cancers, including early TRACERx analyses, established branched evolution and extensive ITH but were limited by smaller cohorts. The earlier TRACERx analysis (n=100) linked SCNA ITH, but not mutational ITH, to poor prognosis. The role of the smoking-related mutational signature SBS4 has been characterized in lung cancer, and APOBEC-associated signatures (SBS2/13) have been implicated in ongoing mutagenesis. Debate persists regarding neutral versus selection-driven subclonal evolution in untreated tumours. This study situates its analyses within those findings, adding statistical power to detect gene- and pathway-level selection, timing of events, and evolutionary dependencies.
Methodology
Design: Prospective observational cohort (TRACERx; NCT01888601) of early-stage NSCLC patients eligible for primary surgery, with centralized ethics approval and informed consent. The cohort comprises the first 421 patients across 19 UK sites, representative of early-stage operable NSCLC demographics.
Sampling and sequencing: 1,644 tumour regions (1,554 at primary surgery; 90 during follow-up) passed QC. Whole-exome sequencing (WES) at median 413× (IQR 367–474). RNA libraries for fusions prepared from combined regional RNA using a bespoke Archer FusionPlex panel.
Histopathology: Central review confirmed subtype and growth patterns; staging per 7th TNM for analyses. Multiple primaries versus metastases adjudicated with WES-based clonal origin when possible.
Variant calling and signatures: Somatic SNVs/indels called via VarScan2 and MuTect with stringent filters; artefact control (e.g., SBS45). Mutational signatures de novo via hierarchical Dirichlet process and deconvolution using deconstructSigs (COSMIC v3.2), focusing on SBS1, SBS2, SBS4, SBS5, SBS13, SBS17b, SBS44, SBS92.
Copy number and WGD: SCNAs inferred via ASCAT with Sequenza support; multi-sample segmentation to classify loss/neutral/gain/amplification and LOH; mirrored subclonal allelic imbalance detected. WGD status per major allele copy-number thresholds across genome; subclonal and parallel WGDs inferred by ParallelGDDetect using mutation copy numbers and branch-specific doubling, benchmarked on realistic simulations.
Phylogenetics and clonality: Mutation clustering via extended PyClone with presence/absence pre-clustering and PhyloCCF estimation correcting for subclonal CNAs. Phylogenetic reconstruction using CONIPHER with error-correction and enumeration of alternative trees. Clusters classified as truncal versus non-truncal; per-region clonal/subclonal status assigned by PhyloCCF thresholds.
Selection and dependencies: Gene- and pathway-level selection estimated via dNdScv separately for truncal and subclonal mutations. Mutual exclusivity/co-occurrence (DISCOVER) and ordering relationships between truncal and subclonal events assessed, including drivers, SCNAs, signatures, and WGD.
Smoking history: Detailed patient-reported data converted to cigarettes/day and pack-years; never-, ex- and current-smoker categories defined; negative binomial GLMs used to model SBS4 counts versus clinical covariates and anatomical location.
ITH metrics and expansions: Mutational ITH as fraction of subclonal mutations; SCNA ITH as fraction of aberrant genome with heterogeneous events. Subclonal expansion characterization via illusion of clonality and a recent subclonal expansion score (maximum terminal-node PhyloCCF per tumour).
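The metric definitions above can be illustrated with a minimal Python sketch. The function names, the input layout (a mapping from terminal-node cluster to per-region PhyloCCF values) and the example numbers are assumptions made for illustration, not the study's code or data. In particular, taking the maximum across both terminal clusters and regions is one reading of "maximum terminal-node PhyloCCF per tumour".

```python
from typing import Dict, List


def mutational_ith(n_subclonal: int, n_total: int) -> float:
    """Mutational ITH: fraction of a tumour's mutations that are subclonal."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_subclonal / n_total


def scna_ith(heterogeneous_len: float, aberrant_len: float) -> float:
    """SCNA ITH: fraction of the aberrant genome carrying heterogeneous events."""
    if aberrant_len <= 0:
        raise ValueError("aberrant_len must be positive")
    return heterogeneous_len / aberrant_len


def recent_expansion_score(terminal_phyloccf: Dict[str, List[float]]) -> float:
    """Maximum PhyloCCF over all terminal-node clusters and regions of one tumour."""
    return max(ccf for ccfs in terminal_phyloccf.values() for ccf in ccfs)


# Hypothetical tumour: two terminal clusters measured in three regions.
example = {"cluster_5": [0.10, 0.80, 0.00], "cluster_7": [0.40, 0.20, 0.30]}
print(mutational_ith(30, 120))          # 0.25
print(recent_expansion_score(example))  # 0.8
```

A high score here means that at least one terminal (most recent) subclone occupies a large fraction of cancer cells in some region.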
Outcomes: DFS analyses on 392 eligible patients (highest-stage tumour per patient used; collision/multiple primaries handled by maxima across tumours). Cox models (univariate/multivariable); time-varying hazard and restricted mean time lost (RMTL) analyses; models adjusted for stage, age, pack-years, histology, adjuvant therapy. Secondary analyses in relapsers for time-to-relapse and site (intrathoracic vs extrathoracic).
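Restricted mean time lost up to a horizon τ is the area above the survival curve on [0, τ], i.e. τ minus the restricted mean survival time. The sketch below computes an unadjusted RMTL from a hand-rolled Kaplan-Meier estimate. It is meant only to show what the quantity measures: the study reports covariate-adjusted RMTL ratios, which this code does not reproduce, and the function names and toy data are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival) steps at each distinct time with at least one event.

    events[i] is 1 for an observed event (relapse or death) and 0 for censoring.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            steps.append((t, surv))
        n_at_risk -= n_removed
    return steps


def rmtl(times: List[float], events: List[int], tau: float) -> float:
    """Restricted mean time lost: area above the KM curve between 0 and tau."""
    area_under = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area_under += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area_under += prev_s * (tau - prev_t)
    return tau - area_under


# Toy data (months); the third subject is censored at 6 months.
print(rmtl([2, 4, 6, 8], [1, 1, 0, 1], tau=10))
```

An RMTL ratio between two groups is then the RMTL of one group divided by that of the other at the same τ; a ratio above 1 at 12 months indicates more disease-free time lost in the first year.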
Key Findings
Cohort and phylogeny:
- 421 patients yielded 432 genomically independent tumours: 248 LUAD, 138 LUSC, 46 other NSCLC subtypes. Median age 69; stages I–III.
- WES-based clonal origin generally matched clinical assessments; revealed discordance in 6/74 (8%) pairs, potentially altering management.
- Collision tumours genomically identified in 3/421 patients (1%), with distinct KRAS mutations in different colliding LUADs.
- Phylogenies constructed for 401 tumours (1,428 regions); average 4.2 truncal and 2.8 subclonal driver mutations; 7% with pathogenic germline variants.
WGD and genome instability:
- ≥1 WGD detected in 307/401 tumours (77%). Subclonal WGD in 78/401 (19%). Multiple subclonal WGDs in parallel in 39/401 (10%). In 62% of tumours with parallel subclonal WGDs, regions reached similar ploidy but distinct WGD events were resolved by mutation copy numbers.
Smoking mutagenesis:
- Among ever-smoker LUADs, 161/215 (75%) showed smoking-mediated signatures (SBS4 or SBS92). In LUAD, truncal SBS4 increased with pack-years (r=0.31, P<0.001); in LUSC, truncal SBS92 increased with pack-years (r=0.32, P<0.001).
- 8% of LUADs in ever-smokers lacked SBS4/SBS92 (17/215), including some with >15 years of smoking; these were enriched for EGFR mutations (P=0.003; OR=11.7) and MET exon-14 skipping or RET/ROS1/ALK fusions (P=0.002; OR=15.6), resembling never-smoker tumours.
- Anatomical gradients in LUAD: more truncal SBS4-associated mutations on right versus left lung (rate ratio=1.63; P=0.0022) and upper/middle versus lower lobes (rate ratio=1.98; P<0.001).
Selection and evolutionary dependencies:
- Frequent subclonal positive selection in LUAD: 22/40 common genes under significant subclonal selection (e.g., TP53, KRAS, STK11, PIK3CA, RB1, SMARCA4). Some genes showed late-stage selection only (e.g., HIST1H1C, KMT2D, PTEN, RUNX1, SMAD4). In LUSC, 11/31 genes showed significant subclonal selection (e.g., ATM, B2M, KEAP1, NFE2L2, PIK3CA, SETD2).
- Timing differences by histology (e.g., B2M truncal selection in LUAD vs subclonal in LUSC), and pathway-level patterns (e.g., SWI–SNF and NOTCH components under subclonal selection in LUAD; RTK/MYC/NRF2 mainly truncal).
- Parallel evolution observed in drivers (e.g., B2M, SMARCA4, BAP1, KMT2D) and SCNAs (e.g., losses PTEN, B2M, SMAD2; gains MYC, PIK3CA, EGFR).
- Context dependencies: truncal mutual exclusivity between TP53 and KRAS (q<0.001) and TP53 and EGFR (q=0.031); KRAS truncal mutually exclusive with truncal SBS2/13 (q=0.001). Subclonal SBS1 mutually exclusive with SBS2/13 (q=0.008), subclonal WGD (q<0.001), and subclonal TERT/MYC amplification.
- Ordering: a truncal TP53 mutation decreased the likelihood of a subsequent subclonal TP53 mutation (q<0.001; OR=0.02) but increased the likelihood of subclonal APOBEC mutagenesis (SBS2/13) (q=0.013; OR=2.15), subclonal TERC amplification (q=0.012; OR=3.53), and subclonal WGD (q=0.015; OR=2.51). Truncal WGD decreased the likelihood of a subclonal TP53 mutation (q=0.001; OR=0.18).
Subclonal expansions and prognosis:
- Subclones showing illusion of clonality exhibited stronger subclonal selection in LUAD (dN/dS=2.09; 95% CI 1.29–3.39) versus other subclones (dN/dS=1.33; 95% CI 1.00–1.76). LUSC had more illusion-of-clonality events than LUAD (P=0.0049).
- Recent subclonal expansions (terminal nodes) associated with low regional subclonal diversity and worse DFS: HR=1.70 (split by median; 95% CI 1.27–2.28) and HR=1.32 per 0.3 increase in expansion score (95% CI 1.12–1.55).
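A hazard ratio quoted "per 0.3 increase" can be rescaled to other step sizes if the covariate enters the Cox model log-linearly, because HR(Δ) = exp(β·Δ). The short sketch below shows the arithmetic; the function name is hypothetical, and applying the result to steps other than the reported one is an extrapolation that holds only under that log-linearity assumption.

```python
import math


def rescale_hazard_ratio(hr: float, reported_step: float, new_step: float) -> float:
    """Convert a hazard ratio per `reported_step` into one per `new_step`.

    Assumes a log-linear effect, i.e. HR(step) = exp(beta * step).
    """
    beta = math.log(hr) / reported_step  # log-hazard change per unit of the score
    return math.exp(beta * new_step)


# HR = 1.32 per 0.3 increase in expansion score, rescaled to a 0.6 increase.
print(round(rescale_hazard_ratio(1.32, 0.3, 0.6), 2))  # 1.74
```

Doubling the step squares the hazard ratio (1.32² ≈ 1.74), which is why per-increment and median-split hazard ratios are not directly comparable.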
ITH and relapse timing/site:
- SCNA ITH associated with shorter DFS (HR=1.38; 95% CI 1.03–1.83). No significant association for mutational ITH (HR=0.85; 95% CI 0.64–1.13).
- High SCNA ITH enriched for early relapses (<1 year) (adjusted RMTL ratio at 12 months 2.23; 95% CI 1.39–3.56; P<0.001) and extrathoracic relapse (Fisher’s exact P=0.0083; OR=2.7).
- Subclonal WGD predicted shorter DFS: vs no WGD HR=1.63 (95% CI 1.08–2.47); vs truncal WGD HR=1.56 (95% CI 1.11–2.22). Truncal-only WGD similar to no WGD (HR=1.04; 95% CI 0.73–1.49). Subclonal WGD independent of SCNA ITH in multivariable model.
- In multivariable models including stage, age, pack-years, histology, adjuvant therapy, recent subclonal expansion score remained significant (HR=1.53 split by median; 95% CI 1.13–2.10; or HR=1.25 per 0.3 increase; 95% CI 1.06–1.50), while SCNA ITH and subclonal WGD were not. Among relapsers, SCNA ITH independently predicted shorter time to relapse (P=0.0063; coefficient −201 days; 95% CI −343 to −58) and extrathoracic site (P=0.0087; OR=3.17; 95% CI 1.36–7.73).
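Several results above are reported as odds ratios with Fisher's exact P values. As a rough illustration of how such numbers arise from a 2×2 table, the sketch below computes the sample odds ratio and a two-sided Fisher's exact P value using only the standard library. The counts are invented for the example and are not taken from the study, and the function names are hypothetical.

```python
from math import comb, inf


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio for the table [[a, b], [c, d]]."""
    if b * c == 0:
        return inf
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Invented counts: rows = high / low SCNA ITH, columns = extrathoracic / intrathoracic relapse.
print(odds_ratio(3, 1, 1, 3))
print(fisher_exact_two_sided(3, 1, 1, 3))
```

With such small invented counts the odds ratio is large but the P value is far from significant, which illustrates why the effect size and the P value need to be read together.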
Discussion
This large multiregion WES cohort enables quantification of subclonal selection in early-stage, treatment-naive NSCLC at the gene and pathway level. Subclonal positive selection was frequent, sometimes exceeding truncal selection for specific genes/pathways, underscoring ongoing adaptive evolution beyond tumour initiation. Subclones achieving local dominance (illusion of clonality) bore stronger selection signals, and evidence of recent subclonal expansion in at least one region was associated with worse DFS, supporting a model in which active clonal sweeps portend aggressive behaviour and greater metastatic potential.
The study delineates evolutionary dependencies: truncal TP53 disruptions predisposed to APOBEC activity, telomere-associated amplifications (TERC), and subsequent subclonal WGD. Conversely, truncal WGD reduced later TP53 hits, consistent with constraints after doubling. Parallel subclonal WGDs on different branches were relatively common and associated with copy-number instability.
Clinically, SCNA ITH best stratified early and extrathoracic relapse risk, while recent subclonal expansion offered added prognostic value for DFS. Subclonal (but not truncal) WGD was also associated with poor DFS, highlighting the importance of the timing of genome doubling.
A subset of LUADs in ever-smokers lacked tobacco mutational signatures yet were enriched for canonical never-smoker drivers (EGFR, RET/ROS1/ALK fusions, MET exon-14 skipping), suggesting tobacco-independent tumorigenic mechanisms in some exposed individuals and aligning with observations that early smoking cessation can mitigate mutational burden in bronchial epithelium.
Overall, integrating evolutionary metrics—SCNA ITH, subclonal WGD, and recent subclonal expansions—can improve prediction of relapse likelihood, timing, and site, with implications for postoperative risk stratification and surveillance.
Conclusion
By reconstructing tumour evolution across 1,644 regions from 421 early-stage NSCLC patients, this study demonstrates widespread subclonal positive selection, defines timing-dependent evolutionary dependencies among drivers, mutational processes, and WGD, and links specific evolutionary features to prognosis. Key clinical insights include: (1) SCNA ITH as a marker of early and extrathoracic relapse; (2) subclonal (but not truncal) WGD and recent subclonal expansions as predictors of shorter DFS; and (3) the presence of tobacco-signature-negative LUADs in ever-smokers enriched for never-smoker-like drivers. These findings argue for incorporating tumour evolutionary architecture into clinical risk models. Future work should validate these metrics in independent cohorts, explore integration with ctDNA monitoring, and investigate mechanisms driving tobacco-independent LUAD initiation and the biology of recent subclonal expansions and subclonal WGD.
Limitations
- Cohort represents early-stage, surgically resected NSCLC within a UK healthcare system; generalizability to advanced disease, non-surgical populations, or other regions may be limited.
- WES (not WGS) may under-detect certain structural variants and non-exonic mutations; copy number and signature inferences are constrained by exome coverage.
- Phylogenetic reconstruction and clonality estimates depend on modelling assumptions (e.g., mutation multiplicity, SCNA calling) and tumour purity; 401/432 tumours had trees reconstructed, leaving some unmodelled.
- Subclonal WGD detection relies on mutation copy-number doubling and regional ploidy thresholds; sensitivity is limited when WGD subclones are at low cancer cell fractions, and the method was conservatively limited to detecting up to two subclonal WGDs.
- For ancestral subclonal expansions, it is not possible to disentangle whether the expansion is driven by the ancestral clone itself versus hitchhiking descendants.
- Smoking signature classification uses thresholds and may misclassify tumours with low mutation counts or confounding mutational processes; SBS92 associations in LUAD were not robust.
- Observational design limits causal inference between evolutionary features and clinical outcomes; unmeasured confounders may remain.