Geographic variation of mutagenic exposures in kidney cancer genomes

S. Senkin, S. Moody, et al.

This study examines how the mutations found in kidney cancer genomes vary from country to country. The authors analysed 962 clear cell renal cell carcinoma (ccRCC) samples from 11 countries and found geographically distinct patterns of mutagenic exposure, a direct mutational link between tobacco smoking and kidney cancer, and evidence of environmental exposures whose sources have yet to be identified.

Introduction
The study addresses why the incidence of clear cell renal cell carcinoma (ccRCC) varies markedly across countries, even though the known risk factors (obesity, hypertension, tobacco) explain less than half of the global burden and do not account for geographic or temporal trends. The authors hypothesize that unidentified environmental or lifestyle mutagenic exposures contribute to ccRCC and can be detected as distinct mutational signatures in tumor genomes. Prior, smaller ccRCC sequencing studies lacked the geographic diversity and statistical power to link risk factors with mutational processes. By profiling somatic mutations and signatures across diverse populations with different ccRCC incidence rates, the study aims to identify mutagenic exposures, evaluate the mechanisms of known risk factors, and relate mutational processes to incidence patterns.
Literature Review
Mutational signatures provide a record of the endogenous and exogenous mutational processes that have acted on a cancer genome, and they have been catalogued across tumor types (COSMIC signatures, including 71 SBS/DBS and 18 ID signatures, many with known etiologies). ccRCC incidence is highest in Central/Northern Europe and has risen in high-income countries. Known ccRCC risk factors (obesity, hypertension, smoking) explain under 50% of cases and do not capture geographic variation. Previous ccRCC genomic studies were limited in size and scope, often focusing on restricted geographies and lacking comprehensive signature–risk factor associations. Aristolochic acid (AA) exposure has been linked to Balkan endemic nephropathy (BEN) in the Balkans and to SBS22 in renal/urothelial and liver cancers in parts of Europe and Asia. Tobacco-related signatures (SBS4, DBS2) and other signatures (e.g., SBS1/5 for aging, SBS18 for oxidative damage, and signatures of mismatch repair defects) have known etiologies, providing a framework for interpreting findings. The study builds on this knowledge to uncover geographically variable exposures in ccRCC.
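To make the idea of an SBS signature concrete, the sketch below shows how a single base substitution is commonly assigned to one of the 96 channels (6 substitution types × 16 flanking contexts) on which SBS signatures are defined. It follows the widely used convention of reporting the mutated base as a pyrimidine; the function and variable names are illustrative and are not taken from the study's code.

```python
# Minimal sketch of SBS-96 channel assignment (illustrative names, stdlib only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs96_channel(context, alt):
    """Map a substitution to its SBS-96 channel label, e.g. 'A[C>T]G'.

    context: trinucleotide with the reference base in the middle.
    alt: the mutated base.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or alt not in COMPLEMENT or alt == context[1]:
        raise ValueError("need a trinucleotide and a different alternate base")
    if context[1] in "AG":
        # Purine reference: describe the event on the opposite strand
        # so that the mutated base is always C or T.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


# All 96 channels: 6 pyrimidine substitutions x 4 5' bases x 4 3' bases.
ALL_CHANNELS = [
    f"{five}[{sub}]{three}"
    for sub in ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
    for five in "ACGT"
    for three in "ACGT"
]
```

Under this convention an A>T change read on one strand is counted as T>A, the substitution class that the summary describes as characteristic of aristolochic acid exposure.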
Methodology
Design and cohort: 962 primary ccRCC cases from 11 countries (Czech Republic 259, Russia 216, UK 115, Brazil 96, Canada 73, Serbia 69, Romania 64, Japan 36, Lithuania 16, Poland 13, Thailand 5), covering a wide range of age-standardized incidence rates (ASR). Inclusion criteria: age ≥18, treatment-naïve, confirmed ccRCC. Standardized epidemiological questionnaires captured sex, age at diagnosis, BMI, hypertension, and smoking; detailed residential histories were collected in the Czech Republic, Romania, and Serbia. Centralized pathology review required ≥50% tumor cell content.
Biospecimens and sequencing: DNA was extracted from tumor and matched normal samples (blood, except in Japan, where adjacent normal kidney was used; validation against blood showed spectra concordance >0.99). Whole-genome sequencing was performed on the Illumina NovaSeq 6000 (150 bp paired-end) at a target coverage of ~54x for tumor and ~31x for normal. QC thresholds: tumor ≥30x, normal ≥15x; evenness MoM 0.92–1.09; contamination <3%. All 962 cases passed.
Variant calling and copy number: Copy number was called with ASCAT/Battenberg, SNVs with CaVEMan (plus Strelka2 consensus), indels with Pindel (plus Strelka2), and structural variants with BRASS. Additional filters were applied to reduce artifacts, and only variants called by both pipelines were retained.
Mutational matrices and signature extraction: SBS/DBS/ID matrices were generated with SigProfilerMatrixGenerator. De novo signatures were extracted with SigProfilerExtractor (NMF with Poisson resampling; SBS-1536 and SBS-288 contexts; DBS-78; ID-83) and with mSigHdp (Bayesian HDP) using country as the hierarchy. Extracted signatures were decomposed into COSMIC v3.3 signatures where possible; those not captured by COSMIC were treated as novel.
Signature attribution: The MSA signature attribution tool (NNLS with iterative penalty optimization; L2 similarity thresholds) was used to assign signature activities per sample and to derive 95% CIs by parametric bootstrap. A signature was called present in a sample when its 95% CI lay entirely above zero.
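The core of the attribution step is a non-negative least squares (NNLS) fit: given known signature profiles, find non-negative activities whose weighted sum best reproduces a sample's mutation counts. The sketch below is not the MSA tool. It omits the penalty optimization and bootstrap, and as a stand-in for a dedicated NNLS solver it uses simple multiplicative updates, which keep activities non-negative by construction. The four-channel signatures in the usage note are toy values, not real COSMIC profiles.

```python
# Minimal NNLS-style attribution sketch (stdlib only, toy-sized inputs).
# Assumption: multiplicative updates replace a dedicated NNLS solver.


def attribute_signatures(spectrum, signatures, iterations=500, eps=1e-12):
    """Estimate non-negative activities x minimising ||spectrum - S x||^2.

    spectrum: mutation counts per channel for one sample.
    signatures: list of signature profiles, each a list of per-channel
        probabilities with the same length as spectrum.
    Returns one activity (mutations attributed) per signature.
    """
    n_channels = len(spectrum)
    k = len(signatures)
    if k == 0 or any(len(sig) != n_channels for sig in signatures):
        raise ValueError("signatures must match the spectrum length")

    # Precompute S^T m and S^T S, which stay fixed across iterations.
    stm = [sum(sig[c] * spectrum[c] for c in range(n_channels))
           for sig in signatures]
    sts = [[sum(a[c] * b[c] for c in range(n_channels)) for b in signatures]
           for a in signatures]

    # Start from an even split of the total mutation count.
    x = [sum(spectrum) / k] * k
    for _ in range(iterations):
        # Each activity is rescaled by a non-negative ratio, so it can
        # shrink towards zero but never become negative.
        x = [
            x[i] * stm[i] / (sum(sts[i][j] * x[j] for j in range(k)) + eps)
            for i in range(k)
        ]
    return x
```

For example, with toy signatures `[0.7, 0.1, 0.1, 0.1]` and `[0.1, 0.1, 0.1, 0.7]`, a spectrum of `[220, 40, 40, 100]` is an exact mixture of 300 and 100 mutations, and the fit recovers those activities. Multiplicative updates approach an exact zero only slowly, so a production analysis would normally use a proper NNLS solver.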
Driver mutations: dNdS analysis (across all genes and within a panel of 369 cancer genes) was used to identify positively selected genes. Drivers were annotated using the Cancer Gene Census, Cancer Genome Interpreter, and Mutation Mapper; criteria included truncating mutations in tumor suppressor genes (TSGs), known oncogenic alterations, and enrichment metrics.
Evolutionary timing: Subclonal reconstruction was performed with DPClust on tumors with purity ≥40%. Mutations were split into clonal and subclonal, and signature activities were compared between the two compartments using the Wilcoxon signed-rank test with Benjamini–Hochberg (BH) correction.
Epidemiology/signature associations: Logistic and linear regressions were adjusted for age, sex, country, and smoking, with Bonferroni correction considered. Signature burdens were tested for association with country-specific ASR (GLOBOCAN), both globally and within Czech regions.
Metabolomics: Untargeted UHPLC-QTOF-MS plasma profiling was performed in 901 subjects (all countries except Japan; some subjects had no sample) to identify features associated with signatures. The analysis comprised preprocessing, recursive feature reduction, and regression adjusted for sex, age, BMI, batch, and acquisition order, with permutation-based FDR and random forest importance. Targeted assays measured PFAS and cystatin C isoforms; their associations with signatures were tested with quasi-Poisson/logistic models.
Geospatial analyses: Multi-membership mixed models incorporating each subject's full residential history and duration of residence in each region (Czech Republic, Romania, Serbia) were used to estimate regional effects on signature attribution, adjusted for age and sex.
Data/code availability: WGS data and metadata are deposited in EGA (study EGAS00001003542) and metabolomics data in MetaboLights (MTBLS9394); analysis code is available at the Mutographs RCC GitLab.
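The evolutionary timing comparison reports BH-corrected q-values rather than raw P values. As a minimal sketch of that correction, the function below converts a list of P values into Benjamini-Hochberg adjusted values using the standard step-up procedure; the function name is illustrative and the study's own implementation may differ.

```python
# Minimal Benjamini-Hochberg adjustment sketch (stdlib only).


def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input P values."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity so that
    # a smaller P value never receives a larger q-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues
```

A signature would then be reported as differing between clonal and subclonal mutations when its q-value falls below the chosen FDR threshold.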
Key Findings
- Somatic mutation burdens varied significantly by country: SBS 803–45,376 (median 5,093), DBS 2–240 (median 53), indels 10–14,770 (median 695); Kruskal–Wallis P<2×10⁻²³ (SBS), P<2×10⁻⁴ (DBS), P<6×10⁻¹⁴ (indels). Romania showed notably higher burdens. PCA of SBS classes revealed clusters of Romanian/Serbian (AA exposure) and Japanese (SBS12) cases.
- Identified COSMIC-like SBS signatures: SBS1 (5-mC deamination), SBS2/SBS13 (APOBEC), SBS4 (tobacco), SBS5 (clock-like), SBS12 (unknown), SBS18 (ROS), SBS21/SBS44 (MMR deficiency), SBS22a (aristolochic acid). Novel SBS: SBS40a, SBS40b, SBS40c (decomposition of SBS40), and two additional de novo signatures, SBS_H and SBS_I; novel AA-associated components included SBS22b (SBS_I-like, T>A-rich), DBS20, and ID23.
- Aristolochic acid exposure: High prevalence of SBS22a/SBS22b in Romania (SBS22a 70%, SBS22b 75%), Serbia (23% and 48%), and Thailand (60% each; small n=5). Strong correlations among SBS22a, SBS22b, DBS20, and ID23; exposure often at high mutation burdens; higher SBS22a/22b in Romania, including areas outside recognized BEN zones; only five cases resided within BEN areas. This suggests widespread AA exposure across parts of southeastern Europe.
- Japan-specific signature: SBS12 present in 72% of Japanese ccRCC vs 2% elsewhere (P=4.7×10⁻⁷); replicated in independent Japanese cohorts (12/14, 85% and 46/61, 75%). SBS12 also enriched in Japanese hepatocellular carcinomas (P=3.8×10⁻¹⁵). SBS12 shows T>C substitutions with strong transcriptional strand bias (indicative of bulky DNA adducts, likely exogenous), but the agent is unknown.
- Ubiquitous signature associated with incidence: SBS40b present across countries, with average country-specific mutation burdens positively associated with kidney cancer ASR (SBS40a P=0.0022; SBS40b P=5.1×10⁻⁷). Within Czech regions, SBS40b burden differed by regional incidence (P=0.011), highest in the highest-risk region. Indel signatures ID5 and ID8 (together ~60% of indel burden) strongly associated with ASR (P=1.3×10⁻¹⁰ and 7.1×10⁻⁵) and correlated with SBS40b (Spearman r=0.79 and 0.74), consistent with a shared process.
- Metabolomics associations: Cotinine and hydroxycotinine (nicotine metabolites) associated with SBS4 (P=1.9×10⁻⁷ and 2.9×10⁻¹⁰). TMAP (N,N,N-trimethyl-L-alanyl-L-proline betaine), a marker of reduced kidney function, associated with SBS40b (P=1.2×10⁻⁵); cystatin C and creatinine correlated with TMAP and showed positive associations with SBS40b (P=0.023 and 0.058), linking SBS40b to impaired renal function.
- Driver mutations: 1,962 driver mutations in 136 genes detected; frequencies of canonical ccRCC drivers (VHL, PBRM1, SETD2, BAP1) consistent across countries. AA-exposed tumors showed enrichment of T>A driver mutations vs unexposed (25% vs 13%, P=0.0062), including in VHL (30% vs 16%) and at exome-wide level (27% vs 12%), indicating a proportional contribution of AA mutagenesis to drivers. No significant enrichment of T>C drivers for SBS12 (20% vs 12%, P=0.069).
- Evolutionary timing: Exogenous-associated signatures (SBS12, SBS22b, SBS40b) were more active in clonal (early) than subclonal mutations (q=0.040, 0.022, and 2.3×10⁻⁵, respectively), consistent with exposure of normal cells before tumor expansion; SBS22a showed no significant difference. Endogenous processes (APOBEC/SBS13 and ROS/SBS18) were enriched in subclones (q=1.6×10⁻⁴ and 3.2×10⁻⁷).
- Risk factor associations: Total mutation burdens and several signatures (including SBS4, DBS2) increased with age; overall burdens were higher in males than females, with SBS40b also higher in males (P=9.3×10⁻⁵). Tobacco smoking associated with SBS4 (P=5.3×10⁻⁶) and DBS2 (P=2.4×10⁻⁷), consistent with direct mutagenic exposure of the kidney. No mutational signature associations with obesity, hypertension, or diabetes in observational or PRS analyses (except genetically inferred smoking with DBS2, P=0.01).
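The link between ID5/ID8 and SBS40b above is reported as a Spearman correlation, which is the Pearson correlation of the ranked values and is therefore robust to the skewed distribution of mutation burdens. A minimal sketch follows, with tied values given their average rank; the function names are illustrative, and the numbers in the usage note are invented for illustration rather than taken from the study.

```python
# Minimal Spearman rank correlation sketch (stdlib only).
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Called on per-sample burdens such as `spearman([10, 20, 30, 40], [1, 3, 2, 4])` (made-up values), the function returns a coefficient between -1 and 1, where values near 1 indicate that the two signatures rise together across samples.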
Discussion
Sequencing of 962 ccRCC tumors from 11 countries uncovered multiple mutational processes with strong geographic variation, indicating diverse environmental mutagenic exposures contributing to ccRCC. Aristolochic acid-related signatures (SBS22a/SBS22b/DBS20/ID23) were prevalent in Romania and Serbia, extending beyond recognized BEN areas and implying extensive exposure in southeastern Europe. A Japan-specific, exogenous-like signature (SBS12) was common in kidney and liver cancers in Japan, suggesting a population-wide exposure whose agent remains unknown. The pervasive SBS40b signature, largely confined to ccRCC, correlates with kidney cancer incidence across countries and within regions, and associates with biomarkers of impaired renal function, implying a ubiquitous exposure or renal state that increases mutational load and potentially contributes to incidence differences. These findings support a model in which both mutagenic and non-mutagenic carcinogenic mechanisms drive ccRCC risk. The presence of tobacco-associated signatures but absence of signatures linked to obesity and hypertension suggests that the latter act through non-mutagenic pathways, affecting clone expansion rather than mutation generation. Timing analyses indicate that the exogenous processes (SBS12, SBS22b, SBS40b) occur early in tumor evolution, consistent with exposures in normal renal cells preceding malignant transformation. Together, the results highlight the power of global cancer genomics to reveal hidden carcinogenic exposures and to guide public health investigations.
Conclusion
This study demonstrates substantial geographic variation in mutational signatures in ccRCC, revealing: (1) widespread aristolochic acid-associated mutagenesis in parts of southeastern Europe; (2) a Japan-enriched exogenous mutational process (SBS12) affecting kidney and liver cancers; and (3) a ubiquitous ccRCC-specific process (SBS40b) whose mutational burden correlates with kidney cancer incidence and renal dysfunction biomarkers. Tobacco signatures confirm direct mutagenic effects in the kidney, while obesity and hypertension likely contribute via non-mutagenic mechanisms. The work underscores the utility of large-scale, geographically diverse whole-genome sequencing to identify carcinogenic exposures that may affect tens of millions of people. Future directions include identifying sources and extent of aristolochic acid exposure in southeastern Europe; determining the agent underlying SBS12 and its geographic reach, including studies of Japanese migrants; elucidating the origin and biology of SBS40b, including whether it reflects a ubiquitous environmental exposure or renal metabolic state; and expanding global mutational signature studies across additional countries and cancer types, including sequencing of normal tissues to map exposures at population scale.
Limitations
- Sample representation: Some countries had small sample sizes (e.g., Thailand n=5), limiting generalizability for those regions.
- Exposure inference: Etiologies of SBS12 and SBS40b remain unknown; signature–exposure links are inferred from mutational patterns and geography, not direct exposure measurements.
- Biospecimen constraints: Plasma metabolomics excluded Japanese cases, precluding metabolome–SBS12 association analysis; adjacent normal tissue was used as the matched normal in Japan (validated against blood in a subset).
- Retrospective, multi-center data: Despite harmonization, differences in data collection protocols and potential unmeasured confounding may remain.
- Ecological correlations: Associations with country/regional ASR are ecological and may not capture individual-level exposure–risk relationships.