
Spatial consistency of co-exposure to air and surface water pollution and cancer in China
J. Jiang, L. Zhang, et al.
This study shows that air and surface water pollution are spatially linked and estimates that co-exposure accounted for nearly 63,000 excess cancer cases in 2016 across 219 study areas in China. Conducted by Jingmei Jiang and colleagues, the research argues for coordinated environmental and health policies.
Introduction
The study addresses a central question: whether air and surface water environments are spatially connected and whether multiple cancer types cluster in areas with poor environmental conditions arising from co-exposure to pollutants. Rapid urbanization and industrialization have led to widespread environmental deterioration, with over 90% of the world’s population living in areas exceeding WHO PM2.5 standards. Evidence links PM2.5 to lung cancer and NO2 to breast cancer. Surface water pollution, like air pollution, crosses administrative boundaries, and it reaches people through multiple exposure routes. Prior work has largely assessed individual pollutants or local water pollution impacts, and it has observed overlapping spatial patterns of cancer types, suggesting potential common environmental causes. However, existing evidence is fragmented and lacks a comprehensive framework to evaluate the holistic environment–cancer relationship. The authors hypothesize that air and surface water conditions are spatially correlated and that cancers tend to cluster where these environmental conditions are poorer. To test this, they create a unified Spatial Evaluation System for Environment and Cancer (SESEC) and a common graded scale to measure co-pollution across China.
Literature Review
Background literature indicates strong links between fine particulate matter (PM2.5) and lung cancer, with meta-analytic support, and growing evidence implicating nitrogen dioxide (NO2) in breast cancer and possibly other cancers. Ozone and other air pollutants have been increasingly monitored. Surface water pollution studies have often been local, focusing on specific contaminants, with evidence that river water quality has been improving in China since 2003 but remains heterogeneous. Spatial analyses have noted clustering and overlapping distributions of multiple cancer types across regions, implying shared environmental risks. However, prior investigations have typically examined single pollutants or single media, using indices (AQI, WQI) that rely on single-factor evaluation and ignore spatial relationships between monitoring sites. This fragmentation underscores the need for a comprehensive, spatially explicit, multi-pollutant, multi-cancer evaluation system.
Methodology
Study design and data integration: The authors constructed the Spatial Evaluation System for Environment and Cancer (SESEC) by harmonizing national monitoring data for air quality (2015, CNEMC), surface water quality (monthly, 2021, Environmental Quality Monitoring Network), and cancer incidence (2016, China Cancer Registry Annual Report). The basic spatial unit was the prefecture-level area; an analysis unit required the presence of all three components (air monitoring site, surface water monitoring section, and a cancer registry institute). This yielded 219 non-overlapping analysis units covering 377 million people across mainland China. For pollutants, averages across multiple points within a unit were computed to represent unit-level exposure.
Environmental and cancer items: Air pollutants included six WHO-guideline pollutants (PM2.5, PM10, NO2, O3, SO2, CO). Surface water included 13 organic/quality indicators: volatile phenol, sulphate, fluoride (F), anionic surfactant (AS), total nitrogen (TN), ammonia nitrogen (NH3-N), permanganate index (COD_Mn), chemical oxygen demand (COD), biochemical oxygen demand (BOD5), total phosphorus (TP), petroleum, dissolved oxygen (DO), and cyanide. Thirteen cancer types were analyzed due to high incidence, increasing trends, or low survival: oesophageal, stomach, colorectal, liver, gallbladder, pancreatic, lung, bone, breast, kidney, brain cancers, leukemia, and lymphoma.
Thresholds and temporal scales: Air thresholds used China’s National Ambient Air Quality Standards annual limits (PM2.5: 35 µg/m3; PM10: 70 µg/m3; NO2: 40 µg/m3; SO2: 60 µg/m3). For CO and O3, thresholds were set using the Bulletin on the State of China’s Environment (2015): the 96.7th percentile for CO and the 84th percentile for O3. Surface water thresholds were defined as the national 75th percentile for each indicator due to the lack of unified health-based standards. Annual averages (or peak-season averages for O3) were used to smooth temporal variability. Water measurements below the limit of detection (LOD) were imputed as half the LOD; missing monthly values (1.8–9.2% per month) were handled by annual averaging.
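The water-data preparation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the sample readings, and the linear-interpolation percentile are all assumptions, and "annual averaging" is interpreted here as the mean of the months that were actually observed.

```python
from statistics import mean

def impute_below_lod(value, lod):
    """Replace a measurement below the limit of detection with LOD / 2."""
    return lod / 2 if value < lod else value

def annual_mean(monthly_values, lod):
    """Average the observed months; None marks a missing month (assumed handling)."""
    observed = [impute_below_lod(v, lod) for v in monthly_values if v is not None]
    return mean(observed)

def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100] (one of several common definitions)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

# Hypothetical monthly readings for one indicator in one unit.
readings = [0.02, 0.001, None, 0.04]
unit_mean = annual_mean(readings, lod=0.01)

# Hypothetical unit-level means across the country; the 75th percentile is the threshold.
national_means = [1.0, 2.0, 3.0, 4.0, 5.0]
threshold = percentile(national_means, 75)
is_high = [m > threshold for m in national_means]
```

A unit is flagged "high" for an indicator when its annual mean exceeds the national threshold. Indicators where lower values mean worse quality (such as dissolved oxygen) would need the comparison reversed, which this sketch does not handle.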
Spatial clustering and pollution grading: A modified local Moran’s I for binary variables (high vs. not high relative to the pollutant-specific threshold) identified six spatial patterns: HH (high-high cluster), LL (low-low cluster), HL (high-low outlier), LH (low-high outlier), HN (high, not clustered), and LN (low, not clustered). For each unit, pollutants classified as HH, HL, or HN were counted as high-level (“H”). Air pollution level was graded by the count of H pollutants: level 1 (0–1), level 2 (2), level 3 (3–6). Water pollution level was graded the same way with wider bands: level 1 (0–1), level 2 (2–5), level 3 (6–13). Cross-tabulating air and water levels (3×3) yielded a co-pollution graded scale, written here as air level–water level: Grade I (1–1), Grade II (1–2 or 2–1), Grade III (1–3, 3–1, or 2–2), Grade IV (2–3, 3–2, or 3–3). Sensitivity analyses considered alternative grouping schemes meeting order consistency, sufficient interval, and group size criteria.
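The grading rules above translate directly into a lookup. The sketch below assumes the spatial pattern of each pollutant (HH, HL, HN, and so on) has already been computed for a unit; the function names and the example unit are hypothetical, while the cut-offs and the grade table follow the text.

```python
HIGH_PATTERNS = {"HH", "HL", "HN"}   # patterns counted as high-level ("H")

AIR_CUTS = (2, 3)     # level 2 starts at 2 H pollutants, level 3 at 3
WATER_CUTS = (2, 6)   # level 2 starts at 2 H pollutants, level 3 at 6

# (air level, water level) -> co-pollution grade
GRADE = {
    (1, 1): "I",
    (1, 2): "II", (2, 1): "II",
    (1, 3): "III", (3, 1): "III", (2, 2): "III",
    (2, 3): "IV", (3, 2): "IV", (3, 3): "IV",
}

def count_high(patterns):
    """Count pollutants whose spatial pattern is HH, HL, or HN."""
    return sum(1 for p in patterns.values() if p in HIGH_PATTERNS)

def level(n_high, cuts):
    """Map a count of H pollutants to pollution level 1, 2, or 3."""
    if n_high >= cuts[1]:
        return 3
    if n_high >= cuts[0]:
        return 2
    return 1

def co_pollution_grade(air_patterns, water_patterns):
    """Return the co-pollution grade ("I" to "IV") for one analysis unit."""
    air = level(count_high(air_patterns), AIR_CUTS)
    water = level(count_high(water_patterns), WATER_CUTS)
    return GRADE[(air, water)]

# Hypothetical unit: three high air pollutants, two high water indicators.
air = {"PM2.5": "HH", "PM10": "HH", "NO2": "HN", "O3": "LL", "SO2": "LN", "CO": "LH"}
water = {"TN": "HH", "TP": "HL", "COD": "LL"}
grade = co_pollution_grade(air, water)
```

Here the unit reaches air level 3 and water level 2, so it falls in Grade IV.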
Statistical analysis of cancer associations: The team employed a mixed modeling strategy combining machine learning and classical statistics. They used SHAP (Shapley Additive Explanations) to rank pollutant contributions to each cancer type and negative binomial regression to estimate rate ratios (RRs) for cancer incidence versus pollutant concentrations, adjusting for co-pollutants and social factors. Social covariates from China Statistical Yearbook 2020 included per capita GDP, fraction of population aged ≥65 years, and urbanization rate. Spatial autocorrelations and cross-media correlations were quantified (Spearman’s coefficients). Population attributable fractions (PAF) and excess cases were estimated by integrating pollutant-specific RRs across observed concentration distributions within each co-pollution grade, bounding RRs at the threshold concentration to obtain minimum PAFs. Combined PAFs for co-pollution were calculated using standard multi-risk formulas.
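To make the attributable-fraction step concrete, here is a simplified sketch. It is not the authors' procedure: the study integrates rate ratios over observed concentration distributions, whereas this example uses Levin's categorical formula, and the combination rule 1 − Π(1 − PAF_i) is one commonly used multi-risk formula that is assumed here rather than confirmed by the source. All input numbers are hypothetical.

```python
def levin_paf(exposed_fraction, rr):
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1))."""
    excess = exposed_fraction * (rr - 1.0)
    return excess / (1.0 + excess)

def combined_paf(pafs):
    """Combine PAFs of several risk factors, assuming independent, multiplicative effects."""
    remaining = 1.0
    for paf in pafs:
        remaining *= 1.0 - paf
    return 1.0 - remaining

def excess_cases(paf, observed_cases):
    """Cases attributable to the exposure(s), given the observed case count."""
    return paf * observed_cases

# Hypothetical inputs: two pollutants acting on one cancer type in one grade.
paf_a = levin_paf(exposed_fraction=0.5, rr=2.0)
paf_b = levin_paf(exposed_fraction=0.25, rr=1.4)
total_paf = combined_paf([paf_a, paf_b])
attributable = excess_cases(total_paf, observed_cases=1000)
```

The combined value is always smaller than the simple sum of the individual PAFs, which prevents the attributable share from exceeding 100% when many pollutants are considered together.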
Software: ArcGIS 10.8 was used for spatial analyses and mapping; SAS 9.4, R 4.2.1, and Python 3.10 (SHAP package) were used for modeling.
Key Findings
Spatial heterogeneity and correlations: Marked spatial heterogeneity was observed across all 19 environmental indicators and 13 cancer types. Strong within-media correlations were evident: up to 0.88 between PM2.5 and PM10 (air) and 0.75 between COD_Mn and TP (water). Cross-media correlations were also present, e.g., 0.50 between PM10 and TN. Cancer incidences correlated across sites (up to 0.72 for breast–kidney). Pollutant–cancer correlations included 0.44 for PM10–oesophageal and 0.34 for COD_Mn–lung.
Co-pollution grading distribution: Of 219 units, 78 (35.6%) were Grade IV (high air and water pollution), concentrated in Beijing–Tianjin–Hebei, Huaihe River basin, and the Fen-Wei Plain. In Grade IV, all 19 pollutants exceeded thresholds somewhere, with PM2.5 and PM10 exposure rates near 100%. Thirty-two units (14.6%) were Grade I (low in both; southern China), with only seven pollutants exceeding thresholds and a maximum exposure rate of 59.4% for PM2.5. Grade II and III accounted for 65 (29.7%) and 44 (20.1%) units, respectively. Very few areas exhibited discordant pollution (air high/water low: 11 units, 5.0%; water high/air low: 4 units, 1.8%), reinforcing spatial linkage between media.
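As a quick consistency check, the unit counts reported above can be tallied to confirm that they sum to 219 and reproduce the stated percentages. The variable names are illustrative; the counts come from the text.

```python
units_by_grade = {"I": 32, "II": 65, "III": 44, "IV": 78}

total_units = sum(units_by_grade.values())
shares = {g: round(100 * n / total_units, 1) for g, n in units_by_grade.items()}

# Units where only one medium is heavily polluted (air high/water low, water high/air low).
discordant_units = 11 + 4
discordant_share = round(100 * discordant_units / total_units, 1)
```

Taken together, the two discordant categories cover 15 units, under 7% of the total, which is the basis for the claim that the two media are spatially linked.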
Cancer–environment concordance and dose response: Spatial distributions of cancer incidence aligned with co-pollution grades, notably for lung, stomach, and oesophageal cancers. Grade IV areas had the highest incidence for seven cancers, with estimated RRs versus Grade I: oesophageal 2.502, gallbladder 1.790, pancreatic 1.686, kidney 1.639, stomach 1.468, breast 1.374, and lung 1.289. A dose-response pattern was evident: as co-pollution grade increased, both the number of cancer types with elevated risk and the magnitude of total cancer incidence rose; findings were robust to alternative grading schemes.
Pollutant-specific associations: SHAP identified all 19 pollutants as potentially relevant to at least one cancer. Eight pollutants showed significant positive associations across models: PM10, PM2.5, NO2, O3 (air) and COD_Mn, petroleum, DO, cyanide (water). NO2 was associated with increased risk across nine cancers (e.g., colorectal RR≈1.132, gallbladder 1.102, pancreatic 1.172, lung 1.042, breast 1.119, kidney 1.126, brain 1.056, leukemia 1.099, lymphoma 1.233). PM2.5 was associated with lung cancer (RR=1.188) and showed a suggestive association with leukemia (RR=1.298). COD_Mn was related to pancreatic (RR=1.089), breast (1.274), and kidney (1.177) cancers, indicating its utility as a proxy for nitrite/organic pollution when testing capacity is limited.
Burden estimates: An estimated 62,847 excess cancer cases in 2016 (7.4% of total) were attributable to air and surface water pollution across the 219 units. The number of attributable pollutants and cancer types increased with grade (from 3 pollutants/5 cancer types in Grade I to 8 pollutants/10 cancer types in Grade IV). Grade IV areas accounted for 43,827 excess cases (69.7% of total). Patterns reflected dominant exposures by grade (e.g., PM2.5 in Grade I explaining ~523 lung cancer excess cases; increasing NO2 exposure from 0% in Grade I to 22.7% in Grade III coincided with higher excess colorectal and breast cancers).
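The headline burden shares follow from simple arithmetic on the reported case counts, shown here as a short check. Only figures stated in the text are used; the variable names are illustrative.

```python
total_excess = 62_847      # excess cases across all 219 units, 2016
grade_iv_excess = 43_827   # excess cases in Grade IV units

grade_iv_share = round(100 * grade_iv_excess / total_excess, 1)
other_grades_excess = total_excess - grade_iv_excess
```

Grade IV units make up about 36% of the analysis units but carry roughly 70% of the estimated excess cases, leaving 19,020 cases across Grades I to III.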
Discussion
The findings support the hypothesis that air and surface water pollution are spatially linked and that multiple cancers cluster in areas with poor environmental conditions. By developing SESEC and a common co-pollution grading scale, the study transforms complex multi-media, multi-pollutant exposures into an actionable, spatially explicit metric that correlates with cancer burdens. The observed dose-response relationships and pollutant-specific associations underscore that environments cannot be treated as separate entities; coordinated governance across environmental sectors is necessary. The work also shows that social determinants (GDP per capita, aging, urbanization) contribute to cancer patterns, and for liver cancer, social/infectious factors (e.g., hepatitis B/C) may outweigh natural environmental contributions after adjustment. The spatial analytic paradigm complements individual-level epidemiology, leveraging stable, large-scale environmental–health parameters to reveal patterns that can guide policy and intervention priorities, particularly in high co-pollution regions like Beijing–Tianjin–Hebei and the Huaihe basin.
Conclusion
This study introduces a national-scale Spatial Evaluation System for Environment and Cancer (SESEC) and a unified co-pollution grading system that jointly evaluate air and surface water pollution in relation to cancer incidence. It demonstrates spatial consistency between co-pollution grades and multiple cancer types, quantifies dose-response patterns, identifies key pollutant–cancer associations, and estimates that 7.4% (62,847) of 2016 incident cancers in the study area are attributable to these environmental exposures, with nearly 70% of excess cases arising in Grade IV regions. The results advocate for integrated, cross-sector environmental governance and targeted cancer prevention strategies tailored to regional co-pollution profiles. Future research should elucidate the network mechanisms and interactions among pollutants, refine exposure assessment (including temporally aligned data and individual-level metrics), and expand multi-disciplinary collaborations to strengthen causal inference and guide effective interventions.
Limitations
Key limitations include temporal misalignment of datasets (air 2015, water 2021, cancer 2016), which likely leads to conservative (underestimated) risk and burden estimates; lack of individual exposure quantification and uncertainties in exposure pathways, latency, and gene–environment interactions; reliance on administrative monitoring networks and aggregated spatial units, which may obscure within-unit heterogeneity; exclusion of surface water metal pollutants due to low national concentrations and strict controls; and potential selection effects as non-included units (without full data) had lower pollution and differing socio-environmental profiles. For water data, some monthly values were missing (1.8–9.2%) and imputed via annual averaging; values below detection were set at 1/2 LOD. Despite these constraints, sensitivity analyses supported the robustness of spatial and dose-response findings.