A computational approach to analyzing climate strategies of cities pledging net zero
S. Sachdeva, A. Hsu, et al.
This study applies natural language processing to 318 climate action documents from cities that have pledged net-zero targets. Conducted by Siddharth Sachdeva, Angel Hsu, Ian French, and Elwin Lim, the research finds that energy actions dominate city plans and identifies trade-offs among climate action themes. The methodology is designed to be replicable and scalable.
Introduction
The study examines how cities committing to net-zero emissions articulate strategies to meet these goals, amid heterogeneity in definitions, scopes, and documentation of city climate plans. With over 1,000 cities announcing net-zero intentions since 2018 and significant public scrutiny over credibility and potential greenwashing, the research asks: (1) which language patterns in climate strategy documents predict ambitious, economy-wide net-zero targets; and (2) what sectoral themes and trade-offs characterize city climate strategies globally. The context includes varied nomenclature (carbon neutral, zero emissions, climate positive), differing scopes (sectoral focus, direct vs. consumption-based emissions), and variable integration of mitigation and adaptation. Prior large-scale analyses were largely Europe-centric, with limited global coverage. This study aims to provide a systematic, scalable analysis of climate plans from cities with net-zero pledges to reveal patterns, credibility signals, and potential gaps in action.
Literature Review
The literature highlights rapid growth in subnational net-zero targets post-Paris Agreement recognition of multi-level climate action. Reports indicate many cities have pledges, but fewer have robust targets aligned with Race to Zero ‘starting line’ criteria. Corporate pledges frequently underdeliver, intensifying concerns over credibility. European-focused studies (Reckien et al.; Salvia et al.) have analyzed hundreds of city plans, noting varied emphasis on mitigation vs. adaptation, and some linkage between the two. Global analyses outside Europe are limited, with studies often focusing on single cities, countries, or sectors (e.g., buildings). The lack of standardized definitions and boundaries (e.g., Scope 3/consumption-based emissions) complicates comparison and impact projection. Evidence suggests many cities prioritize mitigation in energy-related sectors, potentially overlooking consumption-based emissions that can exceed production-based inventories. Integration of spatial planning and cross-sectoral synergies is often insufficient, with mitigation and adaptation treated separately. These gaps motivate a scalable text-based approach to assess strategy content across heterogeneous documents.
Methodology
Design: The study applies machine learning-based NLP to analyze 318 city climate strategy/policy documents from cities with net-zero pledges, aiming to (1) predict language patterns associated with ambitious, economy-wide net-zero targets and (2) identify sectoral themes and trade-offs.
Data collection: Starting from 823 cities that Data-Driven EnviroLab and NewClimate Institute had recorded (as of May 2020) as having some form of net-zero pledge, publicly available climate strategy documents were gathered (May–July 2020) via web searches combining city names with keywords (e.g., climate action plan, net-zero plan, carbon neutral plan). Multiple documents for the same city were concatenated into a single text per city. Documents under 50 characters were excluded, leaving 318 cities, predominantly in Europe (n=202) and North America (n=78).
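The collection and filtering steps above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the helper names `build_queries` and `build_corpus` are hypothetical, the example texts are invented, and applying the 50-character cutoff after concatenation is an assumption (the summary does not say at which stage the filter was applied).

```python
# Keywords quoted in the summary; the full list used in the study is not given.
KEYWORDS = ["climate action plan", "net-zero plan", "carbon neutral plan"]


def build_queries(city):
    """Combine a city name with each keyword to form web-search queries."""
    return [f"{city} {keyword}" for keyword in KEYWORDS]


def build_corpus(docs_by_city, min_chars=50):
    """Concatenate each city's documents and drop texts shorter than min_chars."""
    corpus = {}
    for city, docs in docs_by_city.items():
        text = "\n".join(docs).strip()
        if len(text) >= min_chars:  # assumed: cutoff applied to the joined text
            corpus[city] = text
    return corpus


# Invented example input for illustration only.
docs_by_city = {
    "Oslo": ["Oslo climate action plan: reduce emissions across all sectors by 2030."],
    "Smallville": ["n/a"],
}
queries = build_queries("Oslo")
corpus = build_corpus(docs_by_city)
```

Here `corpus` keeps only the city whose combined text reaches the length cutoff.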
Preprocessing: Documents were converted to text by OCR using pytesseract (v0.3.8); non-English texts were translated to English via the Google Translate API. Text was tokenized and lemmatized, and stopwords (including punctuation, numbers, proper nouns, and pronouns) were removed.
Features: tf-idf representations were built from 1-grams and 2-grams, filtering out terms that appear in fewer than 10% of documents and normalizing counts by corpus-wide term frequency.
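A feature-extraction step of this kind might look like the sketch below, which uses scikit-learn's `TfidfVectorizer`. The toy documents are invented, lemmatization and stopword removal are omitted, and scikit-learn's default idf weighting is assumed here; it may differ from the exact normalization the authors used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented toy corpus: 20 short "plans"; one of them contains a rare term.
docs = (
    ["reduce building emissions percent"] * 10
    + ["expand transit reduce emissions"] * 9
    + ["unicorn reduce emissions"]
)

# 1-grams and 2-grams; drop terms found in fewer than 10% of documents.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=0.10)
X = vectorizer.fit_transform(docs)  # sparse matrix: documents x terms

vocab = set(vectorizer.vocabulary_)
print(sorted(vocab))
```

With 20 documents, the 10% threshold requires a term to appear in at least two documents, so the one-off term is dropped while common unigrams and bigrams are kept.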
Ambitious target labeling: Cities were coded as pledging an economy-wide net-zero target if they (a) joined UNFCCC Race to Zero or (b) explicitly committed to carbon neutrality/net-zero or to at least an 80% emissions reduction target economy-wide by mid-century. Government-operations-only net-zero targets were excluded. Of 318 cities, 242 were labeled as having economy-wide net-zero targets.
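The labeling rule can be written down as a small predicate. The sketch below is one possible reading, not the authors' implementation: the `CityPledge` fields are hypothetical, "mid-century" is assumed to mean a target year of 2050 or earlier, and Race to Zero membership is assumed to be sufficient on its own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CityPledge:
    # All field names are hypothetical, chosen for this illustration.
    name: str
    joined_race_to_zero: bool
    net_zero_commitment: bool              # explicit carbon-neutral / net-zero pledge
    reduction_target_pct: Optional[float]  # e.g. 80.0 for an 80% reduction
    target_year: Optional[int]
    economy_wide: bool                     # False if government operations only


def is_ambitious(pledge: CityPledge) -> bool:
    """Return True if the pledge counts as an economy-wide net-zero target."""
    if pledge.joined_race_to_zero:
        return True
    if not pledge.economy_wide:
        return False  # government-operations-only targets are excluded
    if pledge.net_zero_commitment:
        return True
    return (
        pledge.reduction_target_pct is not None
        and pledge.reduction_target_pct >= 80
        and pledge.target_year is not None
        and pledge.target_year <= 2050  # assumed meaning of "mid-century"
    )


# Invented example pledges.
city_a = CityPledge("A", False, False, 85.0, 2050, True)
city_b = CityPledge("B", False, True, None, 2040, False)
labels = {p.name: is_ambitious(p) for p in (city_a, city_b)}
```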
Predictive modeling: A logistic regression model (scikit-learn) with L1 regularization (lasso) was trained on tf-idf features to predict ambitious target status (binary). L1 regularization selected a sparse, interpretable set of predictive terms and mitigated overfitting in the high-dimensional feature space. A class reweighting factor corrected for label imbalance (76% positive class). Model evaluation used 50 repeated train-test refits with leave-one-out cross-validation on test folds; metrics (accuracy, F1, precision, recall) were averaged across runs. Coefficients were averaged across refits for robustness, and p-values for term significance were computed using chi-square tests of independence.
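To make the modeling step concrete, here is a minimal sketch of an L1-regularized logistic regression in scikit-learn. It runs on synthetic data rather than the study's tf-idf matrix, the regularization strength `C=0.5` is an arbitrary choice, `class_weight="balanced"` stands in for the class reweighting factor mentioned above, and the repeated refits and cross-validation are omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_docs, n_terms = 200, 30

# Synthetic stand-in for a tf-idf matrix; only the first two terms carry signal.
X = rng.random((n_docs, n_terms))
logits = 4.0 * X[:, 0] - 4.0 * X[:, 1] + 1.0
y = (logits + rng.normal(scale=0.5, size=n_docs) > 0).astype(int)

# The L1 penalty drives many coefficients to exactly zero (a sparse model);
# class_weight="balanced" reweights classes inversely to their frequency.
model = LogisticRegression(
    penalty="l1", solver="liblinear", C=0.5, class_weight="balanced"
)
model.fit(X, y)

coefs = model.coef_.ravel()
n_nonzero = int(np.count_nonzero(coefs))
print(f"{n_nonzero} of {n_terms} terms kept; positive share = {y.mean():.2f}")
```

Terms with non-zero coefficients are the ones the model treats as predictive, which is what makes the lasso penalty useful for interpretation in a high-dimensional vocabulary.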
Topic analysis: A key term-based approach defined nine topics based on common urban GHG sources and climate themes: Land use, Industry, Buildings, Transportation, Electricity, Heating, Waste/Pollution, Climate Impacts, Offsets. Seed lexicons per topic were expanded via word2vec similarity (spaCy) and manual curation (dictionary-based method). For each plan, counts of topic-associated terms produced a nine-dimensional topic vector. Descriptive statistics (median topic counts) characterized topic prevalence. Factor analysis reduced topics into latent factors representing broader themes. Cities were positioned on factor axes to assess trade-offs and clusters.
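The dictionary-based counting step can be illustrated with a short sketch. The lexicon below is a hypothetical three-topic fragment, not the study's curated term lists, and the tokenizer is a plain regular expression without the lemmatization described earlier.

```python
import re
from collections import Counter

# Hypothetical seed terms for three of the nine topics.
TOPIC_LEXICON = {
    "Buildings": {"building", "retrofit", "insulation"},
    "Transportation": {"transit", "vehicle", "cycling"},
    "Land use": {"forest", "park", "zoning"},
}


def topic_vector(text):
    """Count how many tokens in the text belong to each topic's lexicon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {
        topic: sum(counts[term] for term in terms)
        for topic, terms in TOPIC_LEXICON.items()
    }


plan = "Retrofit every building and expand transit; each building gets insulation."
vector = topic_vector(plan)
print(vector)
```

Applying `topic_vector` to every plan yields one count vector per city; stacking those vectors gives the matrix that a factor analysis would then reduce to a few latent factors.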
City characteristics and statistical tests: Associations between factor scores and city attributes (area, population, per-capita emissions from EU Covenant of Mayors/CDP, Köppen-Geiger climate zones, World Bank regions) were assessed using one-sided Kruskal-Wallis ANOVA.
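A Kruskal-Wallis comparison of factor scores across groups can be run with `scipy.stats.kruskal`, as in the sketch below. The scores are invented for illustration and do not come from the study; only the grouping by climate zone mirrors the analysis described above.

```python
from scipy.stats import kruskal

# Invented Ecology factor scores grouped by climate zone (illustration only).
ecology_scores = {
    "Tropical": [1.1, 1.4, 0.9, 1.6, 1.2],
    "Arid": [-1.0, -0.7, -1.3, -0.9, -1.1],
    "Temperate": [0.1, -0.2, 0.0, 0.3, -0.1],
}

# The test ranks all scores together and asks whether the groups' rank
# distributions differ; it makes no normality assumption.
statistic, p_value = kruskal(*ecology_scores.values())
print(f"H = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that at least one group differs from the others; it does not say which one, so pairwise follow-up comparisons would be needed for that.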
Software and resources: R (v3.6.2) and Python libraries (scikit-learn, spaCy) were used. Data and code are available via UNC Dataverse and GitHub.
Key Findings
- Sample and descriptive statistics: Documents from 318 cities were analyzed, of which 242 pledged economy-wide net-zero targets. Summary statistics (Table 1): mean percent reduction target 70.38% (SD 40.21%, N=298); mean baseline year 2005.54 (N=273); mean per-capita emissions 6.37 tCO2/person (SD 5.06, N=255).
- Predictors of ambitious economy-wide net-zero targets: Logistic regression identified significantly different (p < 0.01) language-use patterns for cities with ambitious targets. Four predictive term themes:
1) Specific, quantitative metrics: terms such as percent, year, square, GHG reduction, life cycle, emissions generate, opportunity reduce—indicating quantified targets, reference/baseline years, space-based metrics, life-cycle considerations, and defined scopes.
2) Emissions sources: building/heating-related terms such as wood, CHP (combined heat and power), passive (design), boiler—signaling sector-specific mitigation strategies.
3) Governance: mayor, fees, energy plan—indicating leadership buy-in, policy instruments (fees), and alignment with broader energy plans.
4) Human-centered approaches: inclusive, advocate—emphasizing community engagement, equity, and advocacy.
- Topic prevalence: Energy-related themes dominate city plans. Median topic counts show Electricity, Buildings, and Transportation as most frequent; less frequent topics include Land use, Heating, Waste/Pollution, Industry, Climate Impacts, and Offsets.
- Trade-offs among topics: Energy emphasis correlates negatively with Land use (r = -0.44) and with Climate Impacts (the coefficient is given as r = 0.4, but the association is described as negative, so a minus sign appears to be missing). Buildings are negatively associated with Climate Impacts (r = -0.36). Waste/Pollution co-occurs with Land use (r = 0.54), and Energy with Buildings (r = 0.38).
- Latent factors: Two dominant factors emerged: “Infrastructure” (heating, buildings, energy, and transportation topics) and “Ecology” (waste/pollution, land use, and climate impacts). Trade-offs were observed between the two factors, and the median city focuses more on Infrastructure. Quadrant analysis highlights city archetypes, e.g., London and Philadelphia (Infrastructure-heavy), New Bedford and Vancouver (Ecology-heavy), and San Francisco and Tokyo (balanced), with examples illustrating sectoral emphases and plan detail depth.
- Geographic/climatic patterns: Significant differences in Ecology factor scores by climate zone and region (p < 0.001). Tropical cities more Ecology-focused; Arid cities more Infrastructure-focused; Cold and Temperate cities more similar. European cities tend to de-emphasize Ecology on average; Infrastructure emphasis varies by region, with Eastern Europe and Central Asia exhibiting greatest diversity.
- Scope 3 coverage: Fewer than 10% of analyzed cities explicitly mention “Scope 3” emissions in their plans, indicating a gap in addressing consumption-based emissions.
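The topic trade-offs reported in the findings above are pairwise correlations between per-city topic measures. The sketch below computes a Pearson coefficient on invented counts; whether the study used Pearson or a rank-based coefficient, and raw counts or normalized shares, is not stated in this summary, so both choices here are assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented topic counts for five cities (illustration only).
energy = [40, 35, 30, 20, 10]
land_use = [5, 8, 12, 15, 25]

r = pearson_r(energy, land_use)
print(f"r = {r:.2f}")
```

A negative coefficient means that plans mentioning one topic more tend to mention the other less, which is the sense in which the findings describe a trade-off.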
Discussion
The findings show that language specificity (quantitative metrics, sector-specific strategies) and governance signals (mayoral leadership, policy instruments, energy planning) are strong indicators of ambitious, economy-wide net-zero commitments. This suggests that credible plans likely pair clear targets with implementation mechanisms and community engagement. However, a dominant focus on energy-related sectors risks overlooking critical mitigation opportunities, notably consumption-based (Scope 3) emissions, which other studies have found can exceed production-based emissions. The observed trade-offs, with Infrastructure topics often displacing Ecology topics (land use, climate impacts), indicate siloed planning that may miss synergies between mitigation and adaptation, as well as opportunities for spatial planning to reduce transport and energy demand. Climatic and regional context shapes strategic emphases, implying that tailored capacity-building and policy guidance could help cities balance infrastructure upgrades with ecological and resilience considerations. The NLP-based, scalable approach enables systematic comparison across heterogeneous documents, offering a starting point for accountability, benchmarking, and cross-city learning, though it cannot alone establish causal links between targets and performance.
Conclusion
This study introduces a scalable, replicable NLP approach to systematically analyze heterogeneous city climate action plans, identifying language features predictive of ambitious, economy-wide net-zero targets and revealing common sectoral emphases and trade-offs. Key contributions include: (1) demonstrating that specific metrics and governance language are associated with ambitious pledges; (2) mapping the dominant focus on energy, buildings, and transportation against underemphasized land use and climate impacts; and (3) distilling strategies into Ecology and Infrastructure factors with clear trade-offs and climate- and region-related patterns. Future research should (a) expand global coverage as more cities publish plans, especially in the Global South; (b) integrate outcomes data to link plan content with implementation and emissions performance; (c) assess inclusion of consumption-based emissions and offsets; and (d) explore mechanisms to better integrate mitigation, adaptation, and spatial planning to capture cross-sectoral synergies.
Limitations
- Geographic and sample bias: Overrepresentation of Global North cities due to availability of published plans; sample (318/823) is not globally representative and excludes cities without public documents.
- Temporal limitations: Rapidly evolving landscape; documents collected mid-2020 and may not reflect subsequent plan updates or newer net-zero commitments.
- Content vs. outcomes: Analysis focuses on plan content, not implementation or emissions outcomes; cannot predict achievement of targets.
- Document heterogeneity: Varied formats, lengths, and terminologies; although NLP preprocessing and translation were applied, residual inconsistencies may remain.
- Scope limitations: Government-operations-only targets excluded from “economy-wide” classification; limited explicit coverage of Scope 3 in plans constrains assessment of consumption-based strategies.