Business

Exploring the mechanism of path-creating strategy for latecomers: a combined approach of econometrics and causal machine learning

Y. Teng, Y. Li, et al.

This research by Yuanyang Teng, Yicun Li, and Xiaobo Wu explores the intricate relationship between latecomers' path-creating strategies and their performance in catching up technologically. Analyzing data from 283 high-tech manufacturing firms, the study uncovers how technological capabilities and the appropriability of innovation influence these strategies, offering valuable insights for strategic decision-making.

Introduction

The study examines how latecomer firms can catch up with industry leaders by adopting a path-creating strategy, using both econometric hypothesis testing and causal machine learning (ML) to uncover average and heterogeneous effects. In the digital economy era, ML complements traditional econometrics by revealing complex, flexible patterns beyond predefined functional forms. From the sectoral innovation system perspective, latecomers can pursue catch-up via path-following, path-skipping, or path-creating strategies. Path-creating, which means developing along new technological paradigms, can enable nonlinear catch-up, but it entails risks due to low or unstable initial productivity and uncertainty about technology standards. Existing research often yields case-based qualitative insights or average effects via linear models, limiting guidance on heterogeneous strategic choices. This paper adjusts the sectoral innovation system framework for latecomer catch-up, formulates hypotheses on the effects and boundary conditions of path-creating, constructs features across technology regime, market regime, and actors, and integrates econometrics with causal ML to test mechanisms, quantify average, conditional average, and individual treatment effects (ATE, CATE, ITE), and generate decision rules. The research question is whether and how path-creating improves technological catch-up, through which mediating mechanism (technological capability), and under which technological regime conditions (appropriability and cumulativeness) these effects strengthen or weaken.

Literature Review

The paper builds on the sectoral innovation system framework (Malerba, 2002, 2004, 2005), which highlights knowledge and technology regimes, market demand conditions, actors and their coordination, and institutions. Latecomer firms, initially disadvantaged in technology and markets, can leverage incumbent disadvantages and latecomer advantages to catch up (Hobday, 1995; Mathews, 2002, 2005). Prior work identifies three catch-up strategies: path-following, path-skipping, and path-creating (Lee & Lim, 2001; Lee et al., 2017). Path-creating, the adoption of new technological paradigms, often underpins leadership shifts (Lim et al., 2017; Malerba & Lee, 2021), but its effectiveness depends on technology regime features such as appropriability and cumulativeness. Appropriability affects diffusion and imitation; higher appropriability can hinder imitation-based catch-up but may protect latecomers’ own innovations under path-creating (Levin et al., 1987; Park & Lee, 2006; Chang et al., 2015). Cumulativeness typically favors incumbents by reinforcing knowledge accumulation and creating entry barriers (Breschi et al., 2000; Malerba, 2002; Rosiello & Maleki, 2021), though it can aid latecomers once they attain sufficient capability (Rho et al., 2015; Chang et al., 2015). The study articulates five hypotheses. H1a: path-creating improves technological catch-up. H1b: technological capability mediates this effect. H2a and H2b: appropriability positively moderates the path-creating → technological capability link and the technological capability → catch-up link, respectively. H3: cumulativeness negatively moderates the direct effect of path-creating on catch-up. The literature also notes market regime effects (demand growth and fluctuation, industry concentration) and firm-level factors (R&D investment and staffing, complementary assets, absorptive capacity, technological diversity and originality, external technology dependence, firm age) as determinants of catch-up.

Methodology

Design: Mixed-methods design combining econometric hypothesis testing and causal machine learning.

Data and sample: Panel of 283 high-tech manufacturing firms listed in Shanghai/Shenzhen (China), 2007–2019. The period ensures accounting standard consistency (post-2007) and availability of R&D disclosure, and excludes COVID-19 impacts.

Data sources: Financials from CSMAR and Wind (cross-checked with listed firms’ annual reports via Jucao information network). Patent data from IncoPat (applications, types, IPC classes, citations). Resulting unbalanced panel: 1805 firm-year observations; for causal ML, resampled to 2252 to address class imbalance.

Screening: Excluded ST-treated/delisted firms; handled missing data via matching, manual retrieval, and mean imputation for small gaps; removed observations with severe missingness and outliers.

Variable construction: The dependent variable is technological catch-up performance (TechCatchup), the sum over IPC sectors of the firm’s share of each sector’s patent applications in year t: TechCatchup_jt = Σ_i (P_ijt / P_it), where P_ijt is firm j’s applications in sector i in year t and P_it is the total number of applications in sector i in year t.

Technology regime features:
- Appropriability: 3-year average of industry self-citation share, weighted by firm sector share.
- Cumulativeness: 3-year rolling weighted ratio of persistent innovator patents to total patents in the sector.
- Opportunity: 3-year rolling sector patent growth.
- Uncertainty: dispersion of sector patenting ((Mi − Ni)/AVGi), aggregated by the firm’s sector shares.
- Technology cycle time (TCT): relative median age of backward citations.
- Industrial innovation degree (IIDegree): industry’s share of annual patenting.

Market regime features: Demand growth (industry demand growth), Demand fluctuation (std. dev. of demand), Industry concentration (Herfindahl index; lower implies more competition).
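The TechCatchup definition above (a firm's sector shares summed over the IPC sectors it patents in) can be illustrated with a minimal pandas sketch. The table layout, column names, and counts below are hypothetical and are not the study's data; in particular, the sector total here is summed over the toy firms only, whereas the study presumably uses the full sector total.

```python
import pandas as pd

# Hypothetical toy data: one row per (firm, IPC sector, year) with application counts.
patents = pd.DataFrame({
    "firm":         ["A", "A", "B", "B", "C"],
    "sector":       ["H01", "G06", "H01", "G06", "G06"],
    "year":         [2015, 2015, 2015, 2015, 2015],
    "applications": [10, 5, 30, 5, 10],
})

# P_it: total applications in sector i in year t (summed over the rows of this toy table).
sector_total = patents.groupby(["sector", "year"])["applications"].transform("sum")

# P_ijt / P_it: firm j's share of sector i's applications in year t.
patents["share"] = patents["applications"] / sector_total

# TechCatchup_jt: sum of firm j's shares over all sectors in year t.
tech_catchup = (
    patents.groupby(["firm", "year"])["share"]
    .sum()
    .rename("TechCatchup")
    .reset_index()
)
print(tech_catchup)
```

In this toy table, firm A holds 10 of 40 applications in H01 and 5 of 20 in G06, so its score is 0.25 + 0.25 = 0.5. A firm that is active in many sectors can therefore score above 1.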
Actors/firm features:
- Path-creating: firm self-citation share of received citations in year t (0 indicates none).
- Technological capability (TechCapability): number of patent applications in year t.
- Absorptive capacity: backward reference lag (per Joo et al., 2016).
- R&D inputs: R&D staff, R&D staff ratio, R&D expenditure and intensity (R&D/sales).
- Complementary assets: ProductionCAS (net fixed assets/sales), MarketCAS (selling expenses/sales), HRCAS (wage expenses/sales), plus value-added ratio.
- Originality: 1 − Σ_k (NCITING_k/NCITING_t)^2.
- Technological diversity: 1 − Σ_k (Nkt/Nt)^2.
- External technology dependence: 1 − (Σ self-citations / Σ total citations).

Controls: firm age, staff size, year, industry segment, IIDegree, concentration, etc.

Feature selection: To reduce overfitting and improve generalization, three algorithms from scikit-learn were applied: SelectKBest (F-statistic p-values), RandomForestRegressor (impurity-based feature importance), and Permutation Importance. The intersection of the top-ranked features was taken, yielding 15 features (e.g., Path-creating, TechCapability, Appropriability, Cumulativeness, Opportunity, HRCAS, MarketCAS, Diversity, Dependence, Uncertainty, Absorptive, R&D Staff Ratio, Originality, ProductionCAS, TCT).

Econometric analysis: Used Python 3.9.13 with PyProcessMacro 1.0.12. Model 58 tested mediation by TechCapability and moderation by Appropriability (for path-creating → TechCapability and TechCapability → TechCatchup), while estimating the direct effect of Path-creating on TechCatchup (tests H1a, H1b, H2a, H2b). Model 1 tested moderation by Cumulativeness of the direct Path-creating → TechCatchup effect (H3). Both models used 5000 bootstrap iterations and 95% confidence intervals, with variables mean-centered; all VIFs were below 10.

Causal machine learning: Path-creating was coded as a binary treatment (value > 0 indicates adoption). Meta-learners from CausalML, with a LightGBM regressor as base learner, were used to estimate the ATE and the CATE/ITE.
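The three-selector intersection described under Feature selection can be sketched roughly as follows. This is an illustration on synthetic data, not the study's pipeline: the feature names, the top-k cut-off (`TOP_K`), and the forest settings are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.inspection import permutation_importance

# Synthetic stand-in for the firm-year feature matrix (not the study's data).
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=5.0, random_state=0)
names = np.array([f"x{i}" for i in range(X.shape[1])])  # hypothetical feature names
TOP_K = 5  # hypothetical cut-off for "top-ranked"

# 1) Univariate F-test: keep the TOP_K features with the highest F-scores.
skb = SelectKBest(score_func=f_regression, k=TOP_K).fit(X, y)
top_f = set(names[skb.get_support()])

# 2) Impurity-based importance from a random forest regressor.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
top_rf = set(names[np.argsort(rf.feature_importances_)[-TOP_K:]])

# 3) Permutation importance: drop in model score when a feature is shuffled.
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
top_perm = set(names[np.argsort(perm.importances_mean)[-TOP_K:]])

# Keep only the features that all three selectors rank in their top TOP_K.
selected = sorted(top_f & top_rf & top_perm)
print(selected)
```

Taking the intersection is a conservative choice: a feature survives only if a univariate test, a tree-based ranking, and a model-agnostic ranking all agree, which reduces the chance that one method's bias drives the final set.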
SHAP values were computed to interpret heterogeneous treatment effects, and a policy decision tree (PolicyLearner) was trained to derive decision rules indicating when path-creating is effective.
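To make the meta-learner and decision-rule steps concrete, here is a minimal T-learner sketch followed by a shallow tree fitted to the sign of the estimated effects. It is a simplification under loud assumptions: scikit-learn's `GradientBoostingRegressor` is swapped in for LightGBM, a plain `DecisionTreeClassifier` stands in for CausalML's PolicyLearner, the data are synthetic with randomly assigned treatment (the study's data are observational), and the feature labels are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 2000
feature_names = ["capability", "cumulativeness", "appropriability"]  # hypothetical labels
X = rng.normal(size=(n, 3))
treat = rng.integers(0, 2, size=n)   # binary treatment, randomised in this toy setup
true_ite = 0.25 + 0.5 * X[:, 0]      # assumed heterogeneous effect, driven by column 0
y = X[:, 1] + treat * true_ite + rng.normal(scale=0.1, size=n)

# T-learner: fit one outcome model on treated units and one on control units.
m_treated = GradientBoostingRegressor(random_state=0).fit(X[treat == 1], y[treat == 1])
m_control = GradientBoostingRegressor(random_state=0).fit(X[treat == 0], y[treat == 0])

# Estimated individual effect = predicted treated outcome - predicted control outcome.
ite_hat = m_treated.predict(X) - m_control.predict(X)
ate_hat = ite_hat.mean()             # average of the individual estimates

# Simplified rule extraction: a shallow tree predicting whether the estimated effect is positive.
effective = (ite_hat > 0).astype(int)
rule_tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, effective)
rule_accuracy = rule_tree.score(X, effective)

print(f"estimated ATE: {ate_hat:.3f}")
print(f"share of units with negative estimated ITE: {(ite_hat < 0).mean():.2f}")
print(export_text(rule_tree, feature_names=feature_names))
```

With observational data, the treated and control groups differ systematically, so estimates of this kind rest on the assumption that the included covariates capture all confounding.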

Key Findings

Feature selection: Cumulativeness ranked highest across all three selectors; TechCapability and Path-creating also ranked highly. Redundant R&D measures were reduced via selection (kept R&D staff ratio).

Econometrics (Model 58), mediation and moderation with Appropriability:
- Path-creating positively predicts TechCapability (b ≈ 29.63, 95% CI [19.12, 40.15], p < 0.001).
- The interaction Path-creating × Appropriability on TechCapability is positive and significant (95% CI [62.28, 271.79]), indicating appropriability strengthens the effect of path-creating on capability.
- In the TechCatchup equation, Path-creating has a positive direct effect (b ≈ 0.20, 95% CI [0.07, 0.34], p < 0.05); TechCapability is positive; TechCapability × Appropriability is positive (95% CI [0.01, 0.02]), indicating appropriability strengthens capability’s effect on catch-up.
- Conditional indirect effects via TechCapability are non-significant at low appropriability (mean −1 SD; CI spans zero) but significant and positive at mean and high appropriability levels.
- Thus, H1a, H1b, H2a, and H2b are supported.

Econometrics (Model 1), moderation by Cumulativeness:
- Path-creating’s direct effect on TechCatchup remains positive (b ≈ 0.23, 95% CI [0.09, 0.36]).
- The interaction Path-creating × Cumulativeness is significantly negative (b ≈ −0.02; 95% CI [−0.04, −0.01]), indicating cumulativeness weakens the direct benefit of path-creating (H3 supported).
- Conditional direct effects: significant and larger at low cumulativeness (effect ≈ 0.36; 95% CI [0.16, 0.55]) and moderate at mean (≈ 0.23; 95% CI [0.09, 0.36]); non-significant at high cumulativeness (≈ 0.09; CI [−0.10, 0.28]).
- Other notable covariate effects include positive roles of TCT, Opportunity, Uncertainty and negative roles of Diversity, Dependence, R&D staff ratio, and Originality in the TechCatchup equation.

Causal ML:
- ATE of Path-creating on TechCatchup is positive and significant: 0.25 (95% CI [0.22, 0.29]), aligning with OLS results.
- ITE distribution is heterogeneous: while most ITEs are positive, a left tail indicates cases where path-creating is counterproductive.
- SHAP analysis of treatment effects highlights TechCapability as the top positive driver. Cumulativeness and Appropriability exhibit complex, non-monotonic influences on the treatment effect; Uncertainty and Absorptive capacity often align with positive treatment effects; Opportunity and Diversity tend to relate negatively; other features show mixed roles.
- Decision tree rules: Of 14 identified paths, 9 predict effectiveness and 5 predict ineffectiveness of path-creating. High TechCapability at the root generally increases effectiveness (5 of 6 high-capability paths effective). For low TechCapability firms, effectiveness depends on combinations of regime conditions: e.g., low uncertainty with high cumulativeness, or high uncertainty with high appropriability, can render path-creating effective despite low capability. These complex combination patterns are not captured by linear models and provide actionable decision guidance.
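The conditional direct effects reported for Model 1 follow the usual simple-slopes logic: with mean-centered variables, the effect of the focal predictor at a given moderator level W is b_x + b_xw · W. The sketch below shows that calculation on synthetic data using plain least squares. The data-generating coefficients are illustrative values chosen only to mimic the reported pattern, not the study's estimates, and the bootstrap confidence intervals the study reports are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)                      # stand-in for the focal predictor
w = rng.normal(loc=0.0, scale=2.0, size=n)  # stand-in for the moderator
x -= x.mean()                               # mean-centre both variables
w -= w.mean()

# Illustrative data-generating process with a negative interaction (assumed values).
y = 0.23 * x - 0.07 * x * w + 0.10 * w + rng.normal(scale=0.5, size=n)

# OLS with an interaction term: y ~ 1 + x + w + x*w
design = np.column_stack([np.ones(n), x, w, x * w])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
b_x, b_xw = beta[1], beta[3]

# Conditional effect of x at low (mean - 1 SD), mean, and high (mean + 1 SD) moderator levels.
sd_w = w.std(ddof=1)
conditional = {
    "low": b_x + b_xw * (-sd_w),
    "mean": b_x,
    "high": b_x + b_xw * sd_w,
}
for level, effect in conditional.items():
    print(f"{level:>4}: {effect:.3f}")
```

Because the interaction coefficient is negative, the conditional effect shrinks as the moderator rises, which is the same qualitative pattern as the low, mean, and high cumulativeness effects listed above.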

Discussion

The findings demonstrate that adopting a path-creating strategy helps latecomer firms narrow their technological gap with incumbents, primarily by building technological capability. Technological appropriability enhances both the capability-building effect of path-creating and the translation of capability into catch-up performance, indicating that protected innovation environments strengthen the strategic payoff from creating new technological paths. Conversely, high technological cumulativeness weakens the direct benefits of path-creating because incumbents’ accumulated knowledge and path dependence raise barriers; however, once a certain level of capability is achieved, cumulativeness can still aid convergence through capability-driven channels. Integrating econometrics with causal ML validates average causal effects while uncovering heterogeneity: not all firms benefit equally from path-creating. The SHAP and decision-tree analyses reveal conditions under which path-creating fails (e.g., low capability with unfavorable appropriability/uncertainty profiles) and when it succeeds (e.g., high capability, or specific combinations of uncertainty and appropriability/cumulativeness). This enriches theory by clarifying boundary conditions of path-creating and provides managers with interpretable decision rules to tailor strategies given their technological capability and regime context.

Conclusion

The study integrates econometric hypothesis testing with causal machine learning to elucidate the mechanisms and boundary conditions of latecomers’ path-creating strategies. Econometric results confirm that path-creating improves technological catch-up, mediated by technological capability, with appropriability strengthening both capability formation and its impact on catch-up, and cumulativeness weakening the direct effect of path-creating. Causal ML corroborates a positive average treatment effect and, crucially, reveals substantial heterogeneity in individual treatment effects. SHAP explanations and decision-tree rules provide interpretable guidance for when path-creating is likely to be effective. Practically, firms should consider path-creating when they can simultaneously invest in capability building and when appropriability conditions are favorable; they should be cautious in highly cumulative regimes unless capability is sufficiently high or other supportive conditions exist. Future research could generalize beyond Chinese listed high-tech firms, extend to other industries and institutional contexts, and incorporate richer measures of tacit knowledge and innovation outcomes beyond patents.

Limitations

Measures rely heavily on patent data, which may not capture tacit knowledge, strategic patenting behavior, or all dimensions of innovation; different patenting propensities across firms and over time may bias measures. The sample is limited to Chinese listed high-tech firms (2007–2019), which may constrain generalizability across countries, institutional settings, and industries (e.g., capital-intensive or labor-intensive sectors). Further, some variables required rolling-window and weighted constructions that may be sensitive to data quality; despite feature selection and resampling to mitigate imbalance, model dependence and measurement error remain possible.
