Different roads take me home: the nonlinear relationship between distance and flows during China's Spring Festival

X. Luan, H. Paryzat, et al.

Discover the intriguing nonlinear relationships between distance and intercity population flows during China's Spring Festival, revealed through Tencent Big Data and a Gradient Boosting Decision Tree model. This cutting-edge research, conducted by Xiaofan Luan, Hurex Paryzat, Jun Chu, Xinyi Shu, Hengyu Gu, De Tong, and Bowen Li, uncovers regional distinctions and the dynamic behaviors of population movement across various provinces.

Introduction

The study examines human mobility during China's Spring Festival (Chunyun), one of the world’s largest periodic migrations. Prior research splits between physics-based models seeking universal laws and social-science approaches explaining socio-economic drivers, with limited integration and limited treatment of nonlinearity. The paper addresses two core questions: (1) Is the relationship between intercity flows and distance during the Spring Festival nonlinear, and what types characterize this nonlinearity? (2) How do these distance–flow relationships differ across regions/provinces in China’s uneven development context? The authors hypothesize and explore three distance–mobility intensity patterns—plateau, drop at provincial boundaries, and rebound at longer distances—and situate them within China’s uneven urbanization and hukou-driven periodic migration.

Literature Review

Mobility in China’s urbanization process: Rapid urbanization and coastal economic advantages have driven internal migration from inland to coastal regions, reinforcing regional disparities. Provincial capitals and higher-status cities attract migrants due to administrative and economic prominence. Rising housing costs and hukou constraints hinder permanent settlement, sharpening urban–rural and regional divides. The Spring Festival travel rush magnifies these dynamics by catalyzing large-scale return migration.

Modelling mobility in urban China: Earlier work used surveys and censuses with coarse granularity. ICT-enabled big data (e.g., Tencent, Baidu, mobile signaling, LBS) now allow fine-resolution intercity mobility analysis. Methods fall into physics-based network/complex models (e.g., ERGM, WSBM) emphasizing universal patterns, and social-science regressions emphasizing theory but often assuming linearity. Explanatory machine learning (e.g., GBDT/GBM with partial dependence) offers a bridge by capturing nonlinear relationships and providing interpretability, though it has mostly been applied to intra-urban transport; this study extends it to intercity mobility.

Methodology

Data: Tencent Migration Dataset capturing intercity population movements from geolocation data of smart device users. For the 2018 Spring Festival period (Feb 1–14), 40,289 daily OD records were collected nationwide; after cleaning (deduplication, annual aggregation), 20,155 prefecture-level OD records remained representing flows during the festival.

Variables: Dependent variable Flow (intercity connection strength). Key independent variable: geographic distance between city centers. Controls: city population, GDP, fiscal expenditure, science/technology expenditure, education expenditure, environmental quality (air quality, PM2.5), population density, and industrial structure (secondary sector share), primarily from China City Statistical Yearbook.

Preprocessing: Log transformations applied to mitigate skewness.

Modelling: Gradient Boosting Machine/Decision Trees (GBM/GBDT) selected to model nonlinear relationships and interactions. Training involved iterative boosting using pseudo-residuals. Hyperparameters tuned with RandomizedSearchCV.

Interpretation: Feature importance assessed; Partial Dependence Plots (PDPs) generated to visualize marginal effects of distance on connection strength overall and by province to test three hypotheses: Plateau (short-distance saturation), Drop (decline at provincial boundaries), and Rebound (increase in flows at longer distances due to attraction of developed provinces).

Regional stratification: Analyses conducted nationwide and by provincial origin, grouped into eastern, central, northeastern, and western regions for comparative interpretation.
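
The boosting and partial-dependence steps described above can be illustrated with a small from-scratch sketch. It is not the authors' code: the data are synthetic, the two features (`log_distance`, `log_population`) and every numeric setting are assumptions chosen for illustration, and single-split stumps stand in for the deeper trees a GBDT library would fit. In practice a library implementation such as scikit-learn's GradientBoostingRegressor, tuned with RandomizedSearchCV, would be the more likely choice.

```python
import random

def fit_stump(X, r):
    """Least-squares one-split stump: (feature, threshold, left_mean, right_mean)."""
    n, best = len(X), None
    total = sum(r)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            a, b = X[order[k - 1]][j], X[order[k]][j]
            if a == b:
                continue  # cannot split between identical values
            # Maximising this gain is equivalent to minimising squared error.
            gain = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, j, (a + b) / 2, left_sum / k, (total - left_sum) / (n - k))
    return best[1:]

def boost(X, y, rounds=200, lr=0.1):
    """Gradient boosting for squared loss: each stump is fitted to pseudo-residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        # For squared loss the pseudo-residual is y minus the current prediction.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, t, ml, mr = fit_stump(X, resid)
        stumps.append((j, t, ml, mr))
        pred = [p + lr * (ml if x[j] <= t else mr) for p, x in zip(pred, X)]
    return base, lr, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + lr * sum(ml if x[j] <= t else mr for j, t, ml, mr in stumps)

def partial_dependence(model, X, feature, grid):
    """Mean prediction when `feature` is forced to each grid value for every row."""
    curve = []
    for g in grid:
        rows = [x[:feature] + [g] + x[feature + 1:] for x in X]
        curve.append(sum(predict(model, row) for row in rows) / len(rows))
    return curve

# Synthetic data (assumption): flat effect below log_distance = 6, decay above it.
random.seed(0)
X, y = [], []
for _ in range(300):
    log_distance = random.uniform(4.0, 8.0)
    log_population = random.uniform(12.0, 16.0)
    effect = 0.0 if log_distance < 6.0 else -1.5 * (log_distance - 6.0)
    X.append([log_distance, log_population])
    y.append(effect + 0.3 * (log_population - 14.0) + random.gauss(0.0, 0.1))

model = boost(X, y)
grid = [4.5, 5.5, 6.5, 7.5]
pdp = partial_dependence(model, X, 0, grid)
for g, v in zip(grid, pdp):
    print(f"log_distance={g:.1f} -> mean predicted log_flow={v:.2f}")
```

Because the partial-dependence curve averages over the other features, a flat stretch followed by a decline in `pdp` reflects the marginal effect of distance alone, which is how the plateau and drop hypotheses are read from a PDP.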

Key Findings
  • Three nonlinear distance–flow patterns identified: (1) Plateau: near-zero distance decay over short to moderate ranges within provinces; (2) Drop: sharp decline in connectivity around/after crossing provincial boundaries; (3) Rebound: recovery/peak in mobility intensity at longer distances, reflecting pull of economically advanced provinces.
  • Regional heterogeneity: Eastern provinces (e.g., Guangdong, Zhejiang, Jiangsu) show high initial plateau values with gradual decline; developed metro areas exhibit radial patterns. Central provinces (e.g., Henan, Jiangxi, Hunan) show intense mobility with plateau–drop–peak (rebound) patterns, indicating strong pulls from external clusters beyond certain thresholds (noted around log(distance) ≈ 7). Northeastern provinces (Heilongjiang, Jilin, Liaoning) show pronounced short-distance plateaus and steep medium-distance declines, reflecting limited attractive destinations mid-range. Western provinces (e.g., Gansu, Guangxi, Guizhou) feature the longest initial plateau distances but the lowest average intensities and rapid declines, indicating lower propensity for long-distance interprovincial mobility.
  • Variable importance: Geographical distance is the dominant factor influencing intercity connection strength, accounting for 57.26% of feature importance in the GBDT model; economic, population, and environmental factors contribute less individually.
  • Provincial special cases: Among 21 provinces analyzed, four (Heilongjiang, Jilin, Guangxi, Gansu) exhibit a two-stage decline in the drop phase (rather than a single direct drop), possibly due to locational factors and cultural ties focusing flows toward Beijing and Guangzhou.
  • Case illustrations: Hubei’s PDP indicates Plateau ≈ 527 km, Drop ≈ 907 km, and Rebound ≈ 1,978 km, consistent with strong intra-provincial and neighboring-province ties and selective longer-distance returns. Guangdong shows high initial connectivity with notable decline and a rebound linked to long-distance corridors (e.g., along the National High-Speed Railway).
  • Mobility maps (2019 Spring Festival descriptive): Major outflows before the festival from coastal megacities (Beijing, Shanghai, Guangzhou–Shenzhen, Dongguan, etc.) toward central and western hometowns; inflows peak in inland cities and labor-exporting regions during the holiday. Top outflow cities include Beijing, Shenzhen, Guangzhou, Shanghai, Dongguan, Suzhou; top inflow cities include Chongqing, Hengyang, Ganzhou, Zhoukou, Shangrao, Xinyang, etc.
  • Dataset scale: 40,289 daily records collected; 20,155 aggregated OD pairs analyzed.
  • Policy-relevant pattern: Coastal provinces display inclusive spatial structures that support migration from provincial peripheries; inland provinces focus on accommodating and retaining migrants, reflecting employment–settlement separation and hukou constraints.
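
To relate the log(distance) ≈ 7 threshold to the kilometre figures quoted for Hubei, the sketch below converts between the two scales and labels the segments of a curve by slope. It rests on several assumptions not stated in the summary: that the logarithm is natural and distance is in kilometres (under which log(distance) ≈ 7 is roughly 1,100 km), that the three Hubei distances can be read as segment endpoints, and that the intensity values and the `flat_tol` tolerance are invented purely for illustration.

```python
import math

def label_segments(log_dist, intensity, flat_tol=0.05):
    """Label each interval between grid points by the sign of its slope."""
    labels = []
    for i in range(1, len(intensity)):
        slope = (intensity[i] - intensity[i - 1]) / (log_dist[i] - log_dist[i - 1])
        if abs(slope) <= flat_tol:
            labels.append("plateau")
        elif slope < 0:
            labels.append("drop")
        else:
            labels.append("rebound")
    return labels

# Threshold from the findings, converted under the natural-log / km assumption.
threshold_km = math.exp(7)

# Hubei distances from the case illustration; 50 km is a hypothetical start point.
km = [50, 527, 907, 1978]
log_km = [math.log(d) for d in km]

# Hypothetical intensities (not from the paper): flat, then lower, then partly recovered.
intensity = [1.00, 0.99, 0.40, 0.70]

labels = label_segments(log_km, intensity)
print(f"log(distance) = 7 corresponds to about {threshold_km:.0f} km")
for (a, b), name in zip(zip(km, km[1:]), labels):
    print(f"{a} km to {b} km: {name}")
```

Under these assumptions the Hubei drop boundary (907 km) falls just below the log(distance) ≈ 7 threshold and the rebound point (1,978 km) lies well above it.
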
Discussion

The findings empirically validate that distance–flow relationships during the Spring Festival are fundamentally nonlinear and vary across China’s uneven regional landscape. The Plateau pattern reflects integrated intra-provincial systems and cohesive socio-economic structures; the Drop captures administrative, socio-economic, and cultural boundary effects (reinforced by hukou); the Rebound highlights the strong gravitational pull of developed coastal provinces and megacities that can offset distance costs. These patterns elucidate the national phenomenon of employment–settlement separation: migrants work in distant, more developed cities but return to inland hometowns during holidays. Regional PDPs clarify why eastern provinces function as major attractors with extensive long-distance links, while central/western/northeastern regions act mainly as sources with strong local/regional ties and selective long-distance movements. Recognizing distance’s dominant influence (57.26%) within a multivariate ML framework strengthens the bridge between physics-inspired mobility laws and socio-institutional explanations, offering richer guidance for metropolitan delineation, inter-regional coordination, and talent-attraction/retention strategies.

Conclusion

This study integrates LBS big data (Tencent Migration) with an explanatory GBDT framework to reveal and interpret three nonlinear distance–flow patterns—plateau, drop, rebound—governing intercity mobility during China’s Spring Festival. It demonstrates substantial regional heterogeneity shaped by administrative boundaries, uneven development, and the hukou system. Contributions include: (1) identifying and typologizing nonlinear distance effects at intercity scale; (2) showing province-level variation and special cases (two-stage drops); (3) quantifying distance’s dominant role; and (4) providing interpretable ML evidence linking physics-based mobility regularities with social-institutional context. Policy implications call for equalization of public services, balanced regional development, and deeper hukou reform to reduce forced long-distance commuting/return migration and to enhance local settlement opportunities. Future research should extend temporal coverage to track evolution across years, incorporate micro-level trip purposes and traveler attributes, and conduct international comparisons to assess the universality of nonlinear distance–flow laws.

Limitations
  • Data aggregation: Tencent migration data are intercity aggregates lacking individual attributes and trip purposes, preventing separation of tourists from home-goers.
  • Temporal scope: Limited to a short festival window and lacks multi-year longitudinal depth, constraining analysis of temporal evolution.
  • Coverage heterogeneity: Insufficient data in some remote provinces (e.g., Xinjiang, Tibet, Qinghai, Ningxia) limit robust pattern identification there.
  • Generalizability: Findings are context-specific to China’s Spring Festival and hukou-influenced mobility; international comparative studies are needed to test universality of the identified nonlinear patterns.