Examining the population flow network in China and its implications for epidemic control based on Baidu migration data

S. Wei and L. Wang

This study by Sheng Wei and Lei Wang examines China’s intercity population flow network and its implications for epidemic control before and during the 2020 Spring Festival travel season. Using Baidu migration data, it finds that migration out of Wuhan was concentrated within Hubei Province and correlated strongly with the distribution of COVID-19 cases, supporting the case for timely lockdown measures.

Introduction
The study investigates how spatial patterns of intercity population movement in China structure urban networks and influence epidemic spread, focusing on the period before and during the 2020 Spring Festival travel season (Chunyun). Using Baidu migration data, the authors aim to: (1) examine the hierarchical structure of the national population flow network during the non-Chunyun period via weighted degree and betweenness centrality to identify key node cities; (2) detect and map closely connected subnetworks (communities) of Chinese population movement before Chunyun as a baseline for understanding the subsequent spread of COVID-19; and (3) assess the role of population flows from Wuhan in the diffusion of COVID-19 during Chunyun by correlating outbound migration volumes with confirmed cases and visualising intra-provincial movements using a Chord Diagram. The context underscores the growing use of big data to understand complex mobility, the significance of China’s massive seasonal migration during Chunyun, and the urgent importance of mobility-informed epidemic control measures following the Wuhan lockdown on 23 January 2020.
Literature Review
The paper situates its contribution within research on human mobility and complex networks. Prior work has used mobile phone records to analyse mobility patterns, geographic borders, and urban structure; these records are highly accurate but raise cost and privacy barriers. Alternative open big data sources, such as train and flight schedules and geo-tagged social media, have been used to infer intercity movements at large scales. For China, national-scale, uniform migration data are difficult to obtain, which motivates the use of Baidu Map migration indices as an accessible, large-sample proxy for origin-destination (OD) flows. Complex network theory provides tools (e.g., weighted degree and betweenness centrality) to characterise nodes’ roles and connectivity within mobility networks. The paper also references literature linking mobility networks to epidemic spread, highlighting the relevance of movement intensity and network structure for understanding disease transmission dynamics, particularly in the context of Chunyun.
Methodology
Data: The Baidu Map migration scale index (http://qianxi.baidu.com/) provided intercity OD flows among 367 major Chinese cities. After cleaning, the authors obtained 433,787 OD records for 1–22 January 2020; index values were multiplied by 100 to convert them to integers. The period is split into non-Chunyun (1–9 January) and Chunyun (10–22 January). Cities are nodes, and OD flows define weighted edges. COVID-19 data (city-level confirmed cases) were sourced from Tianditu’s coronavirus map, with counts taken as of 6 February 2020 (14 days after the Wuhan lockdown) to allow for the incubation period.
Network measures: Weighted Degree Centrality (WDC), S_i = Σ_j W_ij, where W_ij is the flow weight between cities i and j, captures node hierarchy via total incident flow. Betweenness Centrality (BC) measures the fraction of shortest paths between node pairs that pass through a node, identifying the hubs and bridges that control flow.
Community detection: The network was partitioned into subnetworks via a modularity-based community detection method; higher modularity (typically 0.3–0.7 in real networks) indicates clearer community structure.
Analysis: The authors mapped OD, WDC, and BC during the non-Chunyun period, detected subnetworks, analysed inter-subnetwork OD links, and examined the Chunyun-era spread of COVID-19 by: (i) identifying Wuhan’s subnetwork structure (Hubei Province), (ii) using a Chord Diagram to visualise the top destinations of migrants from Wuhan and the proportions within vs. outside Hubei, and (iii) correlating outbound migration from Wuhan (log-transformed) with confirmed case counts (log-transformed) in other cities.
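The weighted degree measure and the modularity score described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the OD records and the two-community partition are invented for illustration, the two directions of each city pair are merged into one symmetric weight (an assumption, since the summary does not say how direction was handled), and the function names `undirected_weights`, `weighted_degree`, and `modularity` are hypothetical. The sketch only scores a given partition with the standard weighted modularity formula; it does not search for the best partition as a community detection method would.

```python
from collections import defaultdict

# Hypothetical OD records: (origin, destination, migration index x 100).
# Values are invented for illustration; they are not Baidu data.
od_records = [
    ("Wuhan", "Xiaogan", 40),
    ("Xiaogan", "Wuhan", 30),
    ("Wuhan", "Huanggang", 35),
    ("Wuhan", "Beijing", 5),
    ("Beijing", "Tianjin", 50),
    ("Tianjin", "Beijing", 45),
]


def undirected_weights(records):
    """Merge both directions of each city pair into one symmetric weight W_ij."""
    weights = defaultdict(int)
    for origin, destination, flow in records:
        if origin != destination:  # ignore self-loops
            weights[frozenset((origin, destination))] += flow
    return dict(weights)


def weighted_degree(weights):
    """S_i = sum_j W_ij: total flow on all edges incident to city i."""
    strength = defaultdict(int)
    for pair, w in weights.items():
        for city in pair:
            strength[city] += w
    return dict(strength)


def modularity(weights, community_of):
    """Weighted modularity Q of a given partition of an undirected graph.

    Q = sum over communities c of [ L_c / m - (d_c / (2 m))^2 ], where m is
    the total edge weight, L_c the edge weight inside c, and d_c the summed
    weighted degree of the nodes in c.
    """
    total = sum(weights.values())
    strength = weighted_degree(weights)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for pair, w in weights.items():
        a, b = tuple(pair)
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    for city, s in strength.items():
        degree_sum[community_of[city]] += s
    return sum(
        internal[c] / total - (degree_sum[c] / (2 * total)) ** 2
        for c in degree_sum
    )


weights = undirected_weights(od_records)
strength = weighted_degree(weights)

# Hypothetical partition that follows provincial/regional proximity.
partition = {
    "Wuhan": "Hubei", "Xiaogan": "Hubei", "Huanggang": "Hubei",
    "Beijing": "North", "Tianjin": "North",
}

for city, s in sorted(strength.items(), key=lambda kv: -kv[1]):
    print(f"{city}: weighted degree {s}")
print(f"modularity Q = {modularity(weights, partition):.3f}")
```

In this toy network most of the flow stays inside the two groups, so the partition scores within the 0.3–0.7 band that the summary cites as typical for real networks with clear community structure.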
Key Findings
- Spatial concentration: During the non-Chunyun period, population flows were predominantly east of the Hu Huanyong Line, with hotspots centred on provincial capitals and regional hubs.
- Hierarchical structure: WDC maps show core cities formed across China. Notably, Chengdu, Zhengzhou, and Xi’an (central/western China) exhibited high attractiveness despite not being in the three leading eastern agglomerations (Yangtze River Delta, Beijing–Tianjin–Hebei, Pearl River Delta), reflecting industrial transfers from East to Central China.
- Control hubs: BC distribution broadly aligns with WDC. Wuhan ranked 6th in both WDC and BC, indicating strong influence in national flow even pre-Chunyun.
- WDC–BC divergence: Some cities (e.g., Foshan, Dongguan) had high WDC but relatively low BC due to concentrated flows with a few nearby cities (e.g., strong Dongguan–Shenzhen linkage), limiting broader bridging roles.
- Community structure: The national network split into 12 subnetworks whose boundaries largely coincide with provincial borders, indicating administrative divisions and geographic proximity strongly shape movement communities. Inter-subnetwork OD revealed a primary structure anchored by Beijing, Shanghai, Guangzhou, and Chengdu–Chongqing, linked to northern periphery cities (Changchun, Harbin, Shenyang, Hohhot). Long-distance links rely on aviation, though competition from expanding high-speed rail is increasing.
- Hubei/Wuhan focus: Hubei (with Wuhan as capital) forms a single subnetwork; Wuhan’s strongest ties are intra-provincial. During Chunyun, approximately 68.93% of migrants from Wuhan went to cities within Hubei (notably Xiaogan and Huanggang). The paper also notes roughly 70% intra-provincial movement for Hubei.
- Distance effect and policy response: Cities closer to Wuhan suffered more severe outbreaks; strict lockdowns and interprovincial flow controls (e.g., Wuhan, Huanggang) were implemented to curb spread.
- Statistical association: The number of migrants from Wuhan is significantly and positively correlated with confirmed COVID-19 cases in destination cities during Chunyun (r = 0.943, p < 0.01, N = 321). A noted outlier is Wenzhou, which had relatively low inbound flow from Wuhan but high case counts, possibly due to higher business travel and contact rates. Overall, mobility network structure and flow intensity strongly relate to early epidemic diffusion patterns, underscoring the utility of movement data for outbreak control.
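The statistical association in the last finding is a Pearson correlation between log-transformed outbound flow and log-transformed case counts. The following is a minimal sketch of that calculation, not the authors' code: the city names and numbers are invented, the function names `pearson` and `log_log_correlation` are hypothetical, and dropping cities with a zero flow or zero cases before taking logarithms is an assumption, since the summary does not say how such cities were treated. The base of the logarithm does not affect the result, because Pearson's r is unchanged by rescaling either variable.

```python
import math

# Hypothetical (outbound flow from Wuhan, confirmed cases) per destination city.
# Invented numbers for illustration only; not the study's data.
observations = {
    "City A": (1200.0, 2500),
    "City B": (800.0, 1900),
    "City C": (90.0, 120),
    "City D": (40.0, 75),
    "City E": (15.0, 20),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_log_correlation(pairs):
    """Pearson r between log10(flow) and log10(cases).

    Assumption: pairs with a non-positive flow or case count are skipped,
    because their logarithm is undefined.
    """
    kept = [(flow, cases) for flow, cases in pairs if flow > 0 and cases > 0]
    xs = [math.log10(flow) for flow, _ in kept]
    ys = [math.log10(cases) for _, cases in kept]
    return pearson(xs, ys)


r = log_log_correlation(observations.values())
print(f"log-log Pearson r = {r:.3f}")
```

Working on the log scale keeps a handful of very large flows (such as those to nearby Hubei cities) from dominating the coefficient, which is a common reason for transforming heavily skewed counts before correlating them.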
Discussion
The findings address the research questions by demonstrating: (1) a clear hierarchy of cities within China’s population flow network, in which core cities concentrate flows and hubs (high BC) exert control over connectivity; (2) a subnetwork structure that largely aligns with provincial boundaries and geographic proximity, reflecting administrative and spatial determinants of movement; and (3) a robust link between outbound flows from Wuhan and early COVID-19 spread, which supports the predictive value of mobility networks for epidemic control during Chunyun. In policy terms, the evidence supports timely provincial and municipal lockdowns and interprovincial travel restrictions to mitigate spread. The identified core hubs and inter-subnetwork corridors can guide targeted interventions, emergency resource allocation, and transport policy. The analysis also highlights competition and complementarity between aviation and high-speed rail for intercity connectivity, informing transport planning. The results endorse building emergency management systems that use big data on mobility to anticipate and respond to outbreaks.
Conclusion
This study uses Baidu migration data to map and analyse China’s intercity population flow network before and during Chunyun 2020, revealing hierarchical core cities, provincially aligned subnetworks, and strong associations between Wuhan’s outbound migration and early COVID-19 diffusion. Methodologically, combining WDC, BC, and community detection provides a comprehensive view of node importance and community structure. Empirically, the dominance of intra-provincial flows (e.g., in Hubei) and the high correlation (r = 0.943) between migration and cases underscore the value of mobility data for outbreak prediction and control. Future research directions include: assembling longitudinal mobility datasets to capture seasonality and structural change; integrating multi-modal transport data (rail/air/bus) to examine mode-specific roles; assessing impacts of mobility on industrial development and regional planning; validating and triangulating Baidu data with mobile signalling and ticketing datasets; and tracing fine-scale spreading paths within cities using higher-resolution mobility signals to enhance preparedness and response strategies.
Limitations
- Data coverage and bias: Baidu migration indices derive from smartphone users; groups with lower smartphone usage (e.g., children, seniors) may be underrepresented.
- Validation needs: Reliance on a single open-data source requires triangulation with independent datasets (e.g., rail/air/bus ticket sales, mobile signalling) to confirm reliability.
- Temporal scope: The analysis covers 1–22 January 2020; broader longitudinal data are needed to capture long-term dynamics and seasonality.
- Modal specificity: The study does not explicitly disentangle contributions of transport modes; integrating mode-specific networks could clarify mechanisms of spread.
- Spatial granularity: City-level analysis cannot reveal intra-urban spreading paths; finer-scale mobility signals would improve resolution.
- Causal inference: Correlations between flows and cases are strong but do not by themselves establish causation, especially amid concurrent interventions and reporting lags.