Mobility and phone call behavior explain patterns in poverty at high-resolution across multiple settings

J. E. Steele, C. Pezzulo, et al.

Discover how call detail records from mobile phone metadata are reshaping our understanding of poverty indicators in Namibia, Nepal, and Bangladesh. This research, conducted by Jessica E. Steele and colleagues, reveals the power of user mobility and call behavior metrics in estimating socioeconomic status, emphasizing the role of local context in poverty alleviation efforts.

Introduction
The United Nations’ first Sustainable Development Goal is poverty eradication, which requires timely and reliable estimates of who is poor and where they live. Traditional subnational poverty estimation relies on censuses and household surveys, often within small area estimation frameworks, but these data can be infrequent, delayed, incomplete, and spatially coarse in many low- and middle-income countries. Consequently, researchers are exploring alternative data sources that can provide higher temporal frequency and finer spatial detail.
Call detail records (CDRs) collected by mobile network operators capture metadata about communications and often include information on user location and social networks. These data can reveal mobility patterns, usage behavior, and recharge activity that correlate with socioeconomic status, and are passively collected at high spatial and temporal resolution. However, CDR-based analyses face biases: mobile ownership is skewed toward educated, male, urban, and wealthier groups; multiple operators partition market share; coverage and affordability vary geographically; and datasets may not span full annual cycles, introducing seasonal bias. Despite these limitations, numerous studies have demonstrated the utility of CDRs for socioeconomic inference at individual and population scales.
Broader adoption depends on demonstrating internal validity within countries and generalizability across contexts, while ensuring privacy, transparent and verifiable methods, and minimal burden on operators. This study asks whether a common, easily replicable set of CDR-derived features can generalize across multiple countries to predict poverty and wealth. Using comparable CDR features from Namibia, Nepal, and Bangladesh, the authors build Bayesian spatial models to produce national-scale poverty maps and evaluate performance via out-of-sample validation. They compare country-specific models using all available features to generalized models using only a common subset available in all three countries, and quantify spatial distributions and counts of people in poverty to assess implications for SDG monitoring.
Literature Review
Prior work has established that CDRs, including features of mobility, social networks, basic phone usage, and recharge behavior, correlate with socioeconomic status and can support data-for-development applications. Studies have shown relationships between network diversity and economic development, and that CDR-derived measures can predict poverty and wealth at individual and aggregate levels. At the same time, literature highlights key biases: uneven mobile phone ownership and usage patterns (skewed toward men, urban, educated, and wealthier populations), varying market shares across operators, limited coverage in rural/remote areas, and seasonal variability affecting mobility and phone use. Complementary data sources such as satellite imagery, night-time lights, and user-generated GIS platforms have been used to proxy economic activity and improve poverty estimation, especially where traditional data are sparse or outdated. Methods such as hierarchical Bayesian models and INLA have been applied for spatial inference with geolocated survey data. This study builds on these foundations by explicitly testing generalizability of a minimal, replicable set of CDR features across three countries, and by comparing performance against richer, country-specific feature sets.
Methodology
Study design and spatial framework: The authors processed all datasets to common projections, resolutions, and extents. Mobile tower coverage areas were approximated using Voronoi tessellations, forming the spatial units of analysis. CDR-derived indicators were computed at the individual level and then aggregated to the tower/Voronoi level, typically by mean, sum, or mode for users whose home tower fell within each polygon. DHS household clusters (geolocated to jittered centroids) were matched to Voronoi polygons; if multiple clusters occurred within one polygon, their mean wealth index was used.
Mobile phone data: CDRs were provided by MTC (Namibia), Ncell (Nepal), and Grameenphone (Bangladesh). Indicators encompassed basic usage (incoming/outgoing call and text counts and durations, percent nocturnal communications), mobility (number of unique towers visited, entropy of places, radius of gyration, percent interactions from home, number of active days, frequent places), and social network features (interactions per contact, entropy of contacts, percent Pareto). Bangladesh also included revenue/consumption-related features (top-up amounts/frequencies), multimedia messaging, and internet usage. Home location was assigned from the last call of the day in Namibia and Nepal, following a sensitivity analysis; Bangladesh used the most-used tower. Indicators were log-transformed to approximate normality, and multicollinearity was addressed via Pearson correlation thresholds and variance inflation factor filtering; Bangladesh required dimensionality reduction from ~150 to 14 non-collinear variables.
Survey data: The mean DHS wealth index per cluster was used for Namibia (2013), Nepal (2011), and Bangladesh (2011). Clusters were selected via two-stage stratified sampling. The wealth index, an asset-based measure, served as the response variable and was aggregated to Voronoi polygons.
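The mobility indicators named above can be illustrated with a minimal sketch. The summary does not give the authors' exact definitions, so this assumes common ones: number of places as the count of distinct towers, entropy of places as the Shannon entropy of the visit shares, and radius of gyration as the root-mean-square distance of visits from their centre of mass. The function names and the input layout (one tower coordinate per interaction) are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mobility_indicators(visits):
    """Summarise one user's interactions (assumed definitions, see lead-in).

    visits: list of (lat, lon) tower coordinates, one entry per interaction.
    """
    if not visits:
        raise ValueError("need at least one interaction")
    n = len(visits)
    counts = Counter(visits)

    # Shannon entropy (natural log) of the share of interactions at each tower.
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())

    # Centre of mass as the plain mean of coordinates; a reasonable
    # approximation only when the visited towers are close together.
    lat_c = sum(lat for lat, _ in visits) / n
    lon_c = sum(lon for _, lon in visits) / n

    # Radius of gyration: root-mean-square distance from the centre of mass.
    rg = math.sqrt(
        sum(haversine_km(lat, lon, lat_c, lon_c) ** 2 for lat, lon in visits) / n
    )
    return {
        "number_of_places": len(counts),
        "entropy_of_places": entropy,
        "radius_of_gyration_km": rg,
    }
```

Per-user values like these would then be aggregated (for example by mean) over all users whose home tower falls in a given Voronoi polygon, as the text describes.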
People in poverty: Using WorldPop 2011 population rasters, the authors overlaid model outputs and summed population within each Voronoi classified as poor based on DHS quintiles (lowest two quintiles: poorer and poorest). They compared totals and spatial distributions of the poor between full and generalized models.
Statistical modeling: For each country, hierarchical Bayesian areal models were fit using integrated nested Laplace approximations (R-INLA). Models incorporated a spatial random effect over the Voronoi adjacency graph using a Besag intrinsic conditional autoregressive prior with gamma/loggamma hyperpriors on precision parameters, accommodating spatial autocorrelation. Models were trained on a random 70% of polygons; predictive performance was evaluated on a 30% holdout using r² and RMSE. Two model types were fit: (1) full models using all non-collinear CDR covariates available per country; (2) generalized models using a common set of five features that were statistically significant in at least one full model and available in all countries: number of places (unique towers visited), outgoing call count, percent nocturnal communications, radius of gyration, and entropy of places. Posterior means and standard deviations provided prediction maps and uncertainty estimates.
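The holdout evaluation described above can be sketched as follows. This is not the authors' R-INLA pipeline; it only illustrates a random 70/30 split of polygons and the two reported scores. The summary does not say how r² was computed, so it is assumed here to be the squared Pearson correlation between observed and predicted wealth index. All function names are hypothetical.

```python
import math
import random


def split_polygons(polygon_ids, train_fraction=0.7, seed=0):
    """Randomly split polygon IDs into training and holdout lists."""
    shuffled = list(polygon_ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed reading of the reported r²)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)
```

In use, a model would be fit on the training polygons only, and `rmse` and `r_squared` would be computed on predictions for the holdout polygons, which is what makes the reported scores out-of-sample.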
Key Findings
- A generalized set of five CDR-derived features (number of unique towers visited, outgoing call count, percent nocturnal communications, radius of gyration, and entropy of places) explained approximately 50–65% of national variance in socioeconomic status across Namibia, Nepal, and Bangladesh.
- Model performance (out-of-sample, 30% test set):
  - Namibia: Full model r²=0.66, RMSE=0.48; Generalized model r²=0.65, RMSE=0.48.
  - Nepal: Full model r²=0.61, RMSE=0.53; Generalized model r²=0.60, RMSE=0.54.
  - Bangladesh: Full model r²=0.64, RMSE=0.48; Generalized model r²=0.50, RMSE=0.57.
- Country-specific important predictors (full models):
  - Namibia: Outgoing text counts; number of users with home at each tower; outgoing call counts; percent nocturnal communications; mobility metrics.
  - Nepal: Entropy of contacts; percent interactions from home; outgoing call duration; mobility measures (radius of gyration, number of places).
  - Bangladesh: Incoming text counts; top-ups (recharge amounts/frequencies); multimedia messaging; internet usage; percent nocturnal communications; number of places.
- Counts of people in poverty (DHS lowest two quintiles) differed between model types:
  - Namibia: Full 909,432 vs Generalized 857,761.
  - Bangladesh: Full 17,107,057 vs Generalized 9,832,711 (the generalized model underestimates and maps more areas as middle class).
  - Nepal: Full 6,436,490 vs Generalized 6,707,748 (the generalized model predicts more poor, especially in the south-southeast, aligning better with higher absolute numbers of poor in densely populated regions with lower mobility).
- Strongest generalized predictors: number of unique towers visited and percent nocturnal calls. Outgoing call count had a strong effect in Namibia; radius of gyration and entropy of places were prominent in Nepal.
- Temporal alignment matters: Namibia, with concurrent CDR and DHS years, showed the best performance and minimal difference between full and generalized models.
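To make the gap between the two model types concrete, the short sketch below works through the arithmetic on the poverty counts reported above. The counts are taken from the findings; the helper name and the choice to express the gap relative to the full model are illustrative assumptions.

```python
# Reported counts of people in poverty: (full model, generalized model).
POVERTY_COUNTS = {
    "Namibia": (909_432, 857_761),
    "Bangladesh": (17_107_057, 9_832_711),
    "Nepal": (6_436_490, 6_707_748),
}


def generalized_gap(full, generalized):
    """Return (absolute, percent) difference of generalized vs. full counts."""
    difference = generalized - full
    return difference, 100.0 * difference / full


for country, (full, generalized) in POVERTY_COUNTS.items():
    difference, percent = generalized_gap(full, generalized)
    print(f"{country}: {difference:+,} people ({percent:+.1f}% vs. full model)")
```

The generalized model counts 51,671 fewer poor people in Namibia (about 5.7% fewer) and 7,274,346 fewer in Bangladesh (about 42.5% fewer), but 271,258 more in Nepal (about 4.2% more), which shows how much larger the loss from dropping country-specific features is in Bangladesh.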
Discussion
The study demonstrates that a compact, easily replicable set of CDR-derived features capturing user mobility and call behavior can generalize across diverse contexts to explain a substantial portion of socioeconomic variation. This supports the feasibility of incorporating aggregated, anonymized CDRs into routine poverty mapping and SDG monitoring, particularly when census data are outdated or unavailable. The findings align with expectations that higher mobility and greater outgoing communications correlate with higher socioeconomic status, while a higher share of nocturnal (off-peak) communications associates with lower status. However, country-specific context and ancillary datasets are crucial for interpretation. In Bangladesh, inclusion of direct consumption proxies (top-ups, internet, MMS, SMS) was necessary to distinguish the poorest populations; omitting them reduced accuracy and under-identified poverty. In Nepal, mobility-driven generalized models better captured regions with high absolute numbers of poor due to population density and lower mobility, highlighting that incidence versus counts can lead to different spatial targeting decisions. The modeling framework leveraging INLA with spatial random effects effectively handles geolocated survey data and provides uncertainty estimates, facilitating policy-relevant mapping. Temporal concordance between CDR and survey data enhances performance, as seen in Namibia. Integrating complementary data sources (e.g., population density, remote sensing) can further improve predictions, especially in rural areas with sparse towers.
Conclusion
This work provides the first cross-country generalization of CDR-derived features for poverty mapping, identifying five replicable metrics that predict asset-based wealth with competitive accuracy across Namibia, Nepal, and Bangladesh. The approach enables high-resolution, timely estimates to support SDG-aligned targeting and monitoring, particularly where traditional data are limited. Key contributions include: (1) a validated minimal feature set usable across operators and contexts; (2) a robust Bayesian spatial modeling pipeline with uncertainty quantification; and (3) empirical evidence that ancillary, context-specific CDR features (e.g., top-ups, internet/SMS activity) can significantly enhance detection of the poorest populations. Future work should: test income and consumption-based poverty metrics to evaluate sensitivity to short-term fluctuations; expand integration with satellite and other big data sources; assess the influence of tower distribution on mobility-derived covariates; improve temporal alignment through repeated waves to enable dynamic monitoring; and develop data-sharing frameworks with MNOs that ensure privacy, transparency, and operational feasibility.
Limitations
- Selection bias: Mobile ownership and usage are skewed toward male, educated, urban, and wealthier populations; single-operator datasets reflect only that operator’s market share.
- Coverage and access: Variability in network coverage, handset affordability, and electricity availability, especially in rural/remote areas, may exclude the poorest and introduce spatial bias.
- Temporal bias: Some CDR datasets do not span full annual cycles; seasonality in mobility and usage can bias indicators. Nepal and Bangladesh had mismatched years between CDR and DHS, potentially reducing performance.
- Feature availability: Important predictors (e.g., top-ups, SMS, internet usage) were not uniformly available across countries, limiting generalized model accuracy (notably in Bangladesh).
- Poverty metric scope: DHS wealth index captures asset-based, longer-term status, not short-term income/consumption changes; models may not reflect rapid socioeconomic shifts.
- Spatial confounding: Mobility metrics (e.g., radius of gyration, entropy) may partly reflect tower placement/density rather than pure behavior; further assessment is needed.
- Anonymization and aggregation preclude individual-level inference; results apply at Voronoi/tower-area scales.