Stress-testing the international poverty line and the official global poverty statistics

M. Moatsos and A. Lazopoulos

This paper examines the World Bank's approach to setting the international poverty line (iPL) used to track SDG 1. Michail Moatsos and Achillefs Lazopoulos critically evaluate the method's validity and argue that alternative strategies for monitoring global poverty are urgently needed.

Introduction

The study examines whether the World Bank’s international poverty line (iPL)—set at $2.15/day in 2017 PPP terms—provides a fit-for-purpose and defensibly valid measure of extreme poverty and whether resulting global poverty statistics are acceptably accurate and precise. Motivated by concerns raised by the World Bank’s Commission on Global Poverty (Atkinson Report) about methodological stability and total error estimation, the paper questions the World Bank’s decision to update the iPL using 2017 PPPs before 2030, contrary to Recommendation 10. The authors frame two research questions: (1) Is the World Bank’s iPL method internally and externally valid for measuring extreme (absolute) poverty? (2) Do the resulting statistics show acceptable accuracy and precision? The context includes the shift from 2011 to 2017 PPPs and the methodological change from a mean-based to a median-based reference across national poverty lines, with implications for SDG 1 monitoring and policy guidance.

Literature Review

The paper reviews the evolution of the iPL: earlier approaches (Ravallion et al., 2009; Ferreira et al., 2015) used small samples of national poverty lines (NPLs) and mean-based reference groups, whereas Jolliffe et al. (2022) derive the $2.15 iPL using the 2017 PPP round and the median of harmonized NPLs (Jolliffe & Prydz, 2016) across low-income countries, also proposing income-group-specific lines. Several implicit assumptions are noted: that country groupings are appropriate, the iPL measures extreme absolute poverty, and that accuracy/precision are adequate. A substantial literature critiques PPP use for poverty measurement (Allen, 2017; Deaton, 2001; Deaton & Heston, 2010; Deaton & Dupriez, 2011; Fischer, 2010; Moatsos, 2016, 2020; Reddy & Pogge, 2010; Subramanian, 2015; Sumner, 2010). Key concerns include: PPPs are designed for aggregate comparisons and may not reflect poor households’ baskets; reliability declines with larger cross-country differences and comparison-resistant items; changes in prices or consumption patterns in third countries can alter PPPs elsewhere; PPP estimates contain large relative errors (often ~15–17%); and each new PPP round has historically shifted poverty estimates (e.g., substantial changes in 2005; notable differences between 2011 and 2017 for Sub-Saharan Africa). The review also discusses the Atkinson Commission’s recommendation to hold the line constant in local real terms until 2030, and alternative approaches such as cost-of-basic-needs and multidimensional poverty (OPHI/MPI).

Methodology

The authors stress-test the iPL across four dimensions: internal validity, external validity, accuracy, and precision.
Internal validity: They define an ideal iPL process with four steps: (1) a globally agreed common definition of extreme poverty, operationalized nationally; (2) countries report priced NPLs annually; (3) NPLs are converted to a reference currency, ideally with PPPs tailored to poor households; (4) a representative line is derived for each relevant country group using defensible representativeness criteria. They then assess deviations of the current method from this ideal (e.g., heterogeneity of NPL concepts, use of general PPPs, choice of grouping, and reliance on the NPL median). They test the stability claim of the iPL value by examining medians across varying country sets and across alternative percentiles (40th–60th), and by simulating PPP uncertainty to assess how often stability holds. They also examine whether the iPL is effectively relative in nature by comparing it to relative-poverty-style constructions (percentiles of medians across NPLs and of global individual income/consumption). They analyze alternative groupings (e.g., using HDI bottom 40%) and compute the associated median line and implied global poverty rate.
Accuracy: They propose minimization criteria that align the international measure with national standards by minimizing differences between iPL-based and NPL-based poverty statistics across countries: (i) misallocated individuals, (ii) headcount rate, (iii) poverty gap, (iv) poverty severity, and (v) Watts index. They compute Euclidean distances across these criteria to locate optimal iPL values per World Bank income group and compare performance gaps versus the official iPL. They also compare country-weighted versus population-weighted medians.
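
The accuracy criteria can be made concrete with a small sketch. This is an illustration only, not the authors' implementation: the income vectors and national lines are hypothetical, the function names are invented here, every country gets equal weight, and the five criteria are stacked into a single Euclidean norm without any normalization, which may differ from the paper's approach.

```python
import math

def poverty_indicators(incomes, line):
    """Headcount, poverty gap, severity (squared gap) and Watts index."""
    n = len(incomes)
    poor = [y for y in incomes if y < line]  # incomes must be > 0 for Watts
    headcount = len(poor) / n
    gap = sum((line - y) / line for y in poor) / n
    severity = sum(((line - y) / line) ** 2 for y in poor) / n
    watts = sum(math.log(line / y) for y in poor) / n
    return headcount, gap, severity, watts

def criteria_gaps(incomes, candidate, npl):
    """Differences between a candidate line and the national line (NPL)."""
    low, high = sorted((candidate, npl))
    # Misallocated: people who are poor under one line but not the other.
    misallocated = sum(1 for y in incomes if low <= y < high) / len(incomes)
    cand = poverty_indicators(incomes, candidate)
    nat = poverty_indicators(incomes, npl)
    return [misallocated] + [c - n for c, n in zip(cand, nat)]

def total_distance(countries, candidate):
    """Euclidean norm of all criterion gaps, stacked across countries."""
    gaps = []
    for incomes, npl in countries.values():
        gaps.extend(criteria_gaps(incomes, candidate, npl))
    return math.sqrt(sum(g * g for g in gaps))

def best_line(countries, grid):
    """Grid search for the candidate line with the smallest distance."""
    return min(grid, key=lambda line: total_distance(countries, line))

# Hypothetical data: daily incomes (PPP $) and the national poverty line.
countries = {
    "A": ([1.0, 1.5, 2.0, 3.0, 5.0], 1.8),
    "B": ([1.2, 2.2, 2.6, 4.0, 8.0], 2.5),
}
grid = [round(1.5 + 0.05 * k, 2) for k in range(31)]  # 1.50 ... 3.00
optimum = best_line(countries, grid)
print(optimum, total_distance(countries, optimum))
```

A candidate line equal to a country's own NPL gives a distance of zero for that country, so the search trades off fit across countries with different national lines.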
External validity: They compare iPL-based poverty rates against widely used external benchmarks: OPHI/MPI; global cost-of-basic-needs (CBN) lines (Moatsos, 2016; Allen, 2017); and a food-only benchmark based on the EAT-Lancet healthy reference diet costs (Hirvonen et al., 2020), noting the direction and magnitude of differences.
Precision: They implement a Monte Carlo microsimulation incorporating PPP uncertainty (Rao et al., 2022). They draw 1,000 sets of PPP rates using the PPP means and standard errors, and for each set bootstrap the median of NPLs 100 times to capture median-estimation uncertainty, yielding 100,000 iPL realizations. They derive distributions and 95% confidence intervals for the iPL and global poverty rates (2017 benchmark), and contrast them with earlier 2011-based uncertainty results (Moatsos & Lazopoulos, 2021). They also examine how large a reduction in NPL dispersion and PPP uncertainty would be required to narrow confidence intervals to acceptable widths, and illustrate the sensitivity of trends via alternative randomized trajectories relative to the official series and counterfactuals that isolate PPP-vintage versus method effects.
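
The two-level structure of the precision exercise (PPP draws on the outside, a bootstrap of the median on the inside) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: the summary does not say which distribution the PPP draws follow, so independent normal errors are assumed here, and the national lines, PPP means, and standard errors are hypothetical values chosen only to make the example run.

```python
import random
import statistics

def simulate_ipl(npl_local, ppp_mean, ppp_se, n_ppp=1000, n_boot=100, seed=0):
    """Return n_ppp * n_boot simulated values of the median poverty line."""
    rng = random.Random(seed)
    names = list(npl_local)
    draws = []
    for _ in range(n_ppp):
        # One PPP draw per country (normal errors are an assumption);
        # the guard keeps a draw from ever being zero or negative.
        ppp = {c: max(rng.gauss(ppp_mean[c], ppp_se[c]), 1e-9) for c in names}
        lines = [npl_local[c] / ppp[c] for c in names]
        for _ in range(n_boot):
            # Bootstrap: resample countries with replacement, take the median.
            sample = rng.choices(lines, k=len(lines))
            draws.append(statistics.median(sample))
    return draws

def percentile_interval(values, level=0.95):
    """Simple percentile interval read off the sorted simulated values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int((1 - level) / 2 * last)], ordered[int((1 + level) / 2 * last)]

# Hypothetical inputs: NPLs in local currency, PPP means and standard errors.
npl_local = {"A": 900.0, "B": 45.0, "C": 310.0, "D": 12.0, "E": 2600.0}
ppp_mean = {"A": 450.0, "B": 20.0, "C": 140.0, "D": 6.5, "E": 1100.0}
ppp_se = {c: 0.15 * m for c, m in ppp_mean.items()}  # ~15% relative error

draws = simulate_ipl(npl_local, ppp_mean, ppp_se)
low, high = percentile_interval(draws)
print(len(draws), round(low, 2), round(statistics.median(draws), 2), round(high, 2))
```

With the default settings the sketch produces 1,000 × 100 = 100,000 simulated lines, mirroring the count reported above; the width of the resulting interval depends entirely on the assumed inputs.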

Key Findings
  • Updating methodology and PPP vintage materially changes poverty counts. Using the 2017 iPL method with 2011 PPPs yields an iPL of $1.77 (2011 PPP) and differences of about 144 million people (2011) or 111 million (2017) versus the official figures. Using the 2011 method with 2017 PPPs yields an iPL of $2.33 and differences of about 190 million (2011) or 150 million (2017). The gap between these counterfactuals is about 200–400 million people (2.7–5.7 percentage points), depending on year.
  • Claimed stability of the iPL median across sets of 11–40 poorest countries is not robust under PPP uncertainty: such stability occurs in only ~1 out of 10,000 PPP draws; for sets of 11–20 countries it occurs in 49 out of 10,000 draws (increasing with slightly wider tolerance bands but still low). Stability appears coincidental to exact PPP means and the 50th percentile choice.
  • Relative-nature evidence: Considering the world distribution, the global median is about $7/day; 40%–60% of that median is $2.80–$4.20/day (2017), implying poverty rates of 17.4%–32.33%. The current $2.15 iPL corresponds to roughly the 27th percentile of the global median, indicating a very low threshold consistent with a frugal relative measure across LICs and a proxy for extreme poverty elsewhere.
  • Minimization criteria identify alternative optimal lines by income group: LIC $1.90/day (35th percentile of LIC NPLs); LMIC $2.52/day (24th); UMIC $5.60/day (23rd); HIC $13.25/day (18th). Relative to these optima, the current iPL performs worse by 8% (LIC), 35% (LMIC), 11.5% (UMIC), and 72% (HIC) in Euclidean distance across the five indicators.
  • Weighting matters: A population-weighted median yields $2.04/day, while the mean of the poorest 40 countries' NPLs is $2.21/day (2017). The population-weighted median implies a global headcount rate about 1.1 percentage points lower than the official one, a relative reduction of ~11.5%.
  • External benchmarks tend to show higher poverty than the iPL: MPI comparisons across 2010–2020 suggest a mean absolute difference of ~12.2% (95% CI: ~0.06–52%) and median 5.2%, usually with higher poverty rates than iPL. CBN comparisons: vs. Moatsos (2016) mean absolute difference ~20.4% (95% CI: 0.2–54.7%), median 16.8%; vs. Allen (2017) across 20 countries mean ~9.4% (95% CI: 0–39.9%), median 3.5%—with roughly half above and half below iPL poverty rates. Food-only EAT-Lancet-based lines: mean absolute difference ~10% (95% CI: 0–38.7%), median 6%, generally higher than iPL.
  • Precision is low: Monte Carlo microsimulation yields a 2017 iPL 95% CI of $1.8–$2.8/day (mean ~$2.194; median ~$2.163). The 2017 global poverty rate mean is ~10.3% with a 95% CI of ~6.2%–17%, i.e., ~488–1,518 million people. Compared to 2011-based estimates (iPL 95% CI ~$0.91–$3.19; headcount 2%–28%, mean ~11%), uncertainty remains substantial though somewhat narrower. Achieving a ~2 percentage-point wide 95% CI would require implausibly large reductions (≈90% in NPL dispersion and ≈80% in PPP uncertainty).
  • Trend sensitivity: Randomized trajectories based on PPP uncertainty show intrinsic volatility exceeding the ~1 percentage-point COVID-related global poverty increase and indicate that PPP vintage changes have impacts comparable to PPP error-induced uncertainty; methodology and PPP-vintage changes nearly cancel each other in aggregate.
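
To illustrate why the weighting choice in the findings above moves the line, the sketch below contrasts a country-weighted (plain) median with a population-weighted one. The national lines and populations are hypothetical, and `weighted_median` is a helper written for this example: it returns the lower weighted median and does not interpolate between values.

```python
import statistics

def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value

# Hypothetical national poverty lines (PPP $/day) and populations (millions).
lines = [1.6, 1.9, 2.2, 2.4, 3.0]
populations = [40, 30, 10, 25, 5]

print(statistics.median(lines))             # 2.2: each country counts once
print(weighted_median(lines, populations))  # 1.9: populous countries dominate
```

In this toy case the populous countries have lower national lines, so weighting by population pulls the median down, which is the same direction as the $2.04 versus $2.15 difference reported above.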
Discussion

Findings indicate the World Bank’s updated iPL methodology has questionable internal validity: stability claims do not survive PPP uncertainty; the line’s derivation behaves partly like a relative-poverty construct; and results are highly sensitive to grouping and weighting choices. External validity is weak: compared to OPHI/MPI, CBN-based thresholds, and food-cost benchmarks, the iPL generally yields substantially lower poverty estimates, diverging from methods more closely tied to basic needs or multidimensional deprivation. Accuracy is suboptimal: the iPL chosen via a median-of-NPLs approach does not minimize deviations from national standards across key poverty indicators, and alternative, better-fitting iPLs by income group differ markedly. Precision is low: wide confidence intervals for both the iPL and global headcount imply substantial uncertainty, undermining robust inference about levels, changes, and geographic distributions—especially around shocks such as COVID-19. These results suggest that relying on a single global threshold (or even four income-group thresholds) built on general PPPs and heterogeneous NPLs risks mismeasurement and misallocation in global policy prioritization. The evidence supports greater emphasis on nationally grounded, conceptually consistent definitions (e.g., Copenhagen Declaration) and alternative measurement frameworks that better align with absolute deprivation and reduce reliance on PPP comparability across disparate economies.

Conclusion

The analysis leads to largely negative answers to both research questions. The iPL method, as currently implemented, is not defensibly valid internally or externally for measuring extreme poverty, and the resulting statistics exhibit insufficient accuracy and precision for confident global monitoring. Stability claims are coincidental, the line blends absolute and relative features without a clear extreme-poverty anchor, and uncertainty is too large for precise global assessments. The authors recommend abandoning one-size-fits-all iPL approaches in favor of: (1) re-centering on a commonly agreed definition of absolute poverty (Copenhagen Declaration) and ensuring countries operationalize and price comparable NPLs; (2) prioritizing alternative approaches such as cost-of-basic-needs, capabilities-based, and multidimensional measures; (3) adhering to Atkinson Commission guidance to maintain lines in local currency updated by CPIs for the poor to 2030; and (4) enhancing transparency and explicit uncertainty reporting (total error accounting). They urge that alternative methods for monitoring global poverty be officially considered with urgency.

Limitations

The precision analysis incorporates only a subset of total error sources (primarily PPP uncertainty and median-estimation uncertainty) and thus falls short of a complete total-error framework. Other uncertainties (e.g., imputed survey data, non-sampling errors in distributions or NPL construction, price collection biases, and PPPs for the poor vs. general PPPs) are not fully modeled. Uncertainty estimates for the 2011 and 2017 PPP-based iPLs are not strictly comparable because of methodological and data differences. The external benchmarks (MPI, CBN, EAT-Lancet food costs) are used as reference points rather than definitive standards and carry their own methodological limitations. Some replication differences may arise from rounding or data access constraints.
