Mapping out-of-school adolescents and youths in low- and middle-income countries

V. A. Alegana, C. Pezzulo, et al.

This study by V. A. Alegana, C. Pezzulo, A. J. Tatem, B. Omar, and A. Christensen examines physical accessibility to secondary schools and non-attendance rates among adolescents and school-age youths in Tanzania, Cambodia, and the Dominican Republic. It shows how distance to school relates to attendance and suggests ways to improve accessibility.

Introduction
The paper addresses persistent inequalities in access to secondary education among adolescents and school-age youths in low- and middle-income countries, where progress on reducing out-of-school rates has stagnated for nearly a decade. The authors highlight geographic distance to school as a key barrier alongside socio-economic, cultural, and gender factors. Leveraging new data on population distribution, school locations, and socio-demographics, the study aims to quantify fine-scale geographic accessibility and model secondary school attendance at 1 km resolution for Tanzania, Cambodia, and the Dominican Republic, to better understand sub-national heterogeneities and inform SDG 4.3.1 monitoring and policy.
Methodology
Study countries: Tanzania (mainland), Cambodia, and the Dominican Republic were selected to represent diverse LMIC contexts in Africa, Southeast Asia, and Latin America.
School locations: Governmental data sources provided geocoded secondary school locations: 3,258 schools in Tanzania; 1,615 in Cambodia (College, Lycee G10–12, Lycee G7–12); and 4,618 in the Dominican Republic. Age ranges aligned with national definitions (approx. 13–19 years depending on country).
Attendance data: Demographic and Health Surveys (DHS) cluster-level data were used: Tanzania 2015–16 (n=595 clusters), Cambodia 2014 (n=611), and Dominican Republic 2013 (n=476). DHS uses two-stage stratified sampling and provides coordinates for each cluster. Adjusted net attendance rates (ANAR) were computed following MEASURE DHS guidance, accounting for weights, strata, and the official secondary school age in each country. Numerators counted de facto secondary school-age adolescents attending any level (primary/secondary/higher) during the academic year; denominators were the de facto secondary school-age population. Rates were aggregated to clusters for mapping.
Geospatial covariates: Land cover came from MERIS GlobCover (300 m); elevation from HydroSHEDS/SRTM; roads from OpenStreetMap, NGA, and MapCruzin; night-time lights from VIIRS DNB; and gridded population at 1 km from WorldPop (RF-based dasymetric disaggregation).
Travel time modelling: Using AccessMod 5.0, the team built a 1 km friction surface combining land cover, roads, and slope (Tobler's hiking function) to estimate multi-modal travel speeds (e.g., 80 km/h on primary roads, 5 km/h walking on tertiary roads, 10 km/h cycling on residential roads). Travel time rasters to the nearest secondary school were computed nationally.
Spatial modelling of attendance: A two-stage model-based geostatistical approach was implemented. First, covariate selection via bestglm identified a parsimonious set per country from candidate variables including travel time to nearest school, enhanced vegetation index, night-time lights, and temperature variables. Second, Bayesian hierarchical spatial models were fitted in R-INLA using the SPDE approach with Matérn covariance. The linear predictor included the selected covariates and a spatial random field; a Gaussian likelihood was used for the weighted proportions. Penalized complexity priors were set for the SPDE parameters. Model calibration and sharpness were assessed with leave-one-out diagnostics: the probability integral transform (PIT) and the conditional predictive ordinate (CPO). Predictive performance was evaluated on a 20% hold-out using mean prediction error, mean absolute error (MAE), root mean squared error (RMSE), and Pearson correlation. Outputs included 1 km maps of predicted non-attendance and associated uncertainty (width of the 95% credible intervals).
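As a rough illustration of the slope term in the travel time step, the sketch below implements the standard published form of Tobler's hiking function, W = 6 · exp(−3.5 · |S + 0.05|) km/h, where S is slope expressed as rise over run. How AccessMod 5.0 scales or combines this with road and land-cover speeds is not described in this summary, so the function names and the example values are illustrative assumptions only.

```python
import math


def tobler_speed_kmh(slope: float) -> float:
    """Walking speed in km/h from Tobler's hiking function.

    `slope` is rise over run (dimensionless), negative when walking downhill.
    Standard published form; not taken from the paper's AccessMod configuration.
    """
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))


def walking_minutes(distance_km: float, slope: float) -> float:
    """Minutes needed to walk `distance_km` along a constant slope (hypothetical helper)."""
    return 60.0 * distance_km / tobler_speed_kmh(slope)


# Illustrative values, not figures from the study.
flat = tobler_speed_kmh(0.0)        # roughly 5 km/h on level ground
fastest = tobler_speed_kmh(-0.05)   # the function peaks at 6 km/h on a gentle downhill
uphill = tobler_speed_kmh(0.10)     # slower on a 10% uphill grade

print(f"flat: {flat:.2f} km/h, gentle downhill: {fastest:.2f} km/h, uphill: {uphill:.2f} km/h")
print(f"3 km on level ground: {walking_minutes(3.0, 0.0):.0f} min")
```

On level ground the function gives about 5 km/h, which is consistent with the walking speed quoted for tertiary roads.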
Key Findings
- School proximity and travel time: Mean straight-line distance to the nearest secondary school was 6.6 km in Tanzania, 3.3 km in Cambodia, and 1.3 km in the Dominican Republic. Mean travel time was 0.8 h (~50 min) in Tanzania, 0.4 h (~25 min) in Cambodia, and 0.1 h (~10 min) in the Dominican Republic.
- Covariate selection: Night-time lights and temperature variables were consistently selected as important predictors of attendance. Travel time to the nearest school was not retained in the predictive models but was strongly associated with non-attendance at sub-national level.
- Model performance (20% hold-out): Pearson correlation between predictions and observations was 0.62 for Tanzania, 0.78 for Cambodia, and 0.87 for the Dominican Republic. MAE was 0.29, 0.11, and 0.20, and RMSE was 0.39, 0.14, and 0.27, respectively.
- Out-of-school estimates:
• Tanzania (2016): Approximately 57.3% (54.5–58.3%) of secondary school-age adolescents/youths were out of school, translating to ~2.8 million. Regions with the highest non-attendance included Dodoma, Katavi, Mbeya, Mtwara, Njombe, Rukwa, Shinyanga, Simiyu, and Tabora (eight regions >60% out-of-school; ~1.01 million combined).
• Cambodia (2014): ~40.0% (37.4–42.3%) were out of secondary school, ~0.59–0.60 million. Môndól Kiri had ~50.2% (44.4–58.1%). 11 of 25 regions exceeded national averages.
• Dominican Republic (2013): ~10.7% (9.7–11.7%), ~0.1 million. About half of regions (n=17) exceeded the national average; these regions contained ~68.2% (n≈70,398) of adolescents and school-age youths out of school.
- Association of access with non-attendance: At administrative level 1, non-linear GAM fits showed strong associations between travel time and non-attendance, with R² of 73.3% for Tanzania, 68.8% for Cambodia, and 87.5% for the Dominican Republic.
- Fine-scale heterogeneity: 1 km maps revealed substantial within-country variation in accessibility and attendance; uncertainty maps reflected areas with sparse data or higher model uncertainty.
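The hold-out metrics reported under model performance can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the sign convention for mean prediction error (predicted minus observed), and the toy attendance proportions are all assumptions made for the example.

```python
import math


def holdout_metrics(observed, predicted):
    """Mean error, MAE, RMSE, and Pearson correlation for paired values.

    Assumes mean prediction error is defined as predicted minus observed;
    the summary does not state the sign convention.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_error = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    pearson = cov / math.sqrt(var_o * var_p)

    return {"mean_error": mean_error, "mae": mae, "rmse": rmse, "pearson": pearson}


# Toy cluster-level attendance proportions, invented for illustration only.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.4, 0.5, 0.9]

metrics = holdout_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because attendance is modelled as a proportion, MAE and RMSE are on a 0 to 1 scale, so an MAE of 0.11 corresponds to an average absolute error of about 11 percentage points.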
Discussion
The study demonstrates substantial sub-national heterogeneity in secondary school attendance across three LMICs and shows that physical accessibility, proxied by distance/travel time to the nearest secondary school, is strongly associated with non-attendance at regional level. By integrating school locations, population distribution, and DHS-derived attendance with a Bayesian geostatistical framework, the authors provide fine-resolution estimates suitable for targeting interventions and monitoring SDG 4.3.1. The strong association between travel time and non-attendance suggests that improving geographic access—such as by expanding the secondary school network or reducing travel burdens—could help lower out-of-school rates. The maps and metrics enable identification of high-burden regions (e.g., several Tanzanian regions with >60% out-of-school), supporting prioritization of resources. Although travel time was not selected as a direct predictor in the pixel-level models (with urbanization and temperature proxies performing better), its sub-national association with non-attendance highlights the relevance of accessibility within broader socio-spatial contexts.
Conclusion
The paper contributes a scalable geospatial framework to estimate secondary school accessibility and non-attendance at 1 km resolution, triangulating governmental school locations, DHS cluster-level attendance, and ancillary geospatial covariates. Across Tanzania, Cambodia, and the Dominican Republic, large sub-national disparities were quantified, with strong regional associations between longer travel times and higher non-attendance. These outputs can inform intervention planning, including optimizing school placement and targeting regions with high unmet need. Future work should extend to fine-scale optimization models for school location and consider additional determinants of non-attendance at both micro (household) and macro levels, including direct/indirect costs and quality of provision, to better disentangle barriers beyond physical access.
Limitations
Uncertainty in estimates arises from survey sampling design, limited cluster density in some areas, and overall model fit, as reflected in the width of the 95% credible intervals. DHS cluster coordinate displacement and geolocation imprecision may affect spatial interpolation but are reported in prior work to have minimal impact on such modelling. The predictive models did not retain travel time as a pixel-level covariate (with night-time lights and temperature dominating), which may limit direct inference about accessibility effects at the finest scale. Only three countries were analyzed, and some country-specific factors (e.g., costs, school quality) were not explicitly modelled.