Earth Sciences

A data-driven approach to rapidly estimate recovery potential to go beyond building damage after disasters

S. Loos, D. Lallemant, et al.

In the wake of disasters, identifying areas at risk of prolonged non-recovery is critical. This research by Sabine Loos, David Lallemant, Feroz Khan, Jamie W. McCaughey, Robert Banick, Nama Budhathoki, and Jack W. Baker uses data from the 2015 Nepal earthquake to reveal ongoing vulnerabilities that could impede recovery. By focusing on social and environmental factors rather than building damage alone, the approach could inform recovery strategies from the outset.

Introduction

The paper addresses persistent inequalities in post-disaster recovery, noting that policies often prioritize loss-based damage metrics that favor pre-disaster asset holders and overlook long-term social needs. Conventional post-disaster information systems and rapid assessments (e.g., PDNA) emphasize economically quantifiable damage, while non-traditional data streams largely quantify building damage. The authors propose shifting focus from immediate damage to non-recovery—identifying households and areas that fall behind over time—to inform more equitable and sustainable recovery policies. They demonstrate a rapid, data-driven approach that leverages readily available census, remotely sensed, and modeled data to estimate non-recovery in housing reconstruction after the 2015 Nepal earthquake. The central research question is whether commonly available geospatial and census-type variables can predict where households are unlikely to complete reconstruction within several years, thereby revealing obstacles and vulnerabilities that damage-only metrics miss.

Literature Review

The study builds on literature documenting disaster-exacerbated inequalities and critiques of loss-based aid frameworks. It references housing recovery policies and alternatives (needs-based, area-targeting, subsidiarity), and limitations of PDNA processes that often underrepresent social needs. The authors draw on vulnerability and resilience frameworks, including social vulnerability indices and sustainable livelihoods concepts, while critiquing index-based approaches for post-disaster contexts due to validation challenges and context specificity. Prior Nepal-focused studies highlight compounded hazards (e.g., landslides), remoteness, market access, labor/material costs, governance changes, and socio-economic factors influencing recovery. The paper positions its empirical, outcome-focused modeling as a complementary and more actionable alternative to pre-disaster, index-based vulnerability assessments.

Methodology

Study area: The eleven most affected districts outside Kathmandu Valley from the 2015 Nepal earthquake, emphasizing rural contexts targeted by the Earthquake Housing Reconstruction Program (EHRP). The program provided grants for severely damaged or collapsed homes over a five-year period.

Survey data: Round 5 (Sept–Oct 2019) of The Asia Foundation’s Independent Impact and Recovery Monitoring survey was used (n=5857 overall). The analysis included households in six severely hit or crisis-hit rural districts overlapping the study area (n=3484), and was further restricted to those reporting partial or full damage (n=3376) to control for initial damage and EHRP targeting.

Outcome variable: Binary indicator of non-reconstruction 4.5 years post-event (1 = not completed, n=727; 0 = completed/started, n=2649).

Predictors: An initial set of 32 variables representing sociodemographic, economic, environmental, and geographic factors, selected via exploratory interviews with local stakeholders, Nepal-specific recovery studies, and broader vulnerability and resilience theories. Only variables with data readily available in the days to weeks after a disaster were considered (from census, remote sensing, and modeled datasets). Data were harmonized to extract predictor values at surveyed household coordinates.

Variable selection: Highly collinear predictors were removed (Pearson r > 0.75). A random forest with an inserted simulated noise variable and Gini importance was used to identify predictors that consistently outperformed noise. The training data were bootstrapped 1000 times, and variables selected in more than 75% of runs were retained (12 variables). Four of these were then removed because partial dependence inspection showed negligible relationships, yielding 8 final predictors.

Final predictors (8):
- Hazard exposure: Earthquake Shaking Intensity (MMI via USGS ShakeMap); Rainfall-Triggered Landslide Hazard (BGS index).
- Rural accessibility and poverty: Remoteness (hours to nearest municipal HQ, World Bank; multimodal travel time); Tree Cover (% within ~30-minute walking catchment, from Landsat-derived canopy); Food Poverty Prevalence (% by LGU via small area estimation).
- Reconstruction complexity: Population Density (WorldPop, people per km²); Tap Water (% households with tap water per ward, 2011 census); Topographic Slope (° from CGIAR DEM).

Modeling: A random forest probability model estimates P(non-reconstruction | predictors). The training/testing split used stratified 6-fold partitioning (5 folds train ~84%, 1 fold test ~16%), ensuring similar outcome proportions and spatial coverage. Hyperparameters were tuned via grid search minimizing MSE; trees were grown until terminal nodes reached a minimum node size of 10% of the bootstrap sample; the probability at each terminal node was estimated as the proportion of Y=1.

Spatial prediction: Predictors were resampled or aggregated to a 300 m × 300 m grid to map non-recovery probabilities across the study region. The resulting map was compared with independent damage data from the Government of Nepal.

Validation: ROC AUC was used to assess performance. The random forest (RF) outperformed logistic regression (LR). Average training AUC: RF 0.817 vs LR 0.636; test AUC: RF 0.725 vs LR 0.592. Visual alignment between the predicted map and survey-based non-reconstruction patterns was also noted.
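Two of the steps described above, the collinearity screen and the ROC AUC check, are simple enough to sketch in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the summary does not say which member of a collinear pair was dropped, so the greedy "keep the first, drop later ones" rule and the use of the absolute value of r are assumptions, and the column names and toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_collinear(columns, threshold=0.75):
    """Keep columns in order, skipping any whose |r| with an
    already kept column exceeds the threshold (assumed greedy rule)."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictors: "elevation" nearly duplicates "slope".
columns = {
    "slope": [1, 2, 3, 4, 5, 6],
    "elevation": [2, 4, 6, 8, 10, 13],
    "density": [5, 1, 4, 2, 6, 3],
}
kept = drop_collinear(columns)

# Hypothetical outcomes (1 = not reconstructed) and predicted probabilities.
auc = roc_auc([1, 0, 1, 0, 0], [0.9, 0.2, 0.4, 0.5, 0.1])
```

Here `kept` retains `slope` and `density` and drops `elevation`, and `auc` equals 5/6 because five of the six positive-negative pairs are ranked correctly. An AUC of 0.5 corresponds to random ranking, which gives a sense of scale for the reported test values of 0.725 (RF) and 0.592 (LR).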

Key Findings

- Eight predictors strongly explain the probability of non-reconstruction (not completing housing reconstruction after 4.5 years), grouped into: (1) hazard exposure; (2) rural accessibility and poverty; (3) reconstruction complexity.
- Hazard exposure:
  • Areas experiencing high mainshock shaking (around MMI 8.5) have nearly 40% higher predicted probability of impeded reconstruction, even after controlling for household-level damage (likely due to community infrastructure and livelihood disruption).
  • Rainfall-triggered landslide hazard is associated with up to 20% higher predicted probability of non-reconstruction, reflecting compounding hazards post-earthquake.
- Rural accessibility and poverty:
  • Greater remoteness (travel time to municipal HQs) predicts substantially lower likelihood of reconstruction; the most remote households were nearly 20% less likely to reconstruct.
  • Tree cover shows a non-monotonic relationship; above ~40% tree cover, households were more likely to reconstruct, suggesting natural capital benefits (e.g., local materials, potential slope stabilization), especially where accessibility is low.
  • Higher food poverty prevalence correlates with higher non-reconstruction, emphasizing the role of human capital and food security in long-term recovery trade-offs.
- Reconstruction complexity:
  • Higher population density predicts higher non-reconstruction, potentially due to urban/peri-urban complexities (shared land, heritage constraints, permitting, reliance on external labor).
  • Higher tap water coverage at ward level correlates with higher non-reconstruction when holding other factors constant, possibly reflecting reconstruction logistics in denser, better-serviced settlements.
  • Steeper topographic slope increases non-reconstruction, likely due to site constraints and higher costs (e.g., retaining walls).
- Spatial patterns:
  • Predicted non-reconstruction hotspots are dispersed across central, western, eastern, and southern parts of the study area, differing from the north-biased pattern of highest damage near the Himalayas. This indicates that areas with lower damage can still face persistent recovery barriers due to social, geographic, and environmental factors.
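Statements such as "higher predicted probability at high shaking intensity" are the kind of quantity a partial dependence curve yields: one predictor is forced to a fixed value for every household, the model's predictions are averaged, and the process is repeated across a grid of values. The sketch below shows that calculation for any prediction function. The `toy_model`, its coefficients, the field names, and the sample rows are hypothetical stand-ins, not the fitted random forest or the survey data.

```python
def partial_dependence(predict, rows, feature, grid):
    """For each grid value, set `feature` to that value in every row
    and return the average prediction across rows."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)      # copy so the input rows stay unchanged
            modified[feature] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical probability model (NOT the paper's random forest)."""
    p = 0.1 + 0.05 * (row["mmi"] - 6.0) + 0.02 * row["remoteness_hours"]
    return min(max(p, 0.0), 1.0)      # clip to a valid probability


# Hypothetical households with two predictors each.
rows = [
    {"mmi": 6.0, "remoteness_hours": 1.0},
    {"mmi": 7.0, "remoteness_hours": 3.0},
    {"mmi": 8.0, "remoteness_hours": 5.0},
]

curve = partial_dependence(toy_model, rows, "mmi", [6.0, 7.0, 8.0])
change = curve[-1] - curve[0]         # change in average predicted probability
```

In this toy setup the averaged prediction rises from 0.16 at MMI 6 to 0.26 at MMI 8, a difference of 0.10. Whether the percentages reported above are absolute differences of this kind or relative changes is not stated in this summary.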

Discussion

The findings demonstrate that a rapid, data-driven estimate of non-recovery can reveal vulnerable communities that damage-only assessments miss. By empirically linking commonly available post-disaster data to a concrete recovery outcome (completion of reconstruction), the approach identifies context-specific obstacles (e.g., remoteness, food poverty, landslide risk) alongside broadly relevant factors (e.g., shaking intensity). This directly addresses the research aim of moving beyond damage metrics to inform equitable recovery planning. Policy relevance includes integrating ongoing risk (e.g., landslides) and social vulnerability (e.g., food insecurity, rural isolation) into grant eligibility, targeting, and support modalities, thereby helping mitigate spatial inequities caused by concentrating aid in highly damaged areas alone. Compared to index-based vulnerability mapping, the model provides an interpretable, validated, and actionable outcome metric with quantified uncertainty and can be adapted to other outcomes and contexts. The approach underlines that recovery is nonlinear and multifaceted, shaped by pre-existing vulnerabilities and practical constraints. Early availability of non-recovery estimates would support more nuanced decisions on where and how to invest in recovery capacity, balancing in-situ reconstruction versus resettlement, and considering community-level needs beyond building repairs.

Conclusion

The study introduces and demonstrates a rapid, data-driven approach to estimate non-recovery after disasters, focusing on housing reconstruction in Nepal following the 2015 earthquake. By relating surveyed recovery outcomes to readily available geospatial and census-type predictors, the model identifies key drivers of impeded reconstruction across hazard exposure, rural accessibility and poverty, and reconstruction complexity. The resulting maps and variable influences highlight regions and factors not captured by damage assessments alone, enabling earlier and more equitable recovery planning. The framework is generalizable: with appropriate local recovery outcomes and available predictors, it can be applied in other contexts and for other recovery dimensions (e.g., livelihoods, health, displacement) at different timescales. Future work should incorporate richer high-resolution social data (e.g., gender, caste, income), extend to multi-outcome, multi-hazard settings, and continue evaluating model transferability over time and across events.

Limitations

- Aid interactions: While government EHRP assistance was standardized, the model cannot control for external NGO or other assistance that may have influenced reconstruction in specific communities.
- Temporal transferability: It is uncertain how well the specific Nepal model translates to future earthquakes; some predictors may remain relevant, but relationships could change.
- Spatial resolution: Predictor data resolution limits capturing household-level heterogeneity; results are best interpreted as broader spatial patterns requiring ground validation.
- Variable coverage: Important Nepal-specific social factors (e.g., gender, caste) were not included due to lack of readily available, high-resolution data; exclusion does not imply irrelevance.
- Reliance on technical data: As with all data-driven models, ethical use and cautious interpretation are necessary; the model provides informative but uncertain predictions (test AUC ~0.725).
