Forecasting trends in food security with real time data

Food Science and Technology

J. Herteux, C. Raeth, et al.

This research, conducted by Joschka Herteux, Christoph Raeth, Giulia Martini, Amine Baha, Kyriacos Koupparis, Ilaria Lauzana, and Duccio Piovani, presents a quantitative methodology for forecasting food consumption levels in Mali, Nigeria, Syria, and Yemen. Using data from the WFP's real-time monitoring system, the study finds that Reservoir Computing generally outperforms the benchmark models it is compared against, supporting its use as the basis of an early warning system for food insecurity.

Introduction
The study addresses the need for robust, timely early warning systems to anticipate food insecurity, especially amid compounding crises such as conflict, climate extremes, and economic shocks exacerbated by the COVID-19 pandemic and the war in Ukraine. Humanitarian agencies like the World Food Programme (WFP) require rapidly updated, quantitative assessments to plan and target assistance. WFP's Real-Time Monitoring (RTM) framework provides daily, sub-national indicators, including the prevalence of insufficient food consumption derived from the Food Consumption Score (FCS). The research question is whether machine-learning-based time series forecasting methods can accurately predict the prevalence of insufficient food consumption 60 days ahead at sub-national resolution, updating daily, to support operational decision-making. The study justifies the 60-day horizon as operationally valuable for logistics and retargeting in humanitarian contexts, where distributions and programmatic adjustments occur on weekly to monthly cycles. The paper positions machine learning as a promising approach because it can model complex, locally specific dynamics without extensive domain-specific mechanistic modeling, provided adequate training data are available.
Literature Review
The paper situates its contribution within a growing body of work applying machine learning to food security. FAO and others have explored long-term, country-level forecasts; ML has been used to classify households by caloric intake (e.g., Uganda), and the World Bank has modeled Integrated Food Security Phase Classification (IPC) phases across multiple countries, identifying drivers via LASSO and panel VAR, and incorporating news-derived text features that improved Random Forest predictions. Other studies used LSMS household data with LASSO to forecast FCS, HDDS, and rCSI (e.g., Malawi), XGBoost to predict FEWS NET IPC phase transitions, and deep learning to forecast FCS/HDDS (e.g., Burkina Faso), noting computational costs and limited transferability. More broadly, ML time-series forecasting has been applied successfully in epidemiology, finance, energy, and climate. Recent work also shows the potential of text streams for early warning. Reservoir Computing (RC) has been shown to perform well in time-series forecasting, sometimes outperforming LSTMs, including for food prices. The paper aims to extend these insights to operational, daily RTM data for sub-national food security forecasting.
Methodology
Data: The target is the daily, sub-national prevalence of insufficient food consumption derived from the Food Consumption Score (FCS) from WFP RTM. Additional features include:
- Food-based coping (rCSI-derived prevalence).
- Climate: rainfall, rainfall anomalies (1- and 3-month), NDVI and NDVI anomaly (primarily dekadal), from CHIRPS and MODIS via WFP Climate Explorer.
- Conflict: ACLED-based fatalities for battles, violence against civilians, and explosions/remote violence aggregated via 90-day rolling sums at sub-national level.
- Economics: ALPS/PEWI (cereal/tubers price spike index), food and headline inflation (monthly, national), and daily currency exchange rates.
- External/known future indicators: Ramadan (binary), day-of-year, seasonal calendars.
Availability varies by country (Yemen, Syria, Mali, NE Nigeria) as summarized in Table 1 of the paper. The framework handles multidimensional time series and allows inclusion of exogenous variables known in the future without forecasting them (e.g., Ramadan, crop calendars).
Algorithms: The study compares ARIMA, CNN, LSTM, XGBoost (from prior work), and an ensemble Reservoir Computing (RC) approach based on Echo State Networks (ESNs). RC uses a fixed, randomly initialized reservoir (sparse recurrent layer) and trains only a linear readout (ridge regression). The model operates iteratively in closed loop for multi-step forecasting; known-future exogenous inputs are fed directly to reduce compounding error. An ensemble of 100 ESNs with identical hyperparameters but different random initializations outputs forecasts aggregated by the median to stabilize randomness and improve robustness. The RC implementation emphasizes simplicity (linear readout, limited hyperparameter tuning) yet retains nonlinear memory dynamics in the reservoir to capture complex patterns.
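The ESN ensemble described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the reservoir size, connection density, washout length, tanh activation, and the two-column input layout (previous target value plus one known-future exogenous series) are all assumptions made for the example, as are the function names.

```python
import numpy as np

def make_esn(n_inputs, n_reservoir=100, spectral_radius=0.9,
             input_scaling=0.5, density=0.1, seed=0):
    """Build fixed random input and reservoir weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scaling, input_scaling, (n_reservoir, n_inputs))
    w = rng.uniform(-1.0, 1.0, (n_reservoir, n_reservoir))
    w *= rng.random((n_reservoir, n_reservoir)) < density  # sparsify
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with a sequence of inputs; return all states."""
    state = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        state = np.tanh(w_in @ u + w @ state)
        states.append(state)
    return np.array(states)

def fit_readout(states, targets, ridge=1e-3):
    """Ridge regression for the linear readout (the only trained part)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

def forecast(y_hist, exo, horizon, seed, washout=20, **esn_kwargs):
    """Train one ESN on y_hist and forecast `horizon` steps in closed loop.

    exo must cover the history AND the forecast window
    (length len(y_hist) + horizon), since it is known in advance.
    """
    n = len(y_hist)
    w_in, w = make_esn(2, seed=seed, **esn_kwargs)
    # Input at time t is [y_t, exo_t]; the readout target is y_{t+1}.
    train_inputs = np.column_stack([y_hist[:-1], exo[:n - 1]])
    states = run_reservoir(w_in, w, train_inputs)
    w_out = fit_readout(states[washout:], y_hist[1:][washout:])

    state, y_prev, preds = states[-1], y_hist[-1], []
    for k in range(horizon):
        # Feed back the model's own prediction, but use the TRUE exogenous
        # value, which is known ahead of time.
        u = np.array([y_prev, exo[n - 1 + k]])
        state = np.tanh(w_in @ u + w @ state)
        y_prev = float(state @ w_out)
        preds.append(y_prev)
    return np.array(preds)

def ensemble_forecast(y_hist, exo, horizon, n_members=100, **esn_kwargs):
    """Median over ESNs that differ only in their random initialization."""
    members = [forecast(y_hist, exo, horizon, seed=s, **esn_kwargs)
               for s in range(n_members)]
    return np.median(members, axis=0)
```

Calling `ensemble_forecast(y_hist, exo, horizon=60)` returns one 60-step forecast. Taking the median rather than the mean keeps a single badly behaved reservoir from dominating the aggregate.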
Dynamic feature selection: To enhance transparency and adaptability across heterogeneous contexts, the authors introduce a feature-group hyperparameter in the grid search, choosing among grouped sets: FCS (target history + day-of-year + Ramadan), FCS+ (adds crisis coping, seasonal calendars), climate, economics, and all. Selection frequencies across splits indicate that the autoregressive groups (FCS/FCS+) are most often predictive, while the climate and economics groups are occasionally selected, showing that they are informative in specific situations. This dynamic selection was applied to RC, CNN, and LSTM (not ARIMA).
Training and evaluation: A large preliminary grid search (on Yemen) identified impactful hyperparameters and feasible ranges, which were then applied across countries. The RC grid included spectral radius (ρ), input scaling (S_input), ridge regularization (β), differencing option, and feature group. The CNN and LSTM grids varied learning rate, lookback window length, architecture parameters (kernel size, layers for CNN; units, dropout for LSTM), differencing, and feature group. The ARIMA grid covered (p,d,q). Walk-forward optimization evaluated models over sequential splits: for each split, hyperparameters were chosen based on median RMSE over preceding splits (to avoid leakage and reduce outlier sensitivity), then used to forecast the current test window (60 days) and compute RMSE. Training times per hyperparameter configuration were under ~1 minute on a standard local machine. The methodology was tested at first administrative level across Mali, NE Nigeria, Syria, and Yemen, with regions lacking data excluded.
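The walk-forward selection rule can be illustrated with a short sketch. It assumes the RMSE of every hyperparameter configuration on every split has already been computed and stored in a dictionary; the function name, the `min_history` parameter, and the configuration labels are hypothetical and exist only for this example.

```python
from statistics import median

def walk_forward_select(rmse_by_config, min_history=1):
    """For each split, pick the config with the lowest median RMSE on
    the PRECEDING splits only, then report its RMSE on the current split.

    rmse_by_config maps a configuration label to a list of RMSE values,
    one per sequential split (all lists have the same length).
    """
    n_splits = len(next(iter(rmse_by_config.values())))
    results = []
    for i in range(min_history, n_splits):
        # Only splits 0..i-1 inform the choice, so the test window of
        # split i never leaks into hyperparameter selection.
        best = min(rmse_by_config,
                   key=lambda cfg: median(rmse_by_config[cfg][:i]))
        results.append((i, best, rmse_by_config[best][i]))
    return results

# Toy example with two hypothetical configurations and four splits.
rmse = {
    "rho=0.9,group=FCS": [0.05, 0.04, 0.06, 0.03],
    "rho=0.5,group=all": [0.04, 0.07, 0.08, 0.09],
}
selection = walk_forward_select(rmse)
```

In this toy run the second configuration is chosen for split 1 because it was better on split 0, and the first configuration is chosen afterwards once its median over the earlier splits is lower. Using the median instead of the mean means one unusually bad split does not disqualify an otherwise good configuration.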
Key Findings
- Overall forecasting performance: RC showed slower error growth over the 60-day horizon and became the preferred model beyond approximately the 15th forecast step. At the end of the 60-day window, the aggregated median RMSE is reported as 4.9 percentage points (Fig. 3a), with RC outperforming CNN, LSTM, and ARIMA after early steps.
- Performance under change regimes: As the magnitude of variation in the target increases, all models’ errors rise due to the rarity of extreme conditions, but RC’s advantage widens, especially under severe deterioration, where RC outperforms all benchmarks by a clear margin (Fig. 3b).
- Per-country aggregation: RC and ARIMA emerge as generally better performers than CNN and LSTM, though no single model dominates across all countries (Fig. 3c), reflecting heterogeneous regional dynamics.
- Classification of trend categories (60-day windows): Using thresholds |Δ| < 0.04 (No Change), Δ < −0.04 (Improvement), and Δ > 0.04 (Deterioration), RC achieves the best overall performance across models. Reported metrics (Fig. 4) include: Total accuracy 0.39; class accuracies: Deterioration 0.07, Improvement 0.33, No Change 0.67; Precision for Deterioration 0.05; Recall for Deterioration 0.26. Models generally show conservative behavior (bias toward stable predictions), yielding low recall on deterioration. Elsewhere in the discussion, the authors note a total accuracy around 0.43 for RC and around 0.36 accuracy when focusing on deteriorating or improving classifications.
- Comparison to prior work: On 30-day forecasts (Supplementary Section 6), RC outperforms the prior XGBoost approach on the same dataset. The good performance of ARIMA in several aggregations aligns with findings in other studies.
- Training efficiency and robustness: Training times are short (<1 minute per hyperparameter configuration), and RC exhibits ease of tuning with generally reasonable hyperparameter regions, supporting ensemble use and operational retraining.
- Feature selection trends: Autoregressive feature groups (FCS, FCS+) are most frequently selected across RC, LSTM, and CNN; climate and economics groups are selected less often but contribute in specific contexts.
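To make the trend classification concrete, the sketch below maps a 60-day change Δ to one of the three categories using the ±0.04 thresholds and computes precision and recall for the Deterioration class. Two points are assumptions of this example rather than statements from the paper: Δ is taken to be the change in prevalence over the window, and a Δ of exactly ±0.04 is treated as No Change. The sample values are invented for illustration.

```python
def classify_trend(delta, threshold=0.04):
    """Map a change in prevalence over the window to a trend category."""
    if delta > threshold:
        return "Deterioration"   # prevalence rose by more than 4 points
    if delta < -threshold:
        return "Improvement"     # prevalence fell by more than 4 points
    return "No Change"           # boundary values land here (assumption)

def precision_recall(true_labels, pred_labels, positive="Deterioration"):
    """Precision and recall for one class; 0.0 when undefined."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    predicted = sum(p == positive for _, p in pairs)
    actual = sum(t == positive for t, _ in pairs)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

# Invented example: the forecasts understate every observed change.
true_deltas = [0.10, 0.06, -0.05, 0.01]
pred_deltas = [0.05, 0.02, -0.01, 0.00]
true_labels = [classify_trend(d) for d in true_deltas]
pred_labels = [classify_trend(d) for d in pred_deltas]
precision, recall = precision_recall(true_labels, pred_labels)
```

Here only one of the two true deteriorations is flagged, so recall for Deterioration is 0.5 while precision is 1.0. This is the kind of conservative behavior described above: forecasts that shrink toward stability miss deteriorations even when the direction of change is right.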
Discussion
The findings support RC as a strong choice for operational, quantitative early warning of insufficient food consumption. RC’s architecture balances the ability to capture nonlinear, temporal dependencies with simplicity and low overfitting risk, offering advantages over deep learning models (LSTM, CNN) in data-limited, noisy, high-dimensional settings typical of humanitarian applications. The closed-loop forecasting with exogenous known-future variables (e.g., Ramadan, seasonal calendars) is well suited to 60-day horizons and daily rolling updates. Despite ARIMA’s consistent competitiveness (especially in some country aggregations), RC generally provides better long-horizon performance and handles multidimensional inputs without extensive feature engineering. The dynamic feature selection enhances transparency and adaptability, often highlighting autoregressive components as primary drivers while allowing climate and economic variables to contribute when informative. The conservative tendency across models, including RC, to predict stability reflects data imbalance (fewer severe deterioration episodes) and contributes to low recall for deterioration; addressing this is important for early warning utility.
Conclusion
This work introduces a daily-updating, 60-day forecasting methodology for sub-national insufficient food consumption using WFP RTM data and exogenous drivers, demonstrating that an ensemble Reservoir Computing approach outperforms CNN, LSTM, ARIMA, and a prior XGBoost baseline in most settings, particularly for longer horizons and under deteriorating conditions. The method is computationally efficient, robust to noisy, high-dimensional inputs, and operationally practical, laying the groundwork for a global data-driven early warning system for food insecurity. Future research directions include: extending the forecasting horizon; improving detection and recall of severe deterioration; refining RC and benchmarking against best alternatives; assessing weekly/monthly granularities to reduce noise and preprocessing burdens; integrating independent forecasts for secondary variables; introducing custom weighting and data augmentation focused on rare deterioration events; and streamlining deployment and maintenance in humanitarian operations.
Limitations
- Data imbalance: The training data are biased toward small-variation trajectories, with few severe deterioration episodes, contributing to conservative forecasts and low recall for deterioration.
- Increasing multi-step error: Iterative closed-loop forecasting accumulates errors over 60-day horizons; performance degrades with larger target variations.
- Heterogeneity and transferability: Sub-national series exhibit diverse dynamics without clear country-level properties; model performance varies by country and region, and deep learning models show limited transferability without careful tuning.
- Secondary variable forecasting: When future exogenous values are unknown, projecting secondary variables introduces additional error; the RC is not optimized to forecast such variables.
- Operational data issues: Some regions lack data; varying temporal/spatial resolutions require interpolation and preprocessing, which the authors aim to reduce by moving to coarser (weekly/monthly) granularity in future work.
- Classification performance: Despite leading performance, RC still shows modest overall accuracy and particularly low recall for deterioration, limiting early warning sensitivity.