
Preseason maize and wheat yield forecasts for early warning of crop failure
W. Anderson, S. Shukla, et al.
This study shows that global preseason crop yield forecasts for maize and wheat can be skillful a year or more before harvest when they are driven by multi-year ENSO forecasts. The authors, including Weston Anderson, Shraddhanand Shukla, and others, discuss what this means for early warning systems for food security.
Introduction
Effective humanitarian responses to food crises require months of planning for activities such as distributing drought-tolerant seeds, securing funds, arranging food aid logistics, and scaling nutrition assistance, creating a clear need for early warning information 6–12 months in advance. Routine crop yield forecasts at year-long leads are largely absent; most research and operational systems focus on within-season forecasts issued a few months before harvest rather than preseason forecasts made before planting. Preseason efforts exist for select crops and regions but typically at shorter lead times due to climate forecast limitations. Recent advances in multi-year forecasting of the El Niño–Southern Oscillation (ENSO) using machine learning and model-analog approaches provide an opportunity to extend preseason lead times, given ENSO’s widespread and well-documented influence on global crop yields. This study asks whether leveraging modern multi-year ENSO forecasts can enable skillful preseason yield forecasts for maize and wheat globally at lead times up to and beyond one year, to support anticipatory food security decision-making.
Literature Review
Past preseason or seasonal yield forecasting studies have targeted specific crops and regions, including sugarcane in South Africa, maize in Zimbabwe, wheat and sugarcane in Australia, rice in the Philippines, wheat, maize, and sugar beet in parts of Europe, and soybeans in the US. Global-scale efforts have primarily emphasized within-season prediction using seasonal climate indices and multi-model ensembles. ENSO has been repeatedly implicated as a dominant driver of interannual crop yield variability worldwide and has been used in earlier preseason systems. Advances in ENSO prediction skill, achieved with model-analogs and deep learning, now extend to year-two forecasts, while the known teleconnection structures vary by region, season, and crop sensitivity. The literature also notes challenges such as the spring predictability barrier, secular variations in forecast skill, and the need to account for the timing of crop sensitivity within the growing season.
Methodology
Overview: The study develops a probabilistic preseason crop yield forecast system that issues country-level forecasts of the probability that end-of-season yields fall in the below-normal tercile, conditioned on ENSO forecasts during the crop’s sensitive portions of the growing season. Forecasts are only issued prior to the start of the vegetative season.
ENSO observations and forecasts: ENSO state is characterized using the Oceanic Niño Index (ONI; ERSSTv5), with anomalies computed relative to a 30-year fair-sliding climatology that trails the forecast period. ENSO forecasts come from the NOAA/PSL and University of Colorado/CIRES model-analog systems, chosen for their long hindcast availability (dating to the 1850s) and a multi-year lead capability comparable to that of initialized dynamical models.
Crop calendars and target seasons: Growing seasons for maize and wheat are defined using GEOGLAM crop calendars and masks. Forecasts are issued only for countries in which the majority (>50%) of the main-season crop area is in the vegetative–reproductive phase during the target 3-month season, and only if the forecast is issued before the season begins. For each growing season, forecasts are generated for all overlapping 3-month target seasons within the vegetative period and then averaged to a single probability per lead prior to harvest.
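The target-season rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly area fractions are a hypothetical input (in the study they would derive from GEOGLAM crop calendars and masks), the ">50%" test is applied here to the mean fraction over the three months, and the function names are invented for this example. The check that a forecast is issued before the season begins is not shown.

```python
import numpy as np

MONTH_INITIALS = "JFMAMJJASOND"

def eligible_target_seasons(veg_fraction, min_share=0.5):
    """Return labels of 3-month seasons whose mean in-phase area share exceeds min_share.

    veg_fraction: 12 values (Jan..Dec) giving the fraction of main-season crop
    area in the vegetative-reproductive phase each month (hypothetical input).
    Averaging the fraction over the three months is an assumption.
    """
    frac = np.asarray(veg_fraction, dtype=float)
    seasons = []
    for start in range(12):
        months = [(start + k) % 12 for k in range(3)]  # wrap around the year end
        if frac[months].mean() > min_share:
            seasons.append("".join(MONTH_INITIALS[m] for m in months))
    return seasons

def combine_season_forecasts(probabilities):
    """Average the per-season below-normal probabilities into one forecast."""
    return float(np.mean(probabilities))

# Toy calendar: crop mostly in the vegetative-reproductive phase from June to September.
toy_fraction = [0, 0, 0, 0, 0.2, 0.6, 0.9, 1.0, 0.7, 0.3, 0, 0]
seasons = eligible_target_seasons(toy_fraction)
```

Each eligible season would receive its own ENSO-conditioned forecast, and `combine_season_forecasts` would then reduce those to the single probability issued for that lead.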
Yield data processing: Country-level FAOSTAT yield time series are quality controlled in four steps: values flagged as imputed or missing are removed; low-variance segments that may reflect infilling are screened using second differences (flagging stretches where four consecutive second differences are <50 kg/ha); a reporting issue for Argentina wheat is corrected; and countries with fewer than five remaining years are dropped. Historical boundary changes are handled by aggregating the relevant states (e.g., former USSR, Sudan/South Sudan, Yugoslavia, Belgium–Luxembourg, Czechoslovakia). Subnational datasets (US, China, Australia) from the 20th Century Crop Yield Statistics are included and updated to 2020. Expected yields (Ye) are computed with a low-frequency Gaussian filter (σ = 3 years), and percent yield anomalies are defined as 100 × (Y − Ye)/Ye, where Y is the reported yield; expressing anomalies relative to Ye accounts for nonstationary variance.
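As a rough illustration of the detrending and screening steps, the sketch below computes a Gaussian-smoothed expected yield, the percent anomaly, and a low-variance flag. It is not the authors' code. The edge handling of the filter, the use of absolute second differences, and all function names are assumptions made for this example.

```python
import numpy as np

def gaussian_trend(y, sigma=3.0):
    """Expected yield Ye: Gaussian-weighted moving average (sigma in years).

    Weights are renormalized near the ends of the record so that the trend is
    defined for every year; this edge handling is an assumption.
    """
    y = np.asarray(y, dtype=float)
    years = np.arange(y.size)
    # weights[i, j] = contribution of year j to the trend value at year i
    weights = np.exp(-0.5 * ((years[:, None] - years[None, :]) / sigma) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ y

def percent_anomaly(y, sigma=3.0):
    """Percent yield anomaly, 100 * (Y - Ye) / Ye."""
    y = np.asarray(y, dtype=float)
    ye = gaussian_trend(y, sigma)
    return 100.0 * (y - ye) / ye

def flag_low_variance(y, threshold=50.0, run=4):
    """Flag years inside any stretch of `run` consecutive small second differences.

    Comparing the absolute second difference with the threshold is an assumption.
    """
    y = np.asarray(y, dtype=float)
    small = np.abs(np.diff(y, n=2)) < threshold  # small[k] uses years k..k+2
    flagged = np.zeros(y.size, dtype=bool)
    for start in range(small.size - run + 1):
        if small[start:start + run].all():
            # second differences start..start+run-1 span years start..start+run+1
            flagged[start:start + run + 2] = True
    return flagged
```

A perfectly smooth series (for example, a linearly interpolated one) has second differences of zero and is flagged throughout, whereas a series with realistic year-to-year swings passes the screen.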
Forecast model: Historical yield anomalies are split into terciles. For each target season, historical years are categorized by ENSO phase (El Niño, EN; neutral, N; La Niña, LN) using ONI. For each phase, the empirical probability that yields fall in the below-normal tercile is computed. The issued forecast for the below-normal tercile is the sum of these phase-conditional probabilities, each weighted by the forecast probability of that ENSO phase during the target season: P(below-normal) = p(EN)·P(below|EN) + p(N)·P(below|N) + p(LN)·P(below|LN). Lead time is defined relative to harvest start, not to the target season.
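The weighted sum can be made concrete with a short sketch. The function names, the toy anomalies, and the fallback to the climatological 1/3 when a phase never occurs in the record are assumptions for illustration, not details taken from the study.

```python
import numpy as np

PHASES = ("EN", "N", "LN")

def phase_conditional_probs(anomalies, phases):
    """Empirical P(below-normal | ENSO phase) from historical years."""
    anomalies = np.asarray(anomalies, dtype=float)
    phases = np.asarray(phases)
    lower_tercile = np.quantile(anomalies, 1.0 / 3.0)
    below = anomalies <= lower_tercile
    probs = {}
    for phase in PHASES:
        in_phase = phases == phase
        # Fall back to the climatological 1/3 if a phase never occurred (assumption).
        probs[phase] = float(below[in_phase].mean()) if in_phase.any() else 1.0 / 3.0
    return probs

def forecast_below_normal(enso_probs, cond_probs):
    """P(below-normal) = sum over phases of p(phase) * P(below | phase)."""
    return sum(enso_probs[p] * cond_probs[p] for p in PHASES)

# Toy record in which every El Nino year had a below-normal yield.
toy_anomalies = [-12, -8, -5, 1, 2, 3, 6, 9, 10]
toy_phases = ["EN", "EN", "EN", "N", "N", "N", "LN", "LN", "LN"]
cond = phase_conditional_probs(toy_anomalies, toy_phases)
p_below = forecast_below_normal({"EN": 0.6, "N": 0.3, "LN": 0.1}, cond)
```

In this toy record the issued probability simply equals the forecast probability of El Niño. With real data the three conditional probabilities are rarely 0 or 1, so the forecast is pulled toward the climatological 1/3 whenever the ENSO forecast itself is uncertain.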
Evaluation: A leave-one-out cross-validation framework is used. Forecast skill is assessed using the area under the ROC curve (ROC score); a forecast with ROC>0.6 is considered “skillful.” Where forecasts are skillful, reliability diagrams relate forecast probability to observed frequency. Skill is summarized by lead-time bin (6–9, 10–13, 14–17, 18–21 months). Sensitivity analyses examine (1) single-season selection: operationally selecting one 3-month target season per country using the training-data Spearman correlation between yields and Niño3.4; (2) best-season (a posteriori) selection: choosing the target season with the highest ROC at 6–13 month leads; and (3) perfect ENSO: replacing probabilistic ENSO forecasts with perfect deterministic ENSO phase information to estimate an upper bound on achievable skill.
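A minimal sketch of the evaluation loop follows, assuming integer phase codes (0 = El Niño, 1 = neutral, 2 = La Niña) and an array of ENSO phase probabilities per year. Defining the held-out event with the training-sample tercile, and falling back to 1/3 for a phase absent from the training years, are choices made here for illustration. The ROC score is computed as the fraction of event/non-event pairs ranked correctly, which equals the area under the ROC curve.

```python
import numpy as np

def roc_score(event, prob):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    event = np.asarray(event, dtype=bool)
    prob = np.asarray(prob, dtype=float)
    pos, neg = prob[event], prob[~event]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # undefined without both outcomes
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def loo_hindcasts(anomalies, phases, enso_probs):
    """Leave-one-out hindcasts of P(below-normal).

    phases: integer codes 0 (EN), 1 (N), 2 (LN) for each historical year.
    enso_probs: array of shape (n_years, 3) with forecast phase probabilities.
    """
    anomalies = np.asarray(anomalies, dtype=float)
    phases = np.asarray(phases)
    enso_probs = np.asarray(enso_probs, dtype=float)
    n = anomalies.size
    events = np.zeros(n, dtype=bool)
    forecasts = np.zeros(n)
    for i in range(n):
        train = np.arange(n) != i  # hold out year i
        threshold = np.quantile(anomalies[train], 1.0 / 3.0)
        below = anomalies[train] <= threshold
        cond = np.full(3, 1.0 / 3.0)  # climatological fallback (assumption)
        for k in range(3):
            in_phase = phases[train] == k
            if in_phase.any():
                cond[k] = below[in_phase].mean()
        forecasts[i] = enso_probs[i] @ cond
        events[i] = anomalies[i] <= threshold
    return events, forecasts

# Toy "perfect ENSO" case: El Nino years (code 0) always have poor yields.
toy_phases = np.tile([0, 1, 2], 4)
toy_anoms = np.where(toy_phases == 0, -10.0 - np.arange(12), 5.0 + np.arange(12))
toy_enso = np.eye(3)[toy_phases]  # one-hot, i.e. the phase is known exactly
events, forecasts = loo_hindcasts(toy_anoms, toy_phases, toy_enso)
score = roc_score(events, forecasts)
```

Passing one-hot phase probabilities, as in the toy case, corresponds to the perfect-ENSO sensitivity test; replacing them with probabilistic ENSO forecasts gives the standard hindcast.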
Design choices: Forecasts are exclusively preseason (no updates once vegetation has begun) to maximize utility for anticipatory action, recognizing that in-season earth observation approaches may outperform climate-based models later in the season.
Key Findings
- Global preseason skill: Using only ENSO information, skillful preseason forecasts (ROC>0.6) are achievable up to and beyond one year prior to harvest in substantial areas. At a one-year lead, skillful forecasts cover approximately 15% of maize and 30% of wheat harvested area worldwide.
- Spatial patterns: Maize skill is strongest in Southeast Africa and Southeast Asia; wheat skill is strongest in parts of South and Central Asia (notably India), Australia, and Southeast South America.
- Long-lead wheat skill: Wheat forecasts remain skillful in some locations at 18–21 month leads; forecasts with ROC>0.6 persist over ~20% of harvested area at 18–21 months, though ROC>0.65 covers only ~5%.
- Distribution across countries: Both crops have ROC>0.6 in ~8–12% of countries. The larger wheat harvested-area coverage arises from concentration of wheat production in countries with skill (e.g., India, Australia, Argentina) rather than broader country-level prevalence of skill.
- Lead-time dependence: For maize (South Africa, Thailand), ROC declines roughly monotonically with lead and forecasts become unskillful at around 20 months; in several wheat-producing countries, declines are slower and can be non-monotonic.
- Reliability and sharpness: At 10–13 month leads, forecasts are generally reliable and sharp in South Africa, Zimbabwe, India, Argentina, and Thailand; forecasts for Iran tend to be under-confident. Issued probabilities often range from <20% to >60% (except in Argentina, where they stay near the climatological ~33%).
- Quantitative benchmarks at 6–9 month leads: For wheat, countries with ROC>0.55, >0.6, >0.65 represent 34%, 25%, and 18% of harvested area; for maize, 27%, 7%, and 2%, respectively.
- Target-season selection improves skill: Operational single-season selection at 10–13 month leads increases the share of harvested area with ROC>0.6 from 3% to 15% (maize) and from 23% to 30% (wheat). Combining a-posteriori best-season selection with perfect ENSO information yields an upper bound of skillful coverage near 39% (maize) and 40% (wheat) at 10–13 month leads.
- ENSO forecast characteristics: Skill dips for forecasts issued before the spring predictability barrier (Feb–Apr). ENSO forecast skill can re-emerge at longer leads (e.g., late-summer issuances improving again near a lead of ~15 months), which is reflected in non-monotonic crop forecast skill.
- Mechanistic insight: Enhanced year-two predictability of La Niña following strong El Niño events likely contributes to long-lead skill in India; La Niña years there nearly always avoid poor yields.
- Case study (1982/83 El Niño): At 10–13 month leads, the system correctly indicated elevated risk in Southeast Africa and Southeast Asia; at 14–17 months, it missed failures in West Africa and the US due to ENSO forecasts reverting near climatology, illustrating limitations of ENSO-only and year-two forecast sharpness.
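The harvested-area percentages quoted in the findings above are area-weighted shares of countries that exceed a ROC threshold. A minimal sketch of that calculation is given below; the function name and the toy ROC scores and areas are invented for illustration and are not values from the study.

```python
import numpy as np

def skillful_area_share(roc, area, threshold=0.6):
    """Percent of total harvested area in units whose ROC exceeds the threshold.

    Units with no forecast (ROC given as NaN) count as not skillful.
    """
    roc = np.asarray(roc, dtype=float)
    area = np.asarray(area, dtype=float)
    skillful = roc > threshold  # NaN compares as False
    return 100.0 * area[skillful].sum() / area.sum()

# Hypothetical ROC scores and harvested areas for four countries.
toy_roc = [0.70, 0.55, 0.62, float("nan")]
toy_area = [30.0, 50.0, 10.0, 10.0]
share_060 = skillful_area_share(toy_roc, toy_area, threshold=0.6)
share_065 = skillful_area_share(toy_roc, toy_area, threshold=0.65)
```

Because the share is weighted by area, a few large producers with skill can yield high coverage even when only a small fraction of countries is skillful, which is the pattern reported for wheat.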
Discussion
The study demonstrates that modern multi-year ENSO forecasts can underpin skillful preseason probabilistic forecasts of maize and wheat yields at operationally relevant lead times of up to a year. Skill patterns align with known ENSO teleconnections and with the timing of climate sensitivity within crop growth stages. Targeting specific, ENSO-sensitive sub-seasons within the growing period substantially improves skill and offers a practical pathway for operational systems. Nonetheless, even with perfect knowledge of future ENSO phase, the attainable global coverage of skillful forecasts appears capped at roughly 40% of harvested areas for maize and wheat, revealing inherent limits of ENSO-based predictability. Thus, while year-two ENSO forecast improvements would directly enhance long-lead crop yield prediction, broadening forecast drivers to include other modes of climate variability and region-specific climate-yield linkages is needed to expand coverage and robustness. The results underscore both the promise for anticipatory action (e.g., input distribution, funding mobilization) and the boundaries imposed by teleconnection strength, forecast barriers (spring predictability barrier), and non-ENSO influences on yields.
Conclusion
By exploiting advances in multi-year ENSO prediction, this work establishes that preseason, ENSO-driven probabilistic forecasts can provide skillful early warnings of below-normal maize and wheat yields up to a year before harvest across significant portions of global cropland, with wheat retaining skill in some regions beyond 18 months. Selecting a single, climate-sensitive target season within the growing period yields substantial operational gains, approaching the ceiling implied by perfect ENSO information. To further improve coverage and lead times, future research should (1) incorporate additional climate modes and regionally relevant predictors, (2) refine identification of the most climate-sensitive growth windows, and (3) continue advancing year-two ENSO forecast skill to bolster long-lead performance.
Limitations
- ENSO-only driver: Forecast skill is limited to regions and seasons with strong ENSO teleconnections; many yield failures arise from non-ENSO factors, constraining coverage even under perfect ENSO knowledge (~40% ceiling at 10–13 month leads).
- ENSO forecast limitations: Skill and sharpness degrade around the spring predictability barrier and in year-two leads; non-monotonic forecast behavior introduces uncertainty at long leads.
- Data quality and processing: FAOSTAT data required extensive screening for imputed or low-variance values; residual reporting issues and aggregation of countries (due to boundary changes) may affect precision. Limited years in some countries reduce statistical robustness.
- Model simplicity: The categorical tercile approach, while robust, cannot capture magnitude of anomalies or within-season dynamics; no assimilation of in-season earth observations is performed by design.
- Temporal issuance constraint: Forecasts are only issued pre-vegetative-season; potential skill gains from updating during the season are not explored here.