Machine learning prediction of the Madden-Julian oscillation

Earth Sciences


R. Silini, M. Barreiro, et al.

Discover how researchers Riccardo Silini, Marcelo Barreiro, and Cristina Masoller are pushing the boundaries of weather forecasting! By harnessing the power of artificial neural networks, they achieved a remarkable MJO prediction skill of up to 60 days, challenging traditional climate models. This groundbreaking study reveals the potential of AI in understanding tropical atmospheric variability.

Introduction
The Madden-Julian Oscillation (MJO) is a leading source of sub-seasonal predictability and influences tropical weather, monsoons, tropical cyclones, and extratropical teleconnections, with potential impacts on ENSO. Dynamical model skill for MJO prediction depends on model physics and initial conditions, with recent ensemble-mean skills generally in the 20–25 day range, and ECMWF exceeding four weeks. Skill varies with initial amplitude and phase, season, background state, and extratropical influences, with boreal winter typically yielding higher skill. While machine learning (ML) has been applied in climate for parameterization and forecasting (e.g., ENSO) and to reconstruct MJO indices or correct dynamical-model bias, direct ML prediction of the MJO had not been explored. This study addresses that gap by training two neural networks to forecast the real-time multivariate MJO (RMM) index over 1979–2020, assessing skill using standard COR and RMSE metrics, and analyzing dependence on season and initial phase.
Literature Review
Prior work shows dynamical MJO prediction skill is sensitive to model physics and initial conditions. In 2014, ensemble-mean skill peaked at 28 days (ECMWF) and 24 days (ABOM2), with most models at 15–20 days; more recently, ECMWF exceeds four weeks and most models reach 20–25 days. Skill depends on initial amplitude and phase, season, background mean state, and extratropical influence, with boreal winter skill up to 25–26 days and ECMWF approaching five weeks. ML has been used to parameterize convection and ocean mixing, to forecast ENSO, to reconstruct the historical MJO index, and to bias-correct MJO forecasts from dynamical models; however, direct ML-based MJO prediction had not been attempted prior to this work.
Methodology
Targets: The study forecasts the daily Wheeler–Hendon real-time multivariate MJO (RMM) index components (RMM1, RMM2), from which amplitude and phase are derived.

Data: RMM1/2, amplitude, and phase since June 1, 1974 were obtained from BOM/IRI; due to missing early data, the period January 1, 1979–December 31, 2020 is used. Data are L2-normalized. The train/validation/test split preserves temporal order: train (1979-01-01 to 2006-11-30), validation (2006-12-01 to 2015-11-30), and test (2015-12-01 to 2020-12-31).

Neural networks: Two architectures are used, each with an input layer of 300 units. (1) Feed-forward neural network (FFNN): the last point of the input layer feeds a hidden layer of 64 units (ReLU activation), followed by an output layer with τ units (τ = 5, 10, …, 100 lead days); each input/output timestep consists of two values (RMM1, RMM2). (2) Autoregressive recurrent neural network (AR-RNN): a single gated recurrent unit (GRU) layer with 64 units; after an initial warm-up period, predictions are generated one time step at a time and fed back autoregressively to update the hidden state. The GRU is chosen to mitigate vanishing gradients and for computational efficiency.

Training: Mean squared error (MSE) loss, Adam optimizer, batch size 16, and a maximum of 10 epochs with early stopping (patience = 1); training stops when the validation error increases. Backtesting consists of training on the train set, tuning hyperparameters on the validation set, and a single final evaluation on the held-out test set.

Skill metrics: Bivariate correlation coefficient (COR) and root-mean-squared error (RMSE) between observations (a1, a2) and forecasts (b1, b2) as a function of lead time τ; the standard thresholds COR = 0.5 and RMSE = 1.4 define skill. Amplitude and phase errors are computed by transforming (RMM1, RMM2) to polar coordinates (A, φ); the mean amplitude error EA(τ) and phase error Eφ(τ) are calculated following established formulas.
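The FFNN forecast step can be sketched in plain NumPy. This is a minimal illustration, not the authors' implementation: the weight arrays W1, b1, W2, b2 are hypothetical stand-ins for trained parameters, and the forecast is produced from the last (RMM1, RMM2) input point, as described.

```python
import numpy as np

def ffnn_forecast(rmm_last, W1, b1, W2, b2, tau):
    """Sketch of the paper's FFNN: the last (RMM1, RMM2) point feeds a
    64-unit ReLU hidden layer, then a linear output layer producing tau
    lead-time steps of (RMM1, RMM2).
    rmm_last: shape (2,); W1: (2, 64); b1: (64,); W2: (64, 2*tau); b2: (2*tau,).
    """
    h = np.maximum(0.0, rmm_last @ W1 + b1)  # ReLU hidden layer
    out = h @ W2 + b2                        # linear output layer
    return out.reshape(tau, 2)               # rows = lead days, cols = (RMM1, RMM2)
```

In practice the weights would be fit with MSE loss and the Adam optimizer (batch size 16, early stopping on the validation split), as stated in the training setup above.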
Seasonality and initial-phase analyses: Skill is further evaluated by season (DJF, MAM, JJA, SON) and by initial MJO phase (1–8) using the FFNN for efficiency.
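The bivariate COR and RMSE skill metrics described above can be sketched as follows; a minimal NumPy version, assuming observations and forecasts at a fixed lead time τ are stacked as (N, 2) arrays of (RMM1, RMM2) pairs:

```python
import numpy as np

def bivariate_skill(obs, fcst):
    """Bivariate correlation (COR) and RMSE between observed and forecast
    (RMM1, RMM2) pairs at a fixed lead time.
    obs, fcst: arrays of shape (N, 2), columns = (RMM1, RMM2)."""
    # COR: normalized sum of componentwise products a1*b1 + a2*b2
    num = np.sum(obs[:, 0] * fcst[:, 0] + obs[:, 1] * fcst[:, 1])
    den = np.sqrt(np.sum(obs**2)) * np.sqrt(np.sum(fcst**2))
    cor = num / den
    # RMSE: root of the mean squared bivariate distance
    rmse = np.sqrt(np.mean(np.sum((obs - fcst)**2, axis=1)))
    return cor, rmse
```

A perfect forecast gives COR = 1 and RMSE = 0; skill is conventionally declared lost when COR falls below 0.5 or RMSE exceeds 1.4.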
Key Findings
- Overall skill: Averaged over all seasons and initial amplitudes >1, the FFNN and AR-RNN perform similarly. Using COR = 0.5 as the threshold, both achieve prediction skill of about 26–27 days, comparable to most dynamical models (except ECMWF). RMSE never exceeds 1.4 up to 60 days, implying RMSE-based skill extends beyond 60 days.
- Amplitude vs. phase: Both ANNs predict phase well but systematically underestimate amplitude; the absolute amplitude error increases with lead time.
- Seasonal skill (FFNN): COR-based prediction skill varies strongly by season: MAM ~23–24 days; SON ~16–17 days; JJA ~31 days; DJF ~45 days. DJF shows the largest RMSE (good correlation but larger magnitude differences), whereas JJA shows low RMSE (more accurate amplitudes despite lower COR than DJF). SON has larger RMSE than MAM.
- Phase propagation error: In JJA the predicted propagation is faster than observed; in DJF, MAM, and SON it is slower.
- Dependence on initial phase and season (FFNN): In DJF, starting in phases 1, 2, 5, and 8 yields very high COR-based skill (up to ≥60 days), but phase 7 drops below 20 days; combining COR and RMSE suggests ~60-day skill for phases 1 and 2. In SON, phases 4 and 7 reach ~50 days, while other phases are <20 days. In MAM and JJA, skill is more uniform across phases, with maxima around ~40 days and minima <20 days for certain phases (MAM: 1, 3, 8; JJA: 1, 5, 8). Across seasons, initial phase 1 gives very high skill in DJF but low skill in the other seasons; phase 2 shows >35-day skill from December to May; phase 4 shows >40 days in the transition seasons; phase 6 is high from March to August; phase 7 is >40 days from June to November; phase 8 is consistently <20 days.
- RMSE thresholds by season: For MAM and JJA, RMSE does not cross 1.4 up to 100 days.
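The amplitude and phase quantities in these findings come from the polar decomposition of (RMM1, RMM2) noted in the Methodology. A minimal sketch of that transform (mapping the continuous angle onto the eight discrete Wheeler–Hendon phase sectors is omitted here):

```python
import numpy as np

def to_amplitude_phase(rmm1, rmm2):
    """Transform RMM components to MJO amplitude and phase angle.
    A = sqrt(RMM1^2 + RMM2^2); phi = atan2(RMM2, RMM1), in radians."""
    amplitude = np.hypot(rmm1, rmm2)
    phase_angle = np.arctan2(rmm2, rmm1)
    return amplitude, phase_angle
```

The reported amplitude underestimation corresponds to forecast A falling short of observed A, while phase errors measure angular offsets in φ; an amplitude above 1 is the usual criterion for an active MJO event.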
Discussion
The study demonstrates that relatively simple and computationally efficient neural networks (FFNN and GRU-based AR-RNN) can achieve MJO forecast skill comparable to most state-of-the-art dynamical models, with mean COR-based skill near 26–27 days and substantially longer skill when evaluated by RMSE. The models are particularly adept at predicting MJO phase but tend to underestimate amplitude, with season-dependent propagation biases (faster in JJA, slower in other seasons). Seasonal and initial-phase dependencies are pronounced: DJF and certain starting phases (notably 1 and 2) allow extended skill windows approaching 60 days, whereas SON and specific phases (e.g., 8) remain challenging. The contrast between COR and RMSE across seasons (e.g., high COR but larger RMSE in DJF, low RMSE in JJA) highlights differences between directional coherence and amplitude accuracy. These findings indicate ML can provide competitive sub-seasonal MJO predictions at lower computational cost and can complement dynamical models, while underscoring the importance of season and initial phase in predictability.
Conclusion
Two neural-network approaches (FFNN and GRU-based AR-RNN) trained on the RMM index provide competitive MJO forecasts: mean COR-based skill of 26–27 days, with extended skill for specific seasons and initial phases (up to ~60 days), accurate phase predictions, and an underestimation of amplitude. The approach is simple and fast, suggesting potential for operational use or as a complement to dynamical models. Future work could explore deeper or more complex architectures, alternative training and validation strategies (e.g., walk-forward validation, multiple train-test splits), incorporation of additional predictors or alternative MJO indices (OMI, OOMI, ROMI, FMO), and targeted methods to correct amplitude biases and season/phase-dependent errors.
Limitations
- Amplitude underestimation is systematic and grows with lead time; DJF shows larger RMSE despite high COR.
- Skill depends strongly on season and initial phase, with notably poor performance for some phases (e.g., phase 8) and seasons (SON).
- The networks are intentionally simple (a single hidden/GRU layer, short training with early stopping) and may not capture all complexities.
- Validation uses a single temporal split; more robust schemes (e.g., multiple splits or walk-forward validation) were not implemented due to computational cost.
- The study forecasts the RMM index rather than raw fields; inputs are limited to the RMM historical series, which may constrain achievable skill.