Exploring multiyear-to-decadal North Atlantic sea level predictability and prediction using machine learning

Earth Sciences

Q. Gu, L. Zhang, et al.

Explore how coastal communities are battling the challenges of sea level rise and variability in the North Atlantic. This cutting-edge research conducted by Qinxue Gu, Liping Zhang, Liwei Jia, Thomas L. Delworth, Xiaosong Yang, Fanrong Zeng, William F. Cooke, and Shouwei Li employs a self-organizing map framework to uncover long-term climate predictability insights, showcasing the power of machine learning in addressing climate change.

Introduction
Sea level rise (SLR) poses significant risks to coastal communities through flooding, erosion, and saltwater intrusion, motivating a need for accurate predictions of coastal sea levels. Coastal sea level varies on timescales from hourly to centennial, with seasonal-to-multidecadal variability linked to climate modes (e.g., ENSO) and large-scale ocean dynamics, notably the Atlantic Meridional Overturning Circulation (AMOC). The U.S. East Coast is a hotspot for accelerated SLR, both at present and in projections, with sterodynamic sea level in the North Atlantic expected to rise rapidly in a warmer climate, especially north of Cape Hatteras, largely related to a weakened AMOC and associated current deceleration. Along the U.S. Southeast and Gulf coasts, rapid acceleration since 2010 suggests additional mechanisms beyond the AMOC-related Northeast response, including Gulf Stream strength/position changes, Florida Current warming, large-scale heat divergence linked to AMOC and low-frequency NAO, lagged responses to AMOC slowdowns, cumulative NAO/ENSO effects, and wind-forced Rossby waves with river discharge and coastal wind contributions. However, limited tide-gauge spatial coverage and short satellite records hinder characterization of decadal variability and predictability and understanding of mechanisms. To address this, the study applies an unsupervised machine learning method, self-organizing maps (SOMs), to very long preindustrial control simulations to classify North Atlantic sea level anomaly (SLA) patterns, assess multiyear-to-decadal predictability via pattern transitions, and perform predictions using a SOM-based model-analog approach, comparing skill to initialized decadal hindcasts.
Literature Review
Prior studies identify the North Atlantic and U.S. East Coast as regions of enhanced SLR risk, with mechanisms including steric expansion, mass redistribution, and circulation changes tied to a weakened AMOC, which produce sharp sea level gradients across the Gulf Stream and enhanced Northeast U.S. SLR. The more recent acceleration along the U.S. Southeast and Gulf coasts is attributed to multiple mechanisms on which there is not yet consensus: variations in Gulf Stream strength/position, Florida Current warming, large-scale ocean heat divergence linked to AMOC and low-frequency NAO, lagged responses to the 2009–2010 AMOC slowdown, combined NAO/ENSO effects, and tropical North Atlantic wind-forced Rossby waves with river discharge and coastal wind influences. Earlier work has shown predictable components of North Atlantic SLA related to AMOC phases and identified AMOC fingerprints in sea level along the U.S. East Coast. The literature also documents applications of SOMs in climate science for circulation classification, teleconnections, variability diagnostics, and prediction, as well as model-analog methods for climate prediction in other basins, motivating their use here for North Atlantic SLA and decadal predictability.
Methodology
Models and datasets: Two 2500-year preindustrial control (piControl) simulations (years 201–2700, after omitting a 200-year spin-up) from the GFDL SPEAR coupled model are used: SPEAR_LO (nominal 1° ocean/ice MOM6; AM4.0-LM4.0 atmosphere/land at 1°) and SPEAR_MED (same ocean but 0.5° atmosphere/land). All model data are linearly detrended to remove drift. A 30-member SPEAR large ensemble (SPEAR_MED, historical 1921–2014 plus SSP5–8.5 2015–2100) is used to assess external forcing effects on short-timescale variability. An observationally constrained SPEAR reanalysis (1961–2020; atmosphere nudged to JRA55 and SSTs restored to ERSSTv5) provides a benchmark for patterns and for initialized decadal hindcasts (20-member, yearly starts 1961–2020, 10-year integrations with time-varying forcings). Satellite dynamic sea level anomalies (1993–2021; 0.25° Copernicus altimetry) and tide-gauge records (PSMSL; 1956–2023, detrended and 11-year running-mean low-pass filtered; 1961–2018 used) are also employed.
Self-organizing maps (SOM): SOMs classify annual mean North Atlantic SLA patterns (0–65°N, 100°W–0°). Three 3×4 SOMs are trained: (1) unfiltered SLA; (2) 15-year high-pass Butterworth filtered SLA; (3) 15-year low-pass filtered SLA. For the filtered SOMs, the first and last 15 years of each simulation are removed, yielding 4940 years. Training uses two passes with parameters: N=50 presentations per vector per training; Epanechnikov neighborhood function with decreasing radius (σ) and inverse learning rate α(n)=α/(1+100n), with σ=5, α=0.1 on the first pass and σ=2, α=0.01 on the second. SOM nodes approximate the input distribution while preserving topology (adjacent nodes are similar). After training, each annual SLA field is assigned to its best matching unit (BMU) by minimum Euclidean distance. Composites of SLA, SST, and AMOC overturning streamfunction are formed per node. Robustness to SOM size was tested; the 3×4 map captured dominant and transition patterns without over-merging.
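The BMU assignment and per-node compositing described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the study's code: the node patterns, the three-point "grid", and the function names `best_matching_unit` and `node_composite` are all hypothetical, and the SOM is assumed to be already trained.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long, flattened SLA fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matching_unit(field, nodes):
    """Return the key of the SOM node whose pattern is closest to `field`."""
    return min(nodes, key=lambda key: euclidean(field, nodes[key]))

def node_composite(fields, bmus, node):
    """Average all fields assigned to `node`; None if the node is empty."""
    members = [f for f, b in zip(fields, bmus) if b == node]
    if not members:
        return None
    return [sum(values) / len(members) for values in zip(*members)]

# Toy stand-ins (hypothetical values): three trained node patterns keyed by
# (row, column), and four "annual mean" fields on a three-point grid.
nodes = {
    (1, 1): [1.0, 0.0, -1.0],
    (1, 2): [0.0, 0.0, 0.0],
    (3, 1): [-1.0, 0.0, 1.0],
}
annual_fields = [
    [0.9, 0.1, -0.8],
    [-1.2, 0.0, 0.9],
    [0.1, -0.1, 0.0],
    [1.1, -0.1, -1.0],
]

# One BMU label per year; this label series is what the transition analysis uses.
bmus = [best_matching_unit(f, nodes) for f in annual_fields]
composite_11 = node_composite(annual_fields, bmus, (1, 1))
```

The same compositing step would apply to any co-located variable (for example SST), using the SLA-derived BMU labels to select the years.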
Transition probability framework: For each SOM, the BMU time series (12 discrete states) is used to compute lagged conditional transition probabilities P_n=Pr{x_{t+n}=j | x_t=i} for lags n of 1–30 years, producing one 12×12 transition table per lag. Significance is assessed via Monte Carlo tests against unconditional node frequencies (100,000 resamples with sample sizes equal to the average count per node: 208 for unfiltered, 206 for high-pass, 73 for large-ensemble composites). Probabilities above the 97.5th percentile are flagged as significantly high (warm), and those below the 2.5th percentile as significantly low (cold). Global significance (predictability) per initial node and lag is determined by field significance: if at least 3 of the 12 transitions from a node are locally significant at α_local=0.05, the node is globally significant at α_global=0.05, indicating theoretical predictability at that lag.
SOM-based model-analog predictions: For each node in the unfiltered-SLA SOM, each reanalysis year mapped to that node (by minimum Euclidean distance) is predicted by compositing the 10 most similar piControl analog SLAs within the same node and their subsequent evolutions at specified leads (Years 1, 4, 7, 10). Skill is compared to 10-year SPEAR hindcasts initialized on January 1 of the corresponding years, after removing lead-dependent climatology and linear trends. Pattern skill is measured with the area-weighted Anomaly Correlation Coefficient (ACC) over the North Atlantic. For coastal evaluation, tide-gauge series at selected Northeast U.S. stations are matched to the closest model grid points, detrended, low-pass filtered (11-year running mean), assigned to SOM nodes, and predicted using all piControl analogs within the matched node. The significance of correlation skill accounts for autocorrelation via an effective sample size.
Observational mapping and forcing assessment: Satellite dynamic sea level anomalies are mapped to the high-pass SOM to assess resemblance. To test external forcing impacts on short-timescale SLA, 1993–2021 large-ensemble SLA patterns (with and without the ensemble mean) are high-pass filtered and mapped to the high-pass SOM, and their transition probabilities and persistence are analyzed.
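As a rough illustration of the transition probability framework described above, the sketch below estimates a lagged transition table from a BMU label series and compares one row against a Monte Carlo null built from unconditional node frequencies. It is a toy under stated assumptions: the four-state cyclic series, the function names, and the 2,000 resamples are hypothetical (the study uses 12 nodes and 100,000 resamples), and the "at least 3 significant transitions" threshold is simply carried over from the text.

```python
import random

def transition_table(states, n_states, lag):
    """table[i][j] estimates Pr{x_(t+lag) = j | x_t = i} from counts."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(states) - lag):
        counts[states[t]][states[t + lag]] += 1
    table = []
    for row in counts:
        total = sum(row)
        table.append([c / total if total else 0.0 for c in row])
    return table

def null_bounds(states, n_states, sample_size, n_resamples=2000, seed=0):
    """2.5th and 97.5th percentiles of each node's frequency when
    `sample_size` years are drawn at random from the whole series,
    i.e. with no dependence on the initial node."""
    rng = random.Random(seed)
    freqs = [[] for _ in range(n_states)]
    for _ in range(n_resamples):
        draw = rng.choices(states, k=sample_size)
        for j in range(n_states):
            freqs[j].append(draw.count(j) / sample_size)
    bounds = []
    for f in freqs:
        f.sort()
        low = f[round(0.025 * n_resamples)]
        high = f[round(0.975 * n_resamples) - 1]
        bounds.append((low, high))
    return bounds

def locally_significant(row, bounds):
    """Indices of destination nodes whose probability lies outside the null range."""
    return [j for j, p in enumerate(row) if p < bounds[j][0] or p > bounds[j][1]]

# Hypothetical BMU series: a strictly cyclic progression through 4 states.
states = [t % 4 for t in range(400)]
table = transition_table(states, n_states=4, lag=1)
bounds = null_bounds(states, n_states=4, sample_size=100)

significant = locally_significant(table[0], bounds)
# Field-significance rule from the text: a node counts as globally
# significant when at least 3 of its transitions are locally significant.
globally_significant = len(significant) >= 3
```

Repeating the last three steps for every initial node and every lag would give the lag at which each node stops being globally significant, which is how the text defines theoretical predictability.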
Key Findings
- SOM-derived dominant North Atlantic SLA patterns from two 2500-year piControl simulations are robust across SPEAR_LO and SPEAR_MED and closely linked to AMOC phases. Nodes [3,1] and [1,3] correspond to mature negative and positive AMOC phases, respectively, with coherent SLA fingerprints along the U.S. Northeast Coast; nodes [1,1] and [3,4] emphasize U.S. Southeast Coast loadings and represent AMOC transition states.
- Transition probability analysis shows strong 1–3 year pattern persistence and preferential anticlockwise transitions among edge nodes across the SOM for up to ~15–20 years, tied to low-frequency buoyancy-driven AMOC evolution. Theoretical predictability (global significance threshold) averages about 16 years in SPEAR_LO and 21 years in SPEAR_MED. AMOC spectra peak at 25–40 years, with larger amplitude in SPEAR_MED, consistent with its higher predictability.
- Low-pass (15-year) SOM patterns resemble unfiltered patterns, indicating that low-frequency variability dominates total SLA variance in mid-to-high latitudes (Gulf Stream/North Atlantic Current regions), whereas high-frequency variance concentrates in the tropics.
- High-pass (15-year) SOM patterns exhibit basin-scale tripole-like structures consistent with NAO-like wind-driven thermosteric variability. Predictability features: (i) 1-year persistence likely from ocean memory and transient-eddy feedback; (ii) reemergence at 3–4 year lags likely from slow gyre adjustment (e.g., Rossby waves); (iii) overall predictability lasts 4–5 years, much shorter than unfiltered patterns.
- Observations (1993–2021) mapped to the high-pass SOM show similar large-scale features but with larger magnitudes and finer structures along Western Boundary Current regions. Notably, node [1,1] occurs 27.6% of the time (8/29 years), with persistent classification during 2015–2021, indicating accelerated U.S. Southeast coastal sea level. This persistence is not reproduced by high-pass piControl or large-ensemble patterns (persistence after 5 years: piControl 2%/1% in LO/MED; large ensemble 4% with ensemble mean, 0% without), suggesting limited direct external forcing influence on short-timescale variability and implicating longer-timescale (AMOC-related) mechanisms in observations.
- SOM-based model-analog predictions versus initialized hindcasts: Hindcasts generally outperform analogs at early leads due to initialization, but analogs can capture certain evolutions better for mature AMOC initial states (e.g., node [3,1] shows clearer Year-7 emergence of a narrow positive SLA band toward the U.S. Southeast in analogs). For node [3,4] (transition AMOC), analog skill is weaker than hindcasts after Year 7. Across nodes, hindcast skill exceeds analogs at early leads, but analogs maintain useful skill at longer leads for some nodes; nodes [1,2], [1,3], [2,1], [2,2], [3,1] retain ACC > 0.5 at 10-year lead.
- Coastal validation along the U.S. Northeast: SOM-based analogs show significantly positive correlations with low-pass filtered tide-gauge records at several stations (e.g., Eastport, Bar Harbor, Portland, Boston, Woods Hole) during lead years 1–5 to 1–8; hindcasts maintain correlations >0.5 across all leads. Analog skill declines at longer leads, consistent with model AMOC variability timescales (25–40 years) shorter than observed/hindcast timescales.
- Overall, machine learning (SOM) reveals multiyear-to-decadal predictability sources, quantifies their duration, and, combined with model-analogs, provides cost-effective long-lead predictions comparable in some cases to initialized hindcasts and useful for early guidance.
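The analog-versus-hindcast comparison above rests on two operations from the Methodology: compositing the later evolution of the closest within-node analogs, and scoring a predicted pattern with an area-weighted ACC. The sketch below shows one way these could look. It is illustrative only: the library values, labels, latitudes, and function names are hypothetical, fields are assumed to be anomalies already (climatology and trend removed), and the ACC here is assumed to be the centered form with cos(latitude) weights, which the source does not specify.

```python
import math

def analog_forecast(library, labels, target, node, lead, k=10):
    """Average the states `lead` steps after the k library fields that
    belong to `node` and are closest (Euclidean) to `target`."""
    candidates = [
        t for t, label in enumerate(labels)
        if label == node and t + lead < len(library)
    ]
    if not candidates:
        raise ValueError("no analogs available for this node and lead")
    candidates.sort(key=lambda t: math.dist(target, library[t]))
    evolved = [library[t + lead] for t in candidates[:k]]
    return [sum(values) / len(evolved) for values in zip(*evolved)]

def area_weighted_acc(forecast, observed, lats):
    """Centered anomaly correlation with cos(latitude) weights
    (one latitude, in degrees, per grid point)."""
    w = [math.cos(math.radians(lat)) for lat in lats]
    wsum = sum(w)
    f_mean = sum(wi * f for wi, f in zip(w, forecast)) / wsum
    o_mean = sum(wi * o for wi, o in zip(w, observed)) / wsum
    cov = sum(wi * (f - f_mean) * (o - o_mean)
              for wi, f, o in zip(w, forecast, observed))
    f_var = sum(wi * (f - f_mean) ** 2 for wi, f in zip(w, forecast))
    o_var = sum(wi * (o - o_mean) ** 2 for wi, o in zip(w, observed))
    return cov / math.sqrt(f_var * o_var)

# Hypothetical control-run library in time order, with one node label per year.
library = [
    [1.0, 0.0, -1.0], [0.0, 1.0, 0.0],
    [1.2, 0.0, -1.0], [0.0, 0.8, 0.2],
    [0.5, 0.0, -0.5], [0.3, 0.3, 0.3],
]
labels = [0, 1, 0, 1, 0, 1]
lats = [20.0, 40.0, 60.0]

target = [1.1, 0.0, -1.0]    # field to be predicted from, assigned to node 0
verification = [0.0, 1.0, 0.0]  # what actually followed one step later

forecast = analog_forecast(library, labels, target, node=0, lead=1, k=2)
skill = area_weighted_acc(forecast, verification, lats)
```

Restricting the search to the target's own node is what distinguishes this from a plain nearest-neighbor analog: the SOM label acts as a coarse filter on large-scale state before distances are ranked.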
Discussion
The study addresses the challenge of predicting multiyear-to-decadal North Atlantic sea level by identifying and tracking recurrent large-scale SLA patterns and their preferred transitions, directly linking predictability to AMOC phase persistence and evolution. The SOM framework captures the continuum of internal variability states and reveals directional transitions (anticlockwise) indicative of dynamical progression through AMOC phases, producing 15–20 years of theoretical predictability in unfiltered SLA. When low-frequency AMOC signals are filtered out, short-timescale predictability remains but is limited to 4–5 years and exhibits a documented reemergence at 3–4 years, consistent with ocean memory and gyre adjustment dynamics. Observational mapping indicates that while short-timescale wind-driven features dominate the relatively short satellite record, persistent features (e.g., node [1,1] since 2015) likely involve longer-timescale AMOC-related mechanisms not fully isolated by high-pass filtering, emphasizing the need to consider both low- and high-frequency processes for observed predictability. In prediction applications, initialized hindcasts offer superior early-lead skill via initialization, whereas SOM-based model-analogs can provide skillful, computationally inexpensive long-lead guidance for specific initial states (notably mature AMOC phases), and outperform hindcasts in the first several lead years at some coastal tide-gauge sites when leveraging large-scale pattern information. These results underscore the practical value of ML-based pattern frameworks for diagnosing predictability sources, informing hybrid prediction strategies that integrate dynamical hindcasts with analog-based guidance, and supporting coastal risk planning.
Conclusion
This work develops and applies a SOM-based framework to (1) classify dominant North Atlantic SLA patterns in long coupled piControl simulations, (2) quantify multiyear-to-decadal predictability through lagged transition probabilities, and (3) perform decadal predictions using a SOM-based model-analog method benchmarked against initialized hindcasts and observations. Key contributions include documenting 15–20 years of theoretical predictability for unfiltered SLA linked to buoyancy-driven AMOC phases, identifying additional short-timescale predictability (1-year persistence and 3–4 year reemergence) after removing low-frequency signals, and demonstrating that model-analogs can yield useful, cost-effective long-lead predictions and coastal skill for certain initial states. Observational analyses suggest that both low- and high-frequency processes shape recent variability, including persistent elevated U.S. Southeast coastal sea level since 2015. Future work should: (i) train SOMs on models with AMOC variability timescales closer to observations to extend skill, (ii) increase ocean resolution and process realism, (iii) incorporate external processes affecting coastal sea level (glacial isostatic adjustment, vertical land motion, land ice melt, gravitational/rotational/deformational effects), and (iv) explore proxy data and observing strategies to better identify AMOC states conducive to higher predictability.
Limitations
- Observational constraints: Short satellite altimetry record limits direct characterization of low-frequency AMOC signals; tide gauges have sparse spatial coverage and are influenced by local processes.
- Model dependence: Results rely on two SPEAR piControl simulations; AMOC amplitude/timescale biases (e.g., 25–40-year peaks shorter than observed) likely limit long-lead analog skill and pattern evolution realism. Atmospheric and ocean resolution differences may affect predictability; higher ocean resolution was not tested here.
- External processes not represented: Many coastal processes (glacial isostatic adjustment, vertical land motion, land ice mass loss, gravitational/rotational/deformational effects) are not included, potentially affecting applicability to absolute coastal sea level.
- Methodological choices: SOM size and 15-year filter cutoff may influence pattern separation and predictability metrics; transition probability significance assumes independence in local tests for global significance thresholding. Model-analog selection (top-10 within-node, Euclidean distance) may not be optimal in all states.
- Forcing impacts: While analyses suggest limited direct external forcing role on short-timescale variability, model biases and differing sensitivities prevent ruling out forced contributions to observed persistence (e.g., node [1,1]).