Earth Sciences
Machine learning-based tsunami inundation prediction derived from offshore observations
I. E. Mulia, N. Ueda, et al.
This study by Iyan E. Mulia, Naonori Ueda, Takemasa Miyoshi, Aditya Riadi Gusman, and Kenji Satake presents a real-time tsunami inundation prediction method that applies machine learning to offshore observations from S-net, the dense cabled observatory off northeastern Japan. By cutting real-time computational cost by roughly 99%, the model lengthens forecast lead time while avoiding the uncertainties of tsunami source estimation.
~3 min • Beginner • English
Introduction
The study addresses the need for rapid and accurate near-field tsunami inundation predictions to support timely evacuations and emergency response. Motivated by the 2011 Tohoku-oki tsunami and leveraging Japan’s S-net, a dense cabled seafloor observatory of ocean-bottom pressure sensors and seismometers, the authors aim to predict spatial distributions of tsunami flow depths directly from offshore observations without relying on uncertain and time-consuming source inversions. Traditional physics-based inundation modeling solves nonlinear shallow water equations with high computational demand and depends on source estimate quality, limiting real-time applicability. The research proposes a machine learning approach that maps offshore tsunami signals observed at 150 S-net stations to high-resolution inundation maps for seven cities along the southern Sanriku coast, seeking comparable accuracy to physics-based models with drastically reduced computation time and increased forecast lead time.
Literature Review
Prior machine learning efforts for tsunami inundation forecasting (e.g., Fauzi and Mizutani, 2020; Mulia et al., 2020) transformed linear shallow water simulations into nonlinear inundation maps but still required source estimation and linear simulations due to limited offshore observations. More recent studies (Makinoshima et al., 2021; Liu et al., 2021) used offshore data to predict water level time series at limited sites, offering detailed temporal variation but becoming computationally heavy when extended to many locations. Conventional real-time systems (e.g., NOAA) are optimized for far-field events with site-specific inundation models, data assimilation, or source inversion approaches. The availability of S-net enables direct offshore observation-based forecasting to bypass source-estimate uncertainties and accelerate predictions. The work builds upon stochastic slip modeling, tsunami Green’s functions, and nested-grid physics-based simulators (e.g., JAGURS) used for training data generation and evaluation, while addressing gaps in near-field, wide-area, high-resolution real-time inundation prediction.
Methodology
- Study area and targets: Seven cities along the southern Sanriku coast (Sanriku, Ofunato, Rikuzentakata, Kesennuma, Motoyoshi, Minamisanriku, Oppa). Output domain at 30 m grid resolution comprising 214,356 inundation grid points (flattened vector).
- Data generation: 3093 training scenarios consisting of 3060 megathrust events (Mw 8.0–9.1 in 0.1 increments) and 33 outer-rise events. Test set: 480 hypothetical megathrust scenarios (40 per magnitude for 12 magnitudes). Real-event applications: 2011 Tohoku-oki (Mw 9.0), 1896 Meiji Sanriku (Mw 8.1), and 1933 Showa Sanriku (Mw 8.5) using independently estimated sources from prior studies.
- Megathrust sources: Plate interface discretized into 240 curvilinear patches guided by SLAB2; stochastic heterogeneous slips generated via a spatial random field with von Karman autocorrelation and a moment-preserving mapping to the patches; coseismic displacements via triangular dislocation in a half-space; instantaneous deformation and long-wave approximation to initiate tsunamis. A spectral-synthesis sketch of such a random slip field is given after this list.
- Outer-rise sources: 33 identified normal faults from prior seismic studies; uniform slip assumption (heterogeneity contributes only ~10–15% to variability); coseismic displacements computed analogously.
- Physics-based simulations: JAGURS code; four-level nested grids (30, 10, 3.33, 1.11 arcsec), 3-hour runs, 0.5 s timestep, Manning n=0.025 s/m^{1/3}; moving boundary for inundation; crustal deformation included; bathymetry/topography from GEBCO_2020, JHA M7005, and JAXA AW3D30.
- Green’s functions and input variability: Green’s functions built for 240 subfaults; linear superposition used to synthesize bottom-pressure waveforms at the stations for many scenarios efficiently (see the superposition and input-extraction sketch after this list); for variability analysis, 500 stochastic scenarios per magnitude; inputs taken as maximum amplitudes at 150 S-net stations within 3–20 min post-earthquake (first 3 min omitted to avoid seismic contamination). Coefficient of variation analysis identified the required scenario counts by magnitude (e.g., ~310 for Mw 8.0 and ~200 for Mw 9.1; linearly interpolated for others), leading to the ~3060 megathrust scenarios.
- Neural network model: Fully connected feedforward network with 150 input neurons (one per S-net station), two hidden layers of 150 neurons each, ReLU activations, 20% dropout in the hidden layers, and a ReLU output activation to enforce non-negativity. The output layer has 214,356 neurons corresponding to maximum inundation flow depths on the target grid points; for smaller events, out-of-extent points are set to zero. Training uses the Adam optimizer (He normal initialization), batch size 20, mean squared error loss, 2000 epochs, implemented in TensorFlow. The final model is selected by minimum error on the training and test sets. A Keras-style sketch of this architecture follows the list.
- Prediction windows and inputs: Predictions updated with available data using windows of 10, 15, and 20 minutes after origin time; inputs considered as maximum or mean amplitude across the window; main analyses emphasize maximum amplitudes at 20 minutes.
- Evaluation metrics: Misfit (O_i - S_i) and relative misfit; Aida’s numbers (geometric mean ratio K and its geometric standard deviation κ) with acceptance criteria approximately 0.8 < K < 1.2 and κ < 1.4; accuracy (%) derived from K; goodness-of-fit statistic G (0 to 1; lower is better) with weights emphasizing higher flow depths; evaluations generally at locations with observed flow depths ≥ 0.2 m, and stratified by depth ranges (0.2–1 m, 1–3 m, >3 m) for some analyses. The computation of K and κ is sketched after this list.
- Robustness to station failures: Synthetic experiments randomly zeroed inputs at 10–140 stations (100 random draws each) and by failing single S-net segments (1–6) to assess accuracy degradation.
- Uncertainty analysis: Input perturbations sampled from Gaussian noise with standard deviations 0.01 m and 0.1 m applied to the 2011 event inputs; 100 ensemble predictions produced to quantify output variability. Both this and the station-failure experiment above are illustrated in the ensemble sketch after this list.
- Computation time: Training ~65 minutes on NVIDIA GeForce RTX 2070 GPU. Real-time inference benchmarked on CPU (Intel i7-10875H, 16 GB) versus physics-based inundation simulation: ~0.05 s for ML vs ~30 minutes for physics-based (no parallelization).
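The stochastic slip generation can be illustrated with a standard spectral-synthesis sketch: a 2D Gaussian random field whose power spectrum follows a von Karman form is filtered in the Fourier domain and rescaled to a target mean slip. The grid size, correlation lengths, Hurst exponent, and scaling below are illustrative assumptions, not the paper's values; the moment-preserving mapping onto the 240 curvilinear patches is not reproduced here.

```python
import numpy as np

def von_karman_slip(nx=64, nz=32, dx=5.0, ax=40.0, az=20.0, hurst=0.75,
                    mean_slip=5.0, seed=0):
    """Sketch: 2D random slip field with an approximate von Karman spectrum.

    nx, nz    grid points along strike / dip (illustrative)
    dx        grid spacing (km)
    ax, az    correlation lengths along strike / dip (km)
    hurst     Hurst exponent H
    mean_slip target mean slip (m), used to rescale the field
    """
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(nx, d=dx) * 2 * np.pi          # angular wavenumbers (rad/km)
    kz = np.fft.fftfreq(nz, d=dx) * 2 * np.pi
    KX, KZ = np.meshgrid(kx, kz, indexing="ij")
    # von Karman-type power spectral density ~ 1 / (1 + (k*a)^2)^(H+1)
    k2 = (KX * ax) ** 2 + (KZ * az) ** 2
    psd = 1.0 / (1.0 + k2) ** (hurst + 1.0)
    # white Gaussian noise filtered in the Fourier domain
    noise = rng.normal(size=(nx, nz))
    field = np.real(np.fft.ifft2(np.fft.fft2(noise) * np.sqrt(psd)))
    # shift/scale to a non-negative slip distribution with the desired mean
    field -= field.min()
    field *= mean_slip / field.mean()
    return field  # slip (m) on the nx-by-nz grid

slip = von_karman_slip()
print(slip.shape, slip.mean())
```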
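The Green's function superposition and the input extraction can be sketched as follows: precomputed waveforms at each station for unit slip on each subfault are combined linearly with a scenario's slip vector, and the network input is the maximum amplitude per station in the 3–20 min window. The array names, sampling rate, and placeholder Green's functions are assumptions for illustration.

```python
import numpy as np

# Illustrative dimensions: 240 subfaults, 150 stations, 1 Hz sampling for 20 minutes.
n_subfaults, n_stations, n_samples, dt = 240, 150, 1200, 1.0

# greens[f, s, t]: pressure waveform at station s for unit slip on subfault f
greens = np.zeros((n_subfaults, n_stations, n_samples))   # precomputed offline
slip = np.ones(n_subfaults)                               # one scenario's slip (m)

# Linear superposition: synthetic waveform at every station for this scenario
waveforms = np.tensordot(slip, greens, axes=1)            # (n_stations, n_samples)

def station_features(waveforms, dt=1.0, t_start=180.0, t_end=1200.0):
    """Input vector: maximum amplitude per station in the 3-20 min window,
    omitting the first 3 minutes to avoid seismic contamination."""
    i0, i1 = int(t_start / dt), int(t_end / dt)
    return waveforms[:, i0:i1].max(axis=1)                # shape (n_stations,)

x = station_features(waveforms)
print(x.shape)   # (150,) -> one input sample for the neural network
```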
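The network architecture described above can be sketched in Keras roughly as follows; the layer sizes, dropout rate, initializer, optimizer, loss, batch size, and epoch count follow the summary, while the variable names and the commented training call are assumptions.

```python
import tensorflow as tf

N_STATIONS, N_GRID = 150, 214_356   # inputs (S-net stations) and output grid points

def build_model():
    """Sketch of the fully connected architecture described above."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(N_STATIONS,)),
        tf.keras.layers.Dense(150, activation="relu",
                              kernel_initializer="he_normal"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(150, activation="relu",
                              kernel_initializer="he_normal"),
        tf.keras.layers.Dropout(0.2),
        # ReLU on the output enforces non-negative flow depths
        tf.keras.layers.Dense(N_GRID, activation="relu",
                              kernel_initializer="he_normal"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
    return model

model = build_model()
# Hypothetical training call with the reported hyperparameters:
# model.fit(X_train, Y_train, batch_size=20, epochs=2000,
#           validation_data=(X_test, Y_test))
```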
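Aida's numbers compare observed flow depths O_i and predicted depths S_i through the log ratios K_i = O_i/S_i. A minimal computation sketch follows; the goodness-of-fit statistic G uses additional depth-dependent weights and is omitted, and the guard against zero predictions is an assumption, not part of the paper's definition.

```python
import numpy as np

def aida_numbers(obs, pred, threshold=0.2):
    """Aida's geometric mean ratio K and geometric spread kappa.

    log10 K     = mean of log10(O_i / S_i)
    log10 kappa = std  of log10(O_i / S_i)
    Only points with observed flow depth >= threshold (m) are used,
    following the evaluation setup described above.
    """
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    mask = (obs >= threshold) & (pred > 0)   # guard against division by zero
    logk = np.log10(obs[mask] / pred[mask])
    K = 10 ** logk.mean()
    kappa = 10 ** logk.std()
    return K, kappa

K, kappa = aida_numbers([1.2, 3.4, 0.5], [1.0, 3.0, 0.6])
print(K, kappa)   # acceptable fits: roughly 0.8 < K < 1.2 and kappa < 1.4
```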
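Both the station-failure and the observation-error experiments amount to perturbing the 150-element input vector and re-running inference. A minimal sketch, assuming a trained `model` and a baseline input `x` as in the snippets above; the clipping of perturbed inputs at zero is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_with_failed_stations(model, x, n_failed, n_draws=100):
    """Zero out n_failed randomly chosen stations, n_draws times."""
    preds = []
    for _ in range(n_draws):
        xi = x.copy()
        xi[rng.choice(x.size, size=n_failed, replace=False)] = 0.0
        preds.append(model.predict(xi[None, :], verbose=0)[0])
    return np.array(preds)                    # (n_draws, n_grid_points)

def prediction_ensemble(model, x, sigma=0.01, n_members=100):
    """Gaussian input perturbations (sigma in metres) -> output spread."""
    noise = rng.normal(scale=sigma, size=(n_members, x.size))
    xp = np.clip(x + noise, 0.0, None)        # keep amplitudes non-negative (assumption)
    preds = model.predict(xp, verbose=0)
    return preds.mean(axis=0), preds.std(axis=0)   # per-grid-point mean and std
```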
Key Findings
- Test set performance (20-min window, maximum amplitudes): Aida’s numbers K=1.02, κ=1.47; accuracy 98.0%. For 10- and 15-min windows (maximum amplitudes): accuracies 92.6% and 93.5% with K=1.08 (κ=1.49) and 1.07 (κ=1.50), respectively. Using mean amplitudes yields comparable but generally lower accuracy, especially at 10 min (82.0%).
- Misfit distribution is approximately normal with median near zero, indicating unbiased predictions; goodness-of-fit G medians between 0.004 and 0.086 across magnitudes; fits are worse for smaller magnitudes due to higher variability.
- Spatial variability: Higher misfits near complex coastlines and steep cliffs (e.g., Ryori Bay) with standard deviation of misfits exceeding 1.5 m locally, but relative misfits remain small where expected depths are high (~10 m), reflecting local resonance effects.
- Robustness to station failures: With 10 malfunctioning stations, median accuracy remains near the baseline (~98%) but extreme outliers can drop to 70.7%, indicating susceptibility to inconsistent inputs. Even with 140 failed stations, some combinations still achieve ~84.8% accuracy. Failure of a single S-net segment generally retains ≥84% accuracy, except segment 3, which degrades to 66%, indicating its criticality for this region.
- Real events:
  • 2011 Tohoku-oki (Mw 9.0): ML vs observations K(κ)=0.99(1.39), comparable to physics-based 0.96(1.40). Spatially, ML underestimates by ~2–3 m in parts of Rikuzentakata and Minamisanriku, but larger depths (>1 m) are predicted more accurately.
  • 1896 Meiji Sanriku (Mw 8.1): ML K(κ)=0.96(1.41) vs physics 1.03(1.39). Overall accuracy lower than 2011, likely due to simplified source geometry and less reliable historical data.
  • 1933 Showa Sanriku (Mw 8.5, outer-rise): ML K(κ)=0.92(1.81) vs physics 0.95(1.59); performance limited by few outer-rise scenarios in training and simple uniform slip source for reference model.
- Uncertainty due to observation errors: Input perturbation σ=0.01 m yields output standard deviation up to ~0.2 m; σ=0.1 m yields up to ~1.5 m, evidencing nonlinear input-output sensitivity.
- Computational efficiency: Real-time inundation prediction obtained in ~0.05 s on CPU vs ~30 minutes for physics-based, representing ~99% computational cost reduction. Total operational latency primarily equals the chosen data window (~20 min), with inference time negligible.
Discussion
The proposed machine learning model directly maps offshore tsunami signals to high-resolution inundation flow depths across multiple cities, addressing the severe time constraints of near-field tsunami forecasting, where fast-arriving waves leave little time for conventional source inversion and simulation. Results show that within 20 minutes of data, the model provides near-physics-based accuracy with vastly reduced computation, enabling rapid updates as new data arrive and facilitating ensemble-based uncertainty quantification. Performance improves for larger tsunamis, aligning with early warning priorities. Spatial misfit patterns highlight challenges in complex coastal geometries where nonlinear local effects (e.g., bay resonance) can introduce higher residuals, though relative errors may remain small in high-depth regions. Robustness tests suggest the system tolerates certain levels of sensor failure, but critical network segments (e.g., S-net segment 3) significantly influence accuracy, guiding maintenance and redundancy planning. Real-event applications confirm comparable accuracy to physics-based methods despite differences in source modeling frameworks, with discrepancies attributable to source model quality, geometry, and training coverage (especially for outer-rise events). The approach’s speed and scalability make it suitable for integration into operational systems, potentially combined with optimized station subsets, additional sensors (seismic/geodetic), and probabilistic frameworks to better quantify total predictive uncertainty.
Conclusion
This work demonstrates a real-time, offshore observation-driven machine learning surrogate for tsunami inundation prediction over wide coastal areas at 30-m resolution. Trained on 3093 simulated scenarios and validated on 480 unseen megathrust scenarios and three historical events, the model achieves accuracy comparable to physics-based simulations while reducing real-time computation by ~99%. Direct use of offshore data increases lead time and avoids uncertainties inherent in source inversion. The method scales to multiple cities without proportional increases in inference time and enables rapid ensemble-based uncertainty assessment. Future work should expand training datasets (especially outer-rise scenarios), incorporate additional observation types and arrival-time information, optimize sensor subsets, refine network architectures and hyperparameters, and integrate with probabilistic tsunami hazard frameworks to quantify comprehensive prediction uncertainty. Broader deployment can leverage cloud-based infrastructures and support operational early warning in regions with existing or emerging offshore observation networks.
Limitations
- Limited real near-field tsunami records from S-net constrain validation against actual events.
- Sensitivity to input inconsistencies: large fractions of malfunctioning stations can cause spurious predictions; certain network segments (e.g., S-net segment 3) are critical.
- Training scenario coverage: relatively few outer-rise scenarios reduce performance for such events; unprecedented events outside the training distribution may degrade accuracy.
- Spatially complex coastal regions (e.g., bays with resonance, steep coastlines) exhibit higher localized misfits due to strong nonlinearities.
- Uncertainty quantification is not exhaustive: model uncertainty from varying initial weights and architectures, as well as the full epistemic and aleatory components, is not fully captured.
- Region-specific reliance on dense offshore networks (S-net); transferability to sparse networks may require adapted inputs and model redesign.