Earth Sciences
A machine learning estimator trained on synthetic data for real-time earthquake ground-shaking predictions in Southern California
M. Monterrubio-Velasco, S. Callaghan, et al.
Uncover the future of earthquake impact assessments! This study reveals how machine learning strategies, developed by Marisol Monterrubio-Velasco and colleagues, can significantly enhance post-earthquake ground-shaking map estimations, delivering the speed of traditional empirical models with accuracy closer to physics-based simulations.
~3 min • Beginner • English
Introduction
The study addresses the need for rapid and accurate estimation of earthquake ground shaking immediately after large events. Traditional empirical Ground Motion Models (GMMs) provide fast estimates of intensity measures such as PGA or PSA but can be limited by sparse observational catalogs, regional differences, and large variability, especially for large-magnitude events. Physics-based numerical simulations can capture complex phenomena (e.g., directivity, topography, basin/site effects) but are too computationally expensive for near real-time alerts. The authors propose a machine learning approach trained on physics-based simulations that retains GMM-like evaluation speed while approaching the accuracy of the underlying simulations. The presented ML-based Estimator for ground-shaking maps (MLESmap) is trained on CyberShake Study 15.4 synthetic data for Southern California to predict RotD50 intensity measures from readily available early-event parameters (location and magnitude), aiming to improve real-time shaking maps.
Literature Review
Empirical GMMs (e.g., NGA-West2 models such as ASK-14) are widely used for rapid intensity estimation but are constrained by data sparsity, regional variability, and limited representation of large events. Recent work has applied machine learning to observed data to build ML-based GMMs, synthesize time histories, and predict damage; however, the availability and quality of observational data, especially for large magnitudes, can limit applicability, and ML models often extrapolate poorly. Withers et al. (2020) trained an ANN GMM on CyberShake Study 15.12 synthetics using NGA-West2-style predictor variables (e.g., rupture and site parameters such as Vs30, Z1.0, Z2.5), demonstrating promise for data-poor regions. The present work differs by using the large CSS-15.4 synthetic database and restricting predictors to elementary, rapidly available parameters, relying on the ML models to implicitly learn complex site and propagation effects.
Methodology
Data: The study uses the CyberShake Study 15.4 (CSS-15.4) database for Southern California, a large set of 3D physics-based simulations built on seismic reciprocity and Strain Green Tensors (SGTs). CSS-15.4 performs physics-based PSHA at 1 Hz using the CVM-S4.26-M01 velocity model, the GPU-accelerated AWP-ODC-SGT code, the GP-14 kinematic rupture generator with uniform hypocenters, and the UCERF2 earthquake rupture forecast; producing the database required about 37.6 million core hours. It contains intensity measures for 153,628 hypothetical scenarios recorded at 253 sites. Magnitudes cluster predominantly around Mw 7.6, with a lower cutoff at ~Mw 6.5, though rupture variability introduces some lower magnitudes.
Target and predictors: The target is log(RotD50) pseudo-spectral acceleration at four periods (T = 2, 3, 5, 10 s). Predictors comprise eight features: earthquake magnitude; hypocentral latitude, longitude, and depth; site latitude and longitude; and the Euclidean distance and azimuth from hypocenter to site.
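To illustrate how the eight predictors could be assembled for one event-site pair, here is a minimal sketch. The function name, argument names, and the flat-earth distance/azimuth convention are assumptions for illustration; the paper may compute these quantities differently.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def build_features(mag, hypo_lat, hypo_lon, hypo_depth_km, site_lat, site_lon):
    """Assemble MLESmap-style predictors for one event-site pair (hypothetical helper)."""
    # Approximate local Cartesian offsets (km) from hypocenter to site (flat-earth assumption).
    north_km = np.radians(site_lat - hypo_lat) * EARTH_RADIUS_KM
    east_km = np.radians(site_lon - hypo_lon) * EARTH_RADIUS_KM * np.cos(np.radians(hypo_lat))

    # Euclidean (hypocentral) distance including source depth, and azimuth clockwise from north.
    dist_km = np.sqrt(north_km**2 + east_km**2 + hypo_depth_km**2)
    azimuth_deg = np.degrees(np.arctan2(east_km, north_km)) % 360.0

    return np.array([mag, hypo_lat, hypo_lon, hypo_depth_km,
                     site_lat, site_lon, dist_km, azimuth_deg])
```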
Model training: Two supervised ML algorithms were used: Random Forest (RF) and Deep Neural Networks (DNN). Eight independent models (two algorithms × four periods) were trained. Data were split 90% for training and 10% for validation/testing. Considering the 253 stations and the four spectral periods, the database amounts to roughly 155 million intensity-measure values in total; for each per-period model, approximately 35 million event-site records were used for training and about 3.8 million for validation. Targets were log-transformed to improve performance.
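A minimal sketch of the per-period data preparation described above (90/10 split, log-transformed target), using scikit-learn for the split; the arrays are random placeholders and the logarithm base is an assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: (n_samples, 8) predictor matrix; y: RotD50 values for one spectral period.
# Random placeholders keep the sketch self-contained.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))
y = rng.lognormal(size=10_000)

y_log = np.log10(y)  # log-transform of the target (base assumed here)

X_train, X_val, y_train, y_val = train_test_split(
    X, y_log, test_size=0.10, random_state=42)  # 90% training / 10% validation
```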
Random Forest: Trained using the dislib library on PyCOMPSs to handle large datasets efficiently. Hyperparameters were tuned via grid search and k-fold cross-validation, optimizing R². The best hyperparameters for all periods were: maximum tree depth dmax = 30, number of estimators n = 30, and try-features t = 'third' (one third of the features considered at each split).
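The study trains the forests with dislib on PyCOMPSs; as a single-node stand-in that mirrors the reported hyperparameters, a scikit-learn sketch is shown below (max_features=1/3 playing the role of try-features='third'; the dummy arrays only keep the example runnable).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-ins for X_train, y_train, X_val, y_val from the split sketch above.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(1000, 8)), rng.normal(size=1000)
X_val, y_val = rng.normal(size=(200, 8)), rng.normal(size=200)

rf = RandomForestRegressor(
    n_estimators=30,   # number of trees reported as optimal
    max_depth=30,      # maximum tree depth reported as optimal
    max_features=1/3,  # analogue of dislib's try-features = 'third'
    n_jobs=-1,
    random_state=0,
)
rf.fit(X_train, y_train)
print("validation R^2:", rf.score(X_val, y_val))
```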
Deep Neural Network: Implemented as fully connected multilayer perceptrons in TensorFlow/Keras. Architectural exploration led to MLPs with either seven hidden layers (32, 64, 128, 256, 128, 64, 32 units) or nine hidden layers (16, 32, 64, 128, 256, 128, 64, 32, 16 units). Regularization strategies included data normalization, batch normalization (pre-activation), dropout experimentation, and learning rate schedulers (warm anneal). Softplus activation was used for hidden layers and sigmoid for output (targets normalized to [0,1]).
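A minimal Keras sketch of the seven-hidden-layer variant described above, with pre-activation batch normalization, softplus hidden activations, and a sigmoid output for targets scaled to [0, 1]; the optimizer, loss, and training settings are assumptions, not values reported by the study.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_mlp(n_features=8, hidden=(32, 64, 128, 256, 128, 64, 32), dropout=0.0):
    """Fully connected MLP; targets assumed min-max scaled to [0, 1]."""
    inputs = tf.keras.Input(shape=(n_features,))
    x = inputs
    for units in hidden:
        x = layers.Dense(units)(x)
        x = layers.BatchNormalization()(x)   # pre-activation batch norm
        x = layers.Activation("softplus")(x)
        if dropout > 0:
            x = layers.Dropout(dropout)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])  # assumed settings
    return model

model = build_mlp()
# model.fit(X_train_scaled, y_train_scaled,
#           validation_data=(X_val_scaled, y_val_scaled),
#           epochs=50, batch_size=1024)  # hypothetical training call
```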
Evaluation metrics: Performance was assessed with MAE, MSE, RMSE, MAPE, R², and Pearson correlation on validation synthetics. For comparisons with empirical GMMs, RMSE of log(RotD50) and Aida’s K (geometric mean ratio) were used to assess accuracy and bias across 0.1-magnitude bins. Comparative benchmark: ASK-14 GMM computed using OpenSHA with Thompson Vs30 values.
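A small sketch of the two GMM-comparison metrics named above. Aida's K is implemented here as the geometric mean of reference-to-predicted ratios, one common convention that may differ in detail from the paper's exact definition.

```python
import numpy as np

def rmse_log(ref, pred):
    """RMSE computed on log10(RotD50) values (log base assumed)."""
    ref, pred = np.asarray(ref), np.asarray(pred)
    return np.sqrt(np.mean((np.log10(ref) - np.log10(pred)) ** 2))

def aida_k(ref, pred):
    """Aida's K: geometric mean of reference/predicted ratios.
    K near 1 indicates no systematic bias; K > 1 suggests underprediction,
    K < 1 overprediction (under this convention)."""
    ref, pred = np.asarray(ref), np.asarray(pred)
    return np.exp(np.mean(np.log(ref / pred)))
```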
Real-event validation: Five historical Southern California earthquakes were used: 1992 Landers (Mw 7.3), 1999 Hector Mine (Mw 7.1), 1994 Northridge (Mw 6.7), 1986 North Palm Springs (Mw 6.1), and 1987 Whittier (Mw 5.9). Observations were taken from SCEC Broadband Platform stations. To assess spatial extrapolation effects, stations were partitioned into 'inside' and 'outside' groups, with 'inside' stations lying within the first quartile (Q1) of the inter-site distance distribution of the training sites. RF, DNN, and ASK-14 predictions were evaluated at station locations for T = 2, 3, 5, and 10 s using RMSE, with percent improvement relative to ASK-14 reported.
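One plausible reading of the inside/outside partition, sketched with SciPy (the exact criterion in the paper may differ): a station counts as 'inside' if its distance to the nearest training site does not exceed the first quartile of inter-site distances among the training sites.

```python
import numpy as np
from scipy.spatial.distance import cdist

def partition_stations(train_xy, station_xy):
    """Label stations 'inside'/'outside' relative to the training-site grid.

    train_xy, station_xy: (n, 2) arrays of projected coordinates in km
    (hypothetical inputs; a map projection step is assumed upstream).
    """
    # First quartile (Q1) of pairwise distances among training sites.
    d_train = cdist(train_xy, train_xy)
    q1 = np.percentile(d_train[np.triu_indices_from(d_train, k=1)], 25)

    # Distance from each validation station to its nearest training site.
    d_nearest = cdist(station_xy, train_xy).min(axis=1)
    return np.where(d_nearest <= q1, "inside", "outside")
```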
Key Findings
- On synthetics, both RF and DNN models achieved strong performance with average R² ≈ 0.86 and MAPE < 15%. Histograms of predicted RotD50 matched synthetic references across magnitude bins, with low median RMSE.
- Compared to ASK-14 on synthetic events, MLESmap (RF and DNN) reduced median RMSE of log(RotD50) by up to 45% across magnitude bins and periods (T = 2–10 s). RF/DNN showed no systematic bias in Aida’s K, whereas ASK-14 tended to underpredict for Mw > 6.5 and overpredict for Mw < 6.5.
- Spatial RotD50 maps for validation synthetics (e.g., Mw 6.85) showed that MLESmap reproduces the reference spatial patterns and amplitudes better than ASK-14, which consistently underestimated RotD50.
- For real earthquakes at 'inside' stations, MLESmap outperformed ASK-14 for events within the training magnitude range: RF improvements of roughly 14–73% and DNN improvements of 19–88% in RMSE, depending on event and period. For the Mw 6.1 North Palm Springs event, very large improvements over ASK-14 were observed in several periods. For Mw 5.9 Whittier (outside training magnitude range), ML models underperformed relative to ASK-14 (negative percent improvements), illustrating poor extrapolation.
- MLESmap maps displayed more realistic spatial variability and local amplification (especially for Mw > 7 events) than ASK-14, reflecting site and basin effects embedded in the physics-based synthetics.
- Inference time is comparable to empirical GMM evaluations, enabling near real-time application with only rapid, elementary inputs (event location and magnitude).
Discussion
The findings demonstrate that ML models trained on high-quality physics-based synthetic databases can deliver rapid and more accurate ground motion intensity estimates than empirical GMMs when applied within the bounds of the training data. This addresses the speed-accuracy trade-off inherent to real-time response: MLESmap retains GMM-like evaluation speed while capturing complex physical effects present in simulations. The superior performance against ASK-14 on both synthetics and observed records for compatible events suggests the simulations capture key physics of Southern California ground motions, and that models can implicitly learn site and path effects from simple inputs. However, performance degrades when extrapolating beyond the training domain in magnitude or space, as evidenced by the Whittier event and 'outside' stations. The results support integrating ML-based models with empirical GMMs in hybrid workflows—using ML where in-domain and falling back on GMMs elsewhere. Including additional quickly derivable features (e.g., focal mechanism, rupture extent) may further reduce uncertainty if reliably available in real time. The approach is promising for rapid impact assessment, uncertainty quantification, and operational earthquake forecasting, particularly as improved synthetic databases and computational resources become more accessible.
Conclusion
The study introduces MLESmap, a machine learning-based estimator trained on physics-based CyberShake CSS-15.4 synthetics to predict RotD50 maps in Southern California from rapidly available event parameters. RF and DNN models achieve high predictive skill and outperform the ASK-14 empirical GMM with up to 45% lower median RMSE on synthetic events and 11–88% lower RMSE for several historical earthquakes at stations within the training domain and magnitude range. The method delivers near real-time estimates while implicitly capturing complex physical effects. Future work includes training with updated and richer synthetic databases (including higher frequencies and dynamic rupture variability), extending to other regions with tailored databases, incorporating additional rapid inputs (e.g., focal mechanism), exploring hybrid ML–GMM strategies and transfer learning to mitigate extrapolation limitations, and applying the approach to rapid PSHA and uncertainty quantification.
Limitations
- Generalization is limited to the parameter space of the training data (event magnitudes, frequencies, and spatial domain). Performance deteriorates for magnitudes below the CyberShake cutoff (e.g., Mw < 6.0) and at stations far from the training grid.
- ML models extrapolate poorly; the Whittier (Mw 5.9) case and 'outside' stations show underperformance relative to ASK-14.
- Current implementation targets RotD50 only and does not predict temporal characteristics (e.g., travel times, duration, phases).
- CSS-15.4 simulations are limited to 1 Hz and use pre-set kinematic rupture parameters; limited high-frequency content and rupture variability may constrain applicability at shorter periods or for different rupture complexities.
- Accuracy depends on the fidelity of the underlying velocity models, rupture generators, and site characterization in the synthetic database.
- Inputs are limited to elementary rapid parameters; not including mechanism or detailed site metrics may limit accuracy for certain scenarios.