Earth Sciences

Real-time determination of earthquake focal mechanism via deep learning

W. Kuang, C. Yuan, et al.

FMNet is a deep learning method that predicts earthquake focal mechanisms directly from waveforms in real time, making rapid, automated reporting possible. Because it is trained entirely on synthetic data, it shows promise in regions that lack historical earthquake records. The research was conducted by Wenhuan Kuang, Congcong Yuan, and Jie Zhang.

Introduction

Mitigating earthquake hazards requires rapid, automated reporting of source parameters. Beyond origin time, location, and magnitude, focal mechanisms provide essential information for fault geometry, regional stress inversion, Coulomb stress change calculations, and improving ground-motion prediction for early warning. Traditional focal mechanism solutions using P-wave first-motion polarities, moment tensor inversions, or waveform fitting typically require minutes to tens of minutes and extensive human oversight, limiting real-time utility. Although AI has advanced real-time seismology for detection, phase picking, and magnitude estimation, fully automated, practical, real-time focal mechanism determination remains challenging due to computational demands and data limitations. This study proposes FMNet, a deep convolutional neural network trained entirely on synthetics, enabling rapid, automated focal mechanism estimation directly from waveforms.

Literature Review

Conventional focal mechanism determination falls into three categories: (1) P-wave first-motion polarity methods, (2) moment tensor inversion, and (3) full waveform-based approaches. While broadly used, these methods are often time-consuming and sensitive to station coverage, velocity models, or manual inputs. Recent seismological studies suggest waveform information better constrains mechanisms than first-motion polarities alone. Prior AI applications in seismology have improved detection, picking, and magnitude estimation, and some studies combined deep learning-derived polarities with classical inversion (e.g., HASH) to refine catalogs. A fast search-engine method achieved sub-second estimates but requires impractically large search databases and data reformatting. These limitations motivate an end-to-end, waveform-based, deep learning solution that is fast, storage-light, and fully automated.

Methodology

Study area and data: The Ridgecrest region in southern California (∼100 km × 100 km, depth 2–20 km) was discretized into a 3D grid: 35.4°–36.2° latitude, −118.0° to −117.2° longitude, and 2–20 km depth, with steps of 0.1°, 0.1°, and 2 km, yielding 9 × 9 × 10 = 810 grid points. Three-component (3C) waveforms from 16 SCSN stations within 150 km were used both to generate the training set and for testing.

Synthetic training set: Assuming a double-couple source and a 1D southern California velocity model, 3C waveforms were simulated with the Thomson–Haskell propagator matrix method. Strike, dip, and rake spanned 0–360°, 0–90°, and −90° to 90° with steps of 30°, 10°, and 20°, giving 12 × 9 × 9 = 972 mechanisms per grid point. Total synthetics: 810 × 972 = 787,320. Each sample comprises 16 stations × 3 components (48 traces) with 128 s duration. Waveforms were amplitude-normalized and bandpass filtered (∼0.05–0.1 Hz). Realistic scenarios were emulated by adding real noise with randomized SNR and by applying random time shifts (<2 s) to mimic picking errors. An additional 1,000 synthetics formed a validation set.

Preprocessing for real data: After automatic detection and phase picking, instrument responses are removed, bandpass filtering and P-arrival alignment are applied, and amplitudes are normalized before input to FMNet.

Network architecture: FMNet is a deep CNN with 16 trainable layers plus MaxPooling, UpSampling, LeakyReLU, and BatchNorm. Input size is 1 × 48 × 128 (channels × traces × time). The encoder compresses inputs to a 128 × 1 × 1 latent feature (exportable for interpretation); the decoder expands to 3 × 128 × 1 outputs representing Gaussian probability distributions over discretized strike, dip, and rake.

Training: Formulated as a regression with Gaussian priors; weights and biases initialized from N(0, 1). Loss: mean squared error. Optimizer: Adam. Learning rate 0.001; batch size 16; 50 epochs. Training used 4× NVIDIA Tesla V100 GPUs (~5 h). Convergence was monitored via training and validation losses and the fit between true and predicted labels.

Testing and error metrics: A separate, unseen synthetic test set (~1,000 samples) spanning normal, strike-slip, and reverse mechanisms was predicted with the trained model. Errors were quantified via Kagan angle distributions. Additional robustness tests included perturbed velocity models (±10% per layer), missing-station scenarios, and events outside the study area, with prediction probabilities used to quantify reliability.

Computation: After training, single-event prediction takes <200 ms on a single CPU; the deployed model requires only a few megabytes of memory.

Interpretability via encoder: The encoder output (128 × 1 × 1) was used to compare L2 misfits in the feature domain with those in the data domain. Smallest-misfit solutions in feature space matched those from waveform-domain comparisons, supporting the view that encoded features preserve the waveform information essential for mechanism retrieval.
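The Kagan angle used as the error metric above is the smallest rotation that maps one double-couple mechanism onto another. The following is a minimal sketch of how it can be computed, not the authors' code. It assumes the Aki and Richards strike/dip/rake convention in north-east-down coordinates; the function names `dc_axes` and `kagan_angle` are hypothetical.

```python
import numpy as np

def dc_axes(strike, dip, rake):
    """Return a 3x3 matrix whose columns are the T, P, B axes (north, east, down).

    Angles are in degrees, Aki & Richards convention (an assumption of this sketch).
    """
    phi, delta, lam = np.radians([strike, dip, rake])
    # Fault-plane normal and slip vector
    normal = np.array([-np.sin(delta) * np.sin(phi),
                       np.sin(delta) * np.cos(phi),
                       -np.cos(delta)])
    slip = np.array([np.cos(lam) * np.cos(phi) + np.cos(delta) * np.sin(lam) * np.sin(phi),
                     np.cos(lam) * np.sin(phi) - np.cos(delta) * np.sin(lam) * np.cos(phi),
                     -np.sin(lam) * np.sin(delta)])
    t_axis = (normal + slip) / np.sqrt(2.0)
    p_axis = (normal - slip) / np.sqrt(2.0)
    b_axis = np.cross(t_axis, p_axis)  # completes a right-handed frame
    return np.column_stack([t_axis, p_axis, b_axis])

def kagan_angle(mech1, mech2):
    """Minimum rotation angle (degrees) between two (strike, dip, rake) triples."""
    frame1, frame2 = dc_axes(*mech1), dc_axes(*mech2)
    # A double couple is unchanged when any two of its axes change sign,
    # so all four equivalent right-handed frames are tried.
    sign_flips = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    best = 180.0
    for signs in sign_flips:
        rotation = frame1 @ np.diag(signs) @ frame2.T
        cos_angle = np.clip((np.trace(rotation) - 1.0) / 2.0, -1.0, 1.0)
        best = min(best, float(np.degrees(np.arccos(cos_angle))))
    return best

same_event = kagan_angle((0, 90, 0), (90, 90, 180))  # fault plane vs. auxiliary plane
rotated = kagan_angle((0, 90, 0), (45, 90, 0))       # strike rotated by 45 degrees
```

Under these assumptions, `same_event` should come out near 0° because the two triples describe the two nodal planes of one mechanism, and `rotated` should come out near 45°.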

Key Findings

- Real-data case study: FMNet accurately estimated focal mechanisms for four Ridgecrest earthquakes (July 2019; Mw ≥ 5.4). Predicted solutions show predominantly steeply dipping strike-slip mechanisms, consistent with the SCSN moment tensor catalog for three events; the northernmost event (not in the SCSN catalog) matched an independent gCAP inversion and aftershock patterns.
- Waveform fit: Synthetic waveforms generated from FMNet-predicted mechanisms overlapped real data well across stations; average cross-correlation ≈ 0.86.
- Generalization (synthetic test): On ~1,000 unseen synthetic events, 97.8% of Kagan angles were within 20°; ~2% exceeded 20°, likely due to nodal-plane equivalence ambiguities.
- Speed and resources: After preprocessing, prediction takes under ~200 ms on a single CPU; model storage is a few MB; training took ~5 h on 4× V100 GPUs.
- Robustness and reliability: With a velocity model perturbed by up to 10%, typical errors were ~8° (dip) and ~20° (rake), with lower predicted probabilities. Missing-station scenarios with adequate azimuthal coverage retained stable probability distributions. Events outside the study area yielded much lower predicted probabilities (~0.6), providing a diagnostic for result reliability.
- Probability outputs: Predicted probability distributions for strike, dip, and rake serve to quantify confidence, with lower probabilities indicating potential model–data mismatch (e.g., poor coverage, out-of-area events, or an inaccurate velocity model).
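The probability outputs described in the last finding can be illustrated with a small encode/decode sketch. This is one plausible reading of the 3 × 128 output format, not the authors' implementation. The label width `SIGMA_BINS`, the threshold `MIN_PEAK`, and all function names are assumptions made for illustration, and strike wrap-around at 0°/360° is ignored for simplicity.

```python
import numpy as np

N_BINS = 128  # output length per parameter, matching the 3 x 128 x 1 decoder output
RANGES = {"strike": (0.0, 360.0), "dip": (0.0, 90.0), "rake": (-90.0, 90.0)}
SIGMA_BINS = 3.0  # assumed Gaussian width in bins (not stated in the text)
MIN_PEAK = 0.8    # hypothetical threshold below which a result is flagged

def encode_label(value, lo, hi):
    """Gaussian-shaped training label centered on `value`, peak height close to 1."""
    centers = np.linspace(lo, hi, N_BINS)
    sigma = SIGMA_BINS * (hi - lo) / (N_BINS - 1)
    return np.exp(-0.5 * ((centers - value) / sigma) ** 2)

def decode_label(curve, lo, hi):
    """Return the angle at the curve's maximum and the peak height."""
    centers = np.linspace(lo, hi, N_BINS)
    peak = int(np.argmax(curve))
    return float(centers[peak]), float(curve[peak])

def decode_mechanism(output):
    """Decode an array of shape (3, N_BINS) ordered as strike, dip, rake."""
    result = {}
    for row, (name, (lo, hi)) in zip(output, RANGES.items()):
        value, prob = decode_label(row, lo, hi)
        result[name] = {"value": value, "prob": prob, "reliable": prob >= MIN_PEAK}
    return result

truth = {"strike": 320.0, "dip": 80.0, "rake": -10.0}
labels = np.stack([encode_label(truth[name], *RANGES[name]) for name in RANGES])
decoded = decode_mechanism(labels)     # sharp curves: angles recovered to within half a bin
weak = decode_mechanism(0.6 * labels)  # mimics the ~0.6 peaks reported for out-of-area events
```

In this sketch, a well-constrained prediction decodes with peaks close to 1, while the scaled-down curves fall below the assumed threshold and are flagged as unreliable.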

Discussion

FMNet directly addresses the need for fast, automated, and reliable focal mechanism estimation from waveform data. By training exclusively on synthetics and learning global waveform–mechanism mappings, FMNet eliminates the need for large empirical databases or time-consuming inversions. The approach enables deployment in regions with sparse historical catalogs, provided a sufficiently accurate velocity model can be constructed. Case studies in the Ridgecrest sequence confirm that FMNet’s predictions align with catalog solutions and independent inversions, and the waveform fits demonstrate physical consistency. The encoder analysis indicates that the network extracts compact, mechanism-informative features that preserve waveform similarity in an L2 sense, enhancing interpretability and enabling rapid feature-domain searches if desired. Probability outputs offer a practical reliability measure that degrades under model mismatch, sparse coverage, or out-of-area events, guiding operational decision-making. Overall, FMNet has the potential to be integrated into automated monitoring systems and to contribute to rapid ground-shaking assessments and early warning by promptly providing mechanism information.
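The feature-domain search mentioned above amounts to a nearest-neighbor lookup under an L2 misfit. The sketch below uses random vectors as stand-ins for 128-element encoder outputs, so it only illustrates the lookup itself; the function name `nearest_by_l2` and the library layout are assumptions, not part of FMNet.

```python
import numpy as np

def nearest_by_l2(query, library):
    """Return the index and misfit of the library row closest to `query` (L2 norm)."""
    misfits = np.linalg.norm(library - query, axis=1)
    best = int(np.argmin(misfits))
    return best, float(misfits[best])

rng = np.random.default_rng(0)
# Stand-in feature library: one 128-element vector per candidate mechanism.
library = rng.normal(size=(972, 128))
# Stand-in query: the features of entry 123 with a small perturbation,
# playing the role of an observed event's encoded waveforms.
query = library[123] + 0.01 * rng.normal(size=128)

index, misfit = nearest_by_l2(query, library)
```

With real encoder outputs, the returned index would point to the candidate mechanism whose encoded synthetics best match the encoded observation.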

Conclusion

This work introduces FMNet, a deep learning framework that estimates earthquake focal mechanisms in real time from full waveforms, trained entirely on synthetic data. Applied to the 2019 Ridgecrest sequence, FMNet produced mechanisms consistent with catalogs and independent inversions, with high waveform similarity and sub-second inference on CPU. Synthetic tests showed low mechanism errors for diverse sources. The method’s compact model size and minimal computational demand make it suitable for operational, fully automated systems. Future directions include: (1) incorporating additional constraints (e.g., first-motion polarities) to mitigate nodal-plane ambiguity and improve rake estimates; (2) extending to higher frequencies and smaller events by using detailed 3D velocity models and broadband synthetics; (3) joint estimation strategies to better resolve focal depth; (4) further robustness to missing data and out-of-area events, potentially via domain adaptation; and (5) region-specific calibration to ensure velocity model adequacy and optimal performance.

Limitations

- Dependence on velocity models: Synthetic training relies on an accurate velocity model; errors (e.g., ±10% perturbations) degrade performance, particularly for rake, and reduce prediction probabilities.
- Frequency-content constraints: The use of a 1D model and a low-frequency band limits application primarily to moderate-to-large events; smaller events (Mw < 5.4 in this study) often lack sufficient low-frequency signal or data quality for reliable testing.
- Coverage and geography: Poor azimuthal coverage, missing stations, or events outside the training area reduce reliability; predicted probabilities drop and mechanism errors increase.
- Ambiguity and components: Nodal-plane equivalence can lead to a small fraction of larger Kagan angle errors; rake is generally less well resolved than strike and dip.
- Depth resolution: FMNet is not designed to recover focal depth robustly; depth estimation may require a subsequent grid search with the mechanism fixed.
- Generalization: Deployment to new regions requires careful calibration, including velocity model development and station configuration adjustments.