Earth Sciences

Tremor clustering reveals pre-eruptive signals and evolution of the 2021 Geldingadalir eruption of the Fagradalsfjall Fires, Iceland

Z. Zali, S. M. Mousavi, et al.

This study analyzes Iceland's 2021 Geldingadalir eruption using unsupervised machine learning. Led by Zahra Zali and colleagues, it identifies distinct seismic signal clusters and reveals a precursory volcanic tremor that could signal an impending eruption.

Introduction

The study addresses the challenge of forecasting volcanic eruptions given diverse pre-eruptive behaviors. Seismic signals are crucial for assessing eruptive activity, identifying pre-eruptive stages, and detecting potential precursors such as swarms, repeating earthquakes, and volcanic tremor. Despite their value, both earthquake swarms and tremor can appear during non-eruptive periods, which complicates forecasting. The 2021 Geldingadalir eruption on Iceland’s Reykjanes Peninsula followed intense seismicity and intrusion, yet no precursory tremor had been reported. The research question is whether unsupervised deep learning applied to continuous seismic data can reveal hidden patterns, including subtle pre-eruptive tremor, and delineate the eruption’s temporal evolution. The purpose is to use deep embedded clustering to automatically identify seismic signal clusters corresponding to distinct phases (unrest, lava outflow, lava fountaining) and to explore their implications for eruption processes and potential early warning.

Literature Review

Prior work highlights the use of seismic methods to detect magma movement and eruption precursors, including tremor, swarms, and repeating earthquakes. Eruption precursors should exhibit recurrence, transferability, and differentiability. In Iceland, recent eruptions typically showed short-term seismic indicators (earthquakes and/or tremor). However, swarms and tremor can occur without eruptions, and tremor may be masked by swarm energy, which makes it hard to use directly in forecasting. Some eruptions lack apparent precursory tremor. Machine learning has been used for classifying discrete volcano-seismic events, but unsupervised approaches on continuous data remain less explored. Deep embedded clustering (DEC), which pairs an autoencoder with a clustering objective, has proven effective in other seismological contexts, suggesting potential for uncovering subtle volcanic patterns without labeled data.

Methodology

Data and preprocessing: Continuous seismic data from the east component of station NUPH (9F seismic network), 5.5 km SE of the 2021 Geldingadalir eruption site, were used from 12 March to 24 June 2021. The east component was chosen because tremor has a higher signal-to-noise ratio (SNR) on the horizontal components. Seismograms were demeaned, detrended, resampled to 8 Hz, and bandpass filtered to 1–4 Hz.

Input representation: For continuous monitoring, Short Time Fourier Transform (STFT) spectrograms of one-hour windows were computed and used as inputs. The input/output size for the autoencoder was 96 frequency bins by 128 time bins; the latent (bottleneck) dimension was 24.

Deep embedded clustering (DEC): An autoencoder was first pre-trained to reconstruct inputs from latent features using mean squared error loss. The encoder comprised four 2D convolutional layers followed by a dense layer; the decoder used a dense layer followed by four transposed convolutional layers. ELU activations were used except for a linear output layer. After pre-training, k-means clustering was applied to the latent space to initialize cluster centroids. Fine-tuning simultaneously optimized feature learning and cluster assignment by minimizing the KL divergence between soft assignment probabilities (computed via Student’s t-distribution similarity) and a target distribution that emphasizes high-confidence assignments. The clustering loss was weighted by a factor of 0.1. Optimization used stochastic gradient descent with momentum 0.9 and learning rate 0.01; autoencoder weights were updated every 200 iterations, and training stopped when assignment changes fell below 0.01% of samples. The number of clusters k was selected by the Calinski–Harabasz index (elbow at k=4). Visualization of latent data used t-SNE.

Precursory tremor extraction: To verify the subtle pre-eruptive tremor identified by DEC, a harmonic–percussive separation algorithm was applied to isolate continuous tremor from transient earthquake energy in mid-March.
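The fine-tuning objective described above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the standard DEC formulation (Student's t soft assignment with one degree of freedom, a squared and frequency-normalized target distribution, and a KL divergence loss); the summary does not give the exact equations, so the function names, the value alpha = 1, and the synthetic latent data are all assumptions, not the authors' implementation.

```python
import numpy as np


def soft_assign(z, centroids, alpha=1.0):
    """Student's t similarity between latent points and centroids, row-normalized."""
    # Squared Euclidean distance between every sample and every centroid.
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpen q: square it, divide by the soft cluster frequency, renormalize rows."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Synthetic stand-in for the latent space: 24-dimensional features (the latent
# size quoted in the text) scattered around 4 hypothetical centroids.
rng = np.random.default_rng(0)
centroids = rng.normal(scale=5.0, size=(4, 24))
labels_true = rng.integers(0, 4, size=200)
z = centroids[labels_true] + rng.normal(scale=0.5, size=(200, 24))

q = soft_assign(z, centroids)   # soft assignment probabilities
p = target_distribution(q)      # target emphasizing confident assignments
loss = kl_divergence(p, q)      # clustering loss minimized during fine-tuning
hard = q.argmax(axis=1)         # hard cluster label per sample
```

In a full training loop, this clustering loss would be combined with the reconstruction loss using the weighting factor mentioned in the text, and gradients would flow back into both the encoder and the centroids.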
Clustering episodic tremor: For detailed analysis of lava fountaining episodes (2 May–14 June), STFTs of seven-minute windows starting at episode onsets (per Eibl et al. catalog) were used as inputs. Inputs were 32 frequency bins by 48 time bins (1536 values) with a latent dimension of 3. The autoencoder for episodes used MAE loss (decay rate 0.9). The Calinski–Harabasz index supported k=4 for episodes. DEC fine-tuning and t-SNE visualization were performed as above.

Model selection and validation: The CH index guided cluster number selection; cluster separation improved after DEC fine-tuning. More than 99% of samples had assignment likelihood >0.99. Sensitivity analyses varying k for pre-eruptive-only windows showed CT1 stability and EQ sub-clustering by earthquake amplitude/occurrence rate.
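Choosing the number of clusters with the Calinski–Harabasz (CH) index can be illustrated as follows. This is a hedged sketch on synthetic 3-dimensional points (matching the latent dimension quoted for the episode model), with a small NumPy k-means standing in for the clustering step. The helper names, the blob layout, and the rule of taking the maximum CH score are assumptions for illustration; the study reports an elbow at k=4 for the continuous data rather than a specific selection rule.

```python
import numpy as np


def kmeans(x, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def calinski_harabasz(x, labels):
    """Between-cluster over within-cluster dispersion, scaled by degrees of freedom."""
    n = len(x)
    ids = np.unique(labels)
    k = len(ids)
    overall_mean = x.mean(axis=0)
    between = within = 0.0
    for j in ids:
        members = x[labels == j]
        center = members.mean(axis=0)
        between += len(members) * ((center - overall_mean) ** 2).sum()
        within += ((members - center) ** 2).sum()
    return (between / (k - 1)) / (within / (n - k))


# Synthetic latent features: four well-separated blobs in 3 dimensions.
rng = np.random.default_rng(1)
blob_centers = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
x = np.concatenate([c + rng.normal(scale=0.5, size=(60, 3)) for c in blob_centers])

# Score a range of candidate k values and keep the one with the highest index.
scores = {k: calinski_harabasz(x, kmeans(x, k)) for k in range(2, 8)}
best_k = max(scores, key=scores.get)
```

On real latent features the CH curve is rarely this clean, so it is reasonable to inspect the whole curve, as the sensitivity analyses mentioned above suggest, rather than rely on a single maximum.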

Key Findings

- Four primary clusters corresponding to eruption phases were identified from continuous data: EQ (earthquakes), CT1 (continuous tremor 1), ET (episodic tremor), and CT2 (continuous tremor 2).
- EQ: Dominated by transient earthquakes, prevalent 12–19 March (pre-eruption) with some occurrences through 14 June.
- CT1: Continuous tremor begins three days before eruption onset (starts at noon on 16 March) and lasts to 27 April. Characterized by narrowband tremor with strongest frequency near 2.5 Hz.
- Verified precursory tremor: Harmonic–percussive separation revealed a continuous tremor signal starting 16 March despite masking by swarms, confirming DEC’s detection of weak pre-eruptive tremor that had been previously overlooked.
- ET: Episodic tremors associated with lava fountaining span 27 April to 13 June. DEC detected a subtle onset of episodicity at 05:00 on 27 April, earlier than the clear visual patterns from 2 May.
- CT2: From 13 to 24 June, continuous tremor with two dominant bands: narrowband near 1.2 Hz and a broader, weaker band between 2–3 Hz; amplitudes higher than CT1.
- Interpretation: The transition from CT1 to ET around 27 April suggests lava fountaining was triggered by increased magma discharge, aligning with independent observations of a mid-April shift to deeper magma sources and increased effusion rates.
- Episodic tremor clustering (ET-1 to ET-4) identified system changes on 5, 11, 17 May, and 10 June:
  • ET-1: Predominant 2–5 May 11:56; duration ≥7 min; dominant frequency ~1.2 Hz; lower amplitude.
  • ET-2: 2–11 May 12:54 (with some episodes to 17 May and 10–13 June); mean duration ~4.5 min; similar frequency but higher amplitude than ET-1.
  • ET-3: From 11 May 15:55 mainly to 10 June (some to 13 June); mean duration ~2.8 min; fundamental ~1.4 Hz, first overtone ~2.8 Hz; highest amplitudes.
  • ET-4: Mean duration ~3.3 min; frequency similar to ET-3 but lower amplitude; dominates after 10 June 03:39; some earlier episodes present, potentially influenced by wind noise reducing SNR.
- The evolution of episode characteristics is consistent with an enlarging and stabilizing shallow magma compartment, conduit widening, and changes after a major crater collapse on 10 June.
- Clustering quality: t-SNE showed clear separation post fine-tuning; >99% of samples had assignment likelihood >0.99.

Discussion

The findings demonstrate that unsupervised deep embedded clustering can automatically extract and chronologically organize seismic signal patterns tied to key volcanic phases. DEC successfully detected weak pre-eruptive tremor obscured by swarms, as well as early subtle episodic patterns preceding visible lava fountaining, directly addressing the study’s goal of revealing hidden precursors and system evolution. These results underscore the potential of feature-learning approaches to enhance situational awareness and contribute to short-term eruption assessment when labeled data are scarce. Practical considerations include station proximity and data continuity, critical for capturing weak tremor signals. While the recurrence of known clusters could inform interpretations in near-real-time monitoring, robust application requires long-term observations to learn the system’s repertoire of states. Model performance can be impacted by non-volcanic noise variability, and the preselection of cluster number limits responsiveness to previously unseen patterns. Volcano-specific adaptations (frequency band, time resolution, training data) are necessary. Overall, the method offers a fast, reproducible way to track volcanic system evolution and complements traditional analyses, but it is not yet ready for operational warning systems without further validation.

Conclusion

This work introduces an unsupervised deep embedded clustering workflow that, from single-station continuous seismic data, reveals the temporal evolution of the 2021 Geldingadalir eruption and identifies a previously unreported pre-eruptive tremor beginning on 16 March 2021. Four clusters (EQ, CT1, ET, CT2) delineate unrest, continuous lava outflow, episodic fountaining, and late-stage continuous tremor, while episodic tremor sub-clusters (ET-1 to ET-4) expose system changes on 5, 11, 17 May, and 10 June. The transition to episodicity detected on 27 April suggests increased discharge coupled with deeper magma input as drivers of the shift to lava fountaining. The approach highlights the power of learned features from raw spectrograms to detect subtle volcanic signals and provide chronology without manual labeling. Future work should expand cross-volcano applications, refine real-time implementations, link learned features to physical properties, handle dynamic noise environments, and evaluate strategies for adaptive clustering to accommodate previously unseen activity states.

Limitations

- Weak precursory tremor can be masked by intense earthquake swarms; detection depends on station proximity and data availability before and during eruptions.
- Limited input duration (e.g., only eight pre-eruptive days) reduces clustering precision and can cause misclustered samples.
- Variable and strong environmental noise (e.g., wind) degrades SNR and can bias cluster assignments (noted for ET-4 before 10 June).
- The number of clusters is chosen a priori and based on previously seen data, limiting responsiveness to entirely new patterns in real-time.
- Models need volcano-specific training (frequency band, time resolution), reducing direct transferability.
- Not yet mature for operational early-warning without further validation on well-documented eruptions and across diverse volcanic settings.
