Introduction
Accurate subsurface velocity models are essential for applications ranging from geophysical exploration to CO2 monitoring. Traditional methods such as normal moveout correction, tomographic inversion, and full waveform inversion are effective in some settings but struggle with complex geology (salt diapirs, basalt formations, karst systems). These limitations hinder seismic imaging and reservoir understanding. Deep learning (DL), which employs multi-layered neural networks, offers a potential solution by learning complex, non-linear relationships between input data (seismic data) and output labels (velocity models). DL has successfully tackled inverse problems in several fields, including medical imaging and geophysics. However, its application to velocity model building from field data in complex geological settings remains largely unexplored, with most existing research focusing on synthetic data or simpler geological regimes. This paper applies DL to velocity model inversion using field-recorded seismic data from a geologically complex region in the U.S. Gulf of Mexico. It extends learning-based inversion from synthetic to real-world data and demonstrates its applicability to industrial-scale problems in complex geological settings. This addresses the need for cost-effective and efficient imaging of valuable reservoirs, which matters not only for hydrocarbon exploration but also for carbon sequestration, hydrogen storage, and geothermal prospecting. The study also compares models trained on synthetic data with models trained on field data, highlighting the advantages and challenges of each approach and emphasizing the role of sophisticated synthetic data generation in bridging the domain gap between synthetic training and real-world application.
This work further demonstrates the potential of repurposing existing seismic velocity models as labeled data for training machine learning models, addressing industrial-sized problems and potentially revolutionizing the cost dynamics of subsurface reservoir imaging.
Literature Review
Conventional velocity model-building methods, such as normal moveout correction, tomographic inversion, and full waveform inversion, have limitations in complex geological settings because they rely on linearized inversion and simplifying assumptions. Deep learning (DL) has emerged as a promising alternative for solving inverse problems, and the authors cite prior work applying it in fields including medical imaging and geophysics. Several of the cited studies apply DL to seismic velocity inversion, but mostly with synthetic data or in simpler geological regimes. The paper positions itself as extending these methods to real-world field data from a complex geological environment, addressing a significant gap in the current literature.
Methodology
The study uses a seismic dataset from the Tiber field in the Gulf of Mexico, an area known for its complex overburden. The dataset includes a legacy velocity model that serves as the basis for comparison, and it was preprocessed using standard industry practices. A learning-based inversion approach employing a convolutional neural network (CNN) is adopted. The CNN acts as a function approximating the inverse of the seismic experiment, mapping seismic shot gathers (input features) to velocity models (output labels). The network is an encoder-decoder built from convolutional layers, designed to capture spatial correlations in seismic data. Input features consist of adjacent 2D seismic shot gathers forming a 3D tensor, while output labels are 2D velocity model slices. The data are divided spatially into training, validation, and test folds, mimicking the non-random nature of seismic data acquisition. Multiple training datasets are used: one built from field data, and others built from synthetic data generated with the acoustic and elastic wave equations. The synthetic datasets were enhanced by incorporating improved sediment reflections and geologic priors. The models were trained with an Adam optimizer and an MSE loss function, using specific learning rate and batch size settings. Training ran for several epochs, with computational time depending on factors such as data augmentation and GPU resources. The performance of the DL models is evaluated using quantitative metrics (MSE, SSIM, R-squared), qualitative visual comparison with the legacy model, and geophysical comparison using reverse time migration (RTM) of seismic data with the predicted velocity models as input.
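The input construction and spatial fold split described above can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the window half-width, the array sizes, and the 70/15/15 split fractions are all assumptions chosen for illustration, and random numbers stand in for real shot gathers.

```python
import numpy as np

def stack_adjacent_gathers(gathers, half_width):
    """Group each shot gather with its neighbours into one 3D input tensor.

    gathers: array of shape (n_shots, n_time, n_receivers), ordered along the line.
    Returns an array of shape (n_samples, 2 * half_width + 1, n_time, n_receivers).
    """
    n_shots = gathers.shape[0]
    window = 2 * half_width + 1
    samples = [gathers[i:i + window] for i in range(n_shots - window + 1)]
    return np.stack(samples)

def spatial_split(n_samples, train_frac=0.7, val_frac=0.15):
    """Split sample indices into contiguous spatial blocks instead of shuffling."""
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    idx = np.arange(n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Hypothetical sizes: 40 shots, 64 time samples, 32 receivers.
rng = np.random.default_rng(0)
gathers = rng.standard_normal((40, 64, 32)).astype(np.float32)

features = stack_adjacent_gathers(gathers, half_width=2)
train_idx, val_idx, test_idx = spatial_split(len(features))
print(features.shape, len(train_idx), len(val_idx), len(test_idx))
```

Because the folds are contiguous, test samples lie spatially apart from most training samples, which is closer to how a model would be used on a newly acquired area than a random shuffle would be. In this sketch the windows at fold boundaries still share gathers; a real split would likely leave a gap between folds.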
Several augmentation techniques (horizontal flipping, random bandpassing, and 2D wave-propagation correction) were applied to the training datasets to mitigate overfitting and improve model generalization.
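Two of these augmentations can be sketched in a few lines of NumPy. The sketch assumes a sample layout of (gathers, time, receivers) for the features and (depth, x) for the velocity label, uses a simple brick-wall filter in the frequency domain, and draws corner frequencies from ranges picked purely for illustration. None of these choices come from the study, and the 2D wave-propagation correction is not shown.

```python
import numpy as np

def horizontal_flip(features, velocity):
    """Mirror one sample left-to-right.

    features: (n_gathers, n_time, n_receivers); velocity: (n_depth, n_x).
    Shot order and receiver order are both reversed so the flipped gathers
    stay consistent with the flipped velocity slice (assumed geometry).
    """
    return features[::-1, :, ::-1].copy(), velocity[:, ::-1].copy()

def random_bandpass(features, dt, rng, low_range=(2.0, 6.0), high_range=(20.0, 40.0)):
    """Apply a brick-wall band-pass with randomly drawn corner frequencies in Hz."""
    low = rng.uniform(*low_range)
    high = rng.uniform(*high_range)
    n_time = features.shape[1]
    freqs = np.fft.rfftfreq(n_time, d=dt)
    keep = (freqs >= low) & (freqs <= high)
    spectrum = np.fft.rfft(features, axis=1)
    spectrum *= keep[None, :, None]  # zero every frequency outside the pass band
    return np.fft.irfft(spectrum, n=n_time, axis=1)

# Hypothetical sample: 5 gathers, 256 time samples at 4 ms, 32 receivers.
rng = np.random.default_rng(1)
features = rng.standard_normal((5, 256, 32))
velocity = rng.uniform(1500.0, 4500.0, size=(100, 80))

flipped_features, flipped_velocity = horizontal_flip(features, velocity)
filtered = random_bandpass(features, dt=0.004, rng=rng)
```

The flip is applied to features and label together, since flipping only one of them would teach the network a mirrored mapping. A production filter would more likely use tapered corners than a hard cutoff to limit ringing.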
Key Findings
Models trained on field data significantly outperformed those trained on synthetic data across all evaluation metrics (MSE, SSIM, R-squared). The field data model achieved an R-squared score of 0.909, indicating an excellent fit to the legacy velocity model. Models trained on synthetic data generated using the elastic wave equation performed better than those using the acoustic wave equation. Incorporating enhanced sediment geological priors in the synthetic data further improved model performance. Data augmentation techniques had a nuanced impact, sometimes improving performance and sometimes not. Qualitative comparisons revealed that models trained on field data produced velocity predictions most similar to the legacy model and geologically plausible predictions in areas where the legacy model was incomplete. Models trained on the acoustic wave equation alone sometimes produced geologically implausible results, highlighting the importance of bridging the domain gap between training data and field data. Geophysical comparisons using RTM showed that velocity models from both field data and high-quality synthetic data could generate coherent seismic images, with image coherence decreasing with increasing spatial distance from the training data. The study demonstrated that the key features of the seismic data must be accurately represented in the synthetic data to achieve high performance in synthetic-data-trained models. Using the elastic wave equation instead of the acoustic wave equation and improving the representation of sedimentary layers significantly increased the effectiveness of synthetic training datasets.
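To make the reported scores concrete, the sketch below shows how MSE and R-squared are commonly computed between a predicted velocity slice and a reference slice, using the standard definition of R-squared as one minus the ratio of residual to total sum of squares. The study's exact evaluation code is not given in this summary, so the function names and the toy velocity model are assumptions. SSIM is omitted because it needs an image-processing library.

```python
import numpy as np

def mse(predicted, reference):
    """Mean squared error between two velocity slices of equal shape."""
    return float(np.mean((predicted - reference) ** 2))

def r_squared(predicted, reference):
    """Coefficient of determination, with the reference treated as ground truth."""
    ss_res = np.sum((reference - predicted) ** 2)         # residual sum of squares
    ss_tot = np.sum((reference - reference.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Toy reference: velocity increasing with depth from 1500 to 4500 m/s.
depth_profile = np.linspace(1500.0, 4500.0, 100)
reference = np.tile(depth_profile[:, None], (1, 80))

# Toy prediction: the reference plus random error, standing in for a network output.
rng = np.random.default_rng(2)
predicted = reference + rng.normal(0.0, 200.0, size=reference.shape)

print("MSE:", mse(predicted, reference))
print("R-squared:", r_squared(predicted, reference))
```

An R-squared of 1 means the prediction matches the reference exactly, and 0 means it does no better than predicting the mean velocity everywhere. Here the reference role is played by the legacy velocity model, so a high score indicates agreement with that model rather than with the true subsurface.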
Discussion
The superior performance of the model trained on field data highlights the potential of DL for cost-effective velocity model building using readily available labeled seismic data. However, the relatively good performance of the models trained on high-quality synthetic data demonstrates that synthetic data are a viable alternative when field data are limited. The study emphasizes two requirements for bridging the domain gap: synthetic data must be designed to represent essential features of the field data, such as sediment reflections, and the data generation process should use more realistic wave equations (elastic rather than acoustic). The findings are significant for geoscience companies with extensive labeled seismic data, which can repurpose these resources for training machine learning models. This can accelerate workflows and potentially change the cost dynamics of subsurface projects across applications such as geothermal systems, carbon sequestration, and renewable gas storage. For organizations with limited field data, the study provides a path forward through higher-quality synthetic training data.
Conclusion
This study demonstrates the effectiveness of deep learning for seismic velocity modeling using both field and synthetic data. While models trained on field data outperform those trained on synthetic data, high-quality synthetic data, especially when generated using more realistic wave equations and incorporating key geological features, can provide a valuable alternative. The findings showcase the potential for cost-effective and efficient subsurface reservoir imaging. Future research should focus on refining feature engineering, extending the approach to 3D data, and exploring advanced wave equations for more realistic synthetic data generation.
Limitations
The study's scope is limited to 2D seismic data. The legacy velocity model used for comparison is not a perfect representation of the subsurface, so the performance comparisons are relative, not absolute. The generalization ability of the models, particularly those trained with synthetic data, was evaluated primarily on data spatially close to the training data, and further testing is needed on data that are not spatially proximate. Only a limited number of specific wave equation solvers were used, which also limits how far the synthetic-data results generalize. Finally, the study is restricted to a specific region of the Gulf of Mexico, and the models' performance might vary in other geological settings.