
Deep learning for detecting and characterizing oil and gas well pads in satellite imagery
N. Ramachandran, J. Irvin, et al.
This research, conducted by Neel Ramachandran, Jeremy Irvin, Mark Omara, Ritesh Gautam, Kelsey Meisenhelder, Erfan Rostami, Hao Sheng, Andrew Y. Ng, and Robert B. Jackson, presents a deep learning approach to mapping oil and gas infrastructure in high-resolution satellite imagery, revealing previously unmapped well pads and storage tanks in the Permian and Denver-Julesburg basins.
Introduction
Methane, a potent greenhouse gas, is a significant contributor to climate warming. Anthropogenic sources, particularly fossil fuels (contributing 35% of anthropogenic methane emissions), are responsible for a substantial increase in methane emissions over recent decades. A large portion of oil and gas (O&G) industry methane emissions originates from production infrastructure, especially well pads and storage tanks. Accurate quantification and source attribution of methane emissions are crucial for effective mitigation strategies. This requires a comprehensive and accurate geospatial database of O&G infrastructure. Current data sources, such as the Department of Homeland Security's Homeland Infrastructure Foundation-Level Data (HIFLD) program, suffer from data gaps and inconsistencies due to outdated information, varying state-level reporting, and the underreporting of sub-facility equipment like storage tanks. These data gaps hinder bottom-up emission estimates, which are frequently observed to underestimate total emissions. While top-down estimates from satellites are improving, reconciling these top-down and bottom-up approaches requires a more complete infrastructure database. The availability of high-resolution satellite imagery and advancements in deep learning provide a promising opportunity to address these data limitations by automatically mapping O&G infrastructure at scale. Previous work has explored deep learning for O&G infrastructure mapping in limited areas, but this research aims to develop and deploy a more extensive framework, rigorously validated across multiple, high-producing basins.
Literature Review
The literature highlights the urgent need for accurate methane emission quantification and source attribution to effectively mitigate climate change. Studies emphasize the significant contribution of methane emissions from the oil and gas sector, specifically from well pads and storage tanks. Existing geospatial databases, while valuable, suffer from limitations in coverage, accuracy, and timeliness. This necessitates the development of novel approaches to overcome these data gaps. Several research efforts have explored the use of remote sensing and deep learning techniques for mapping various types of infrastructure, including solar farms, wind turbines, and refineries, demonstrating the potential for automated large-scale mapping. However, the application of these techniques to O&G well pads and storage tanks at a basin-wide scale remains relatively unexplored, creating the opportunity for this study. The paper references previous work that demonstrated the potential of deep learning to detect O&G well pads and storage tanks but lacked large-scale validation and comparison against existing datasets.
Methodology
This study develops and deploys deep learning models to detect O&G well pads and storage tanks in the Permian and Denver-Julesburg basins. The methodology comprises several key stages:
1. **Data Acquisition and Annotation:** A labeled dataset of satellite imagery was created using a combination of expert annotation and crowd-sourcing, incorporating both “prototypical” well pads and a more random sample to address label bias and more accurately represent the diversity of well pad sizes and appearances. The dataset includes images of well pads and storage tanks, along with numerous negative examples to improve model discriminative ability. Negative examples included various features that could be confused with well pads (roads, wind turbines, etc.).
2. **Model Development and Training:** The research team experimented with multiple object detection architectures and backbones and tuned hyperparameters to maximize performance, using RetinaNet with a ResNet-50 backbone for well pad detection and Faster R-CNN with a Res2Net backbone for storage tank detection. Well pad detection used a two-stage approach: a detection model identifies candidate well pads, and a verification model then eliminates false positives. The models were trained with standard optimization procedures on a substantial image dataset, with augmentation applied during training to improve robustness.
3. **Model Evaluation:** Rigorous evaluation was performed using metrics such as average precision (AP), precision, recall, and mean absolute error (MAE). The models were evaluated on a held-out test set, stratified by well pad size, and assessed for generalization to new, unseen basins. The study compared single-basin and jointly trained models, determining the effectiveness of joint training. The impact of the verification model in improving precision was also assessed.
4. **Basin-Scale Deployment:** The trained models were deployed to the entire Permian and Denver-Julesburg basins. The basins were tiled into numerous overlapping images which were processed through the detection and verification pipelines. Detections were matched against existing datasets (Enverus and HIFLD) to assess recall. Verified well pad detections were then processed through the storage tank detection model.
5. **Analysis and Interpretation:** The results from the basin-scale deployment were analyzed against reported data, examining captured, missed, and new well pad and storage tank detections. The impact of outdated imagery on recall, particularly for recently constructed well pads, was investigated. The relationship between storage tank count and well pad production was also explored.
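Two of the stages above, overlapping tiling and the detect-then-verify filter, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the tile size, overlap, and score thresholds are hypothetical, as are the names `tile_origins`, `Detection`, and `two_stage_filter`. A real pipeline would obtain the scores from the trained detection and verification models.

```python
from dataclasses import dataclass


def tile_origins(extent: float, tile: float, overlap: float) -> list[float]:
    """Left edges of tiles of width `tile` that together cover [0, extent].

    Neighbouring tiles share at least `overlap` units, so an object no wider
    than the overlap that is cut by one tile edge appears whole in the next tile.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if extent <= tile:
        return [0.0]
    stride = tile - overlap
    origins = []
    x = 0.0
    while x + tile < extent:
        origins.append(x)
        x += stride
    origins.append(float(extent - tile))  # last tile sits flush with the far edge
    return origins


@dataclass
class Detection:
    box: tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
    det_score: float     # confidence from the first-stage detector (assumed)
    verify_score: float  # confidence from the second-stage verifier (assumed)


def two_stage_filter(dets, det_thresh=0.5, verify_thresh=0.5):
    """Keep detections that pass the detector threshold, then the verifier threshold."""
    candidates = [d for d in dets if d.det_score >= det_thresh]
    return [d for d in candidates if d.verify_score >= verify_thresh]


if __name__ == "__main__":
    print(tile_origins(11, 4, 1))
    dets = [
        Detection((0, 0, 10, 10), det_score=0.9, verify_score=0.8),
        Detection((20, 20, 30, 30), det_score=0.9, verify_score=0.2),
    ]
    print(len(two_stage_filter(dets)))
```

Applying `tile_origins` once per axis gives a two-dimensional grid of tiles. Because tiles overlap, the same object can be detected in more than one tile, so a deployment along these lines would also need a de-duplication step, which is omitted here.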
Key Findings
The deep learning approach achieved high performance on expert-curated datasets, with precision and recall exceeding 0.9 for both well pads and storage tanks. When applied to the entire basins, the model captured a majority (79.5%) of well pads present in existing datasets and detected over 70,000 additional well pads not present in those datasets. Furthermore, the study identified over 169,000 storage tanks on well pads, which were previously unmapped. The study found that the model performed better on larger well pads and that performance varied by basin, with lower performance observed in the Denver-Julesburg Basin, primarily attributed to the higher prevalence of smaller well pads and a more complex built environment. The two-stage detection approach significantly improved overall precision by reducing false positives. The model generalized well to some new regions but performed poorly in regions with substantially different visual characteristics. Basin-scale deployment revealed that outdated imagery in the satellite basemap significantly impacted the detection of recently constructed, high-producing well pads. Analysis of storage tank detections indicated a moderate correlation with overall production, and a higher correlation with gas production than oil production. The study produced comprehensive datasets of well pad and storage tank locations, greatly expanding upon existing databases.
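The captured, missed, and new counts behind figures like these can be illustrated with a small matching routine. The sketch below assumes that a reported well pad is a point and counts as captured when it falls inside any detected bounding box. The summary does not state the matching criterion the study used, so this rule, the function names, and the coordinates in the usage example are all hypothetical.

```python
Box = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Point = tuple[float, float]              # (x, y)


def contains(box: Box, pt: Point) -> bool:
    """True if the point lies inside or on the edge of the box."""
    xmin, ymin, xmax, ymax = box
    return xmin <= pt[0] <= xmax and ymin <= pt[1] <= ymax


def summarize_matches(detections: list[Box], reported: list[Point]) -> dict:
    """Count reported points that were captured or missed, and detections that are new."""
    captured = sum(any(contains(b, p) for b in detections) for p in reported)
    missed = len(reported) - captured
    # A detection is "new" if no reported point falls inside it.
    new = sum(not any(contains(b, p) for p in reported) for b in detections)
    recall = captured / len(reported) if reported else float("nan")
    return {"captured": captured, "missed": missed, "new": new, "recall": recall}


if __name__ == "__main__":
    # Toy coordinates for illustration only; not data from the study.
    detections = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    reported = [(5, 5), (25, 25), (100, 100), (101, 101)]
    print(summarize_matches(detections, reported))
```

The nested loops are quadratic in the number of records. At basin scale, with tens of thousands of detections, a spatial index would be the usual replacement.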
Discussion
The findings demonstrate the effectiveness of deep learning in automating the mapping of O&G well pads and storage tanks at a basin-wide scale, significantly improving the completeness of existing infrastructure datasets. The high precision and recall achieved on expert-curated data, together with the recovery of 79.5% of reported well pads at basin scale, highlight the potential of this approach to enhance methane emission estimation and source attribution. The identification of a substantial number of previously unmapped well pads and storage tanks underscores the significant data gaps in current databases. The model's sensitivity to well pad size and the challenges posed by outdated imagery highlight the need for further improvements in data quality and model robustness. The observed correlation between storage tank counts and production provides valuable insights into the relationship between infrastructure characteristics and emission sources, potentially informing future research on emission modeling. This work contributes to a broader effort to create comprehensive and accurate geospatial databases of O&G infrastructure, which are essential for effective methane emissions monitoring and mitigation.
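A relationship between per-pad tank counts and production can be quantified with a correlation coefficient. The summary does not say which coefficient the study used. The sketch below assumes Pearson's r and uses a hypothetical helper, `pearson_r`, with made-up inputs purely to show the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    tank_counts = [1, 2, 3, 4]
    production = [1, 3, 2, 4]
    print(pearson_r(tank_counts, production))
```

Running the same function separately on gas and oil production would give the kind of comparison the findings describe.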
Conclusion
This study successfully demonstrates a deep learning approach for mapping O&G well pads and storage tanks using high-resolution satellite imagery. The methodology achieved high accuracy and identified significant numbers of previously unmapped structures. The findings emphasize the potential of this approach for improving methane emission estimations and source attribution, particularly in data-scarce regions. Future research should focus on addressing the limitations related to outdated imagery, improving model generalization to diverse environments, and expanding the detection capabilities to include other O&G infrastructure components. The development of a globally scalable framework would significantly enhance our ability to monitor and mitigate methane emissions from the O&G sector.
Limitations
The study acknowledges several limitations. Outdated imagery in the Google Earth basemap limited the detection of recently constructed well pads, which particularly affected high-producing well pads. Model performance varied between basins due to differences in well pad size, density, and the surrounding environment. The training dataset, while extensive, may not fully represent the diversity of well pad characteristics and surrounding landscapes present in all regions, leading to some performance discrepancies between the test set and basin-scale deployment. The use of a lower IoU threshold for well pad detection may have influenced the overall results.
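The effect of the IoU (intersection over union) threshold can be seen in a small example. IoU is the area shared by a predicted box and a ground-truth box divided by the area of their union, and a prediction counts as a true positive only when its IoU meets the threshold. The thresholds 0.3 and 0.5 below are hypothetical, since the summary does not give the value used in the study, and `iou` and `is_true_positive` are illustrative names.

```python
Box = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def area(box: Box) -> float:
    """Area of an axis-aligned box, or 0 if it is degenerate."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float) -> bool:
    """A prediction matches the ground truth when IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    pred = (5, 0, 15, 10)  # same size, shifted by half a box width
    print(round(iou(pred, truth), 3))
    print(is_true_positive(pred, truth, 0.3), is_true_positive(pred, truth, 0.5))
```

In this example the two boxes overlap by one third, so the prediction is accepted at the looser threshold and rejected at the stricter one. This is why the choice of threshold can shift reported precision and recall.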