Space Sciences

Lunar impact crater identification and age estimation with Chang'E data by deep and transfer learning

C. Yang, H. Zhao, et al.

This study advances lunar impact crater detection and age estimation using Chang'E data. The authors identify more than 109,000 previously unrecognized craters, assign formation-system ages to nearly 19,000 of them, and make the resulting database publicly available.

Introduction

The Moon’s surface is dominated by impact craters spanning five lunar geologic systems (pre-Nectarian, Nectarian, Imbrian, Eratosthenian, Copernican), recording ~4 Ga of Solar System history. Decades of exploration have produced images, DEMs, and samples, enabling manual and automated crater catalogues. However, subjectivity and heterogeneous methods lead to discrepancies among databases. The IAU lists 9,137 recognized lunar craters, and the LPI aggregated formation ages for 1,675 craters (updated from USGS stratigraphy). Craters exhibit large scale variability, complex morphologies, and degradation states that challenge automated detection; deep learning has been applied but training data are biased toward simple craters. Relative ages are traditionally inferred from stratigraphic superposition, morphology, and optical maturity (OMAT), while absolute ages use crater size-frequency distributions (CSFDs) calibrated by returned samples, with recent thermophysical-ejecta methods for young large craters. Given limited labeled data and multi-scale morphology, the authors leverage Chang’E-1/2 (CE-1/CE-2) global products and transfer learning to: (1) detect craters at multiple scales from fused DOM and DEM; and (2) estimate relative formation systems via a dual-channel model combining image morphology and stratigraphic attributes, trained only on IAU-recognized and LPI-dated craters to ensure generalization.

Literature Review

Prior work includes manual/global crater databases (e.g., Head et al. for D ≥ 20 km using LRO/LOLA; Povilaitis et al. extending to 5–20 km; Robbins’ database with >1 million craters at D ≥ 1 km) and automated catalogues (e.g., Salamunićcar et al. via Hough transform; Wang et al. CE-1 global catalogue to D > 0.5 km; Silburt et al. CNN-based detection on LRO DEM). Age chronology methods include stratigraphic superposition and morphology-based relative dating, OMAT for large rayed craters, absolute dating through CSFDs and sample radiometric ages, and a recent regression linking rock abundance of ejecta to ages for young (≤1 Ga), D ≥ 10 km craters. Limitations remain due to scarce samples, complex impact histories, and difficulty detecting degraded/irregular craters, motivating deep and transfer learning approaches with multi-source lunar datasets.

Methodology

Data: CE-1 (120 m DOM, 500 m DEM) and CE-2 (50 m DOM, 7–500 m DEM) products were fused (DOM plus DEM-derived slope and curvature), projected (Mercator at 33°), and tiled with 50% overlap. The study area spans latitudes −65° to 65° and longitudes −180° to 65°, and 65° to 180°. Recognized craters from the IAU (downloaded 2018; 9,137 total; 7,895 within the study area in CE-1; 6,511 in CE-2) and dated craters from the LPI (2015; 1,675 total; 1,491 within the study area; 1,411 resolvable in CE-1 and 502 in CE-2) were used. Forty morphological parameters (e.g., diameter, rim-to-floor depth, volume) and 38 stratigraphic attributes (coverage relations per the USGS 1:5M renovation, 2013) formed 78 attributes per crater.

Crater detection: Two-stage transfer learning (TL) with an R-FCN detector using a ResNet-101 backbone pre-trained on ImageNet. Stage 1: train/fine-tune on CE-1 at two image sizes (5000×5000 and 1000×1000 pixels) to cover diameter ranges of 50–600 km and 20–120 km; in addition, CE-2 images of 1000×1000 pixels target smaller craters of 1–50 km. The 7,895 recognized CE-1 craters were split into 5,682 for training, 1,422 for validation, and 791 for testing. Optimization used SGD (initial LR 0.001, decayed to 0.0001 at iteration 80k of 100k; batch size 128; momentum 0.9; weight decay 0.0005) in Caffe. Stage 2: transductive TL, in which the Stage 1 model is transferred directly to CE-2 without additional training. A detection counts as correct when its IoU with the ground truth is ≥ 0.5; recall is computed as To/(To+Fo). Post-processing removes duplicates by keeping CE-1 detections for D ≥ 20 km and CE-2 detections for D < 20 km.

Age estimation (formation system classification): Two-stage TL with a dual-channel, semi-supervised model. Channel 1: a deep CNN (12 architectures: ResNet50/101/152, SE-ResNet variants, SE-ResNeXt101, PolyNet, Inception-v3, DPN68b, DenseNet201) on 256×256 DOM image chips. Channel 2: a feedforward neural network on the 78 morphological and stratigraphic attributes. Features from both channels are fused for 5-class classification (pre-Nectarian >3.92 Ga; Nectarian 3.92–3.85 Ga; Imbrian 3.85–3.2 Ga; Eratosthenian 3.2–1.1 Ga; Copernican <1.1 Ga). A semi-supervised Mean Teacher strategy incorporates high-confidence (≥0.99) unlabeled, newly detected craters to reduce overfitting. Stage 1: initialize the CNNs from ImageNet; train/fine-tune on CE-1 dated craters with an 8:1:1 stratified split using Adam (LR 0.0003, 10 epochs, batch size 32, weight decay 0.0001) in PyTorch; ensemble the 12 model predictions with weights optimized by a genetic algorithm (the class with the highest weighted sum is selected). Stage 2: transfer the best CE-1 ensemble model to CE-2 without retraining and evaluate it on the 502 dated CE-2 craters. Classification performance is reported as overall accuracy (mean ± s.d. over five independent trials) and confusion matrices.

Availability and runtime: Average detection time is 0.17 s per image; age classification takes ~0.006 s per crater. The identified-crater and model datasets are available via Figshare and GitHub.
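
The detection criterion described in the methodology (IoU ≥ 0.5 against ground truth, recall = To/(To+Fo)) can be illustrated with a minimal Python sketch. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the greedy one-to-one matching are assumptions made for illustration; the summary does not say how detections are paired with ground-truth craters. To is read here as the number of ground-truth craters matched by a detection and Fo as the number missed, which is an interpretation of the summary's notation rather than a confirmed definition.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth: List[Box], detections: List[Box],
           threshold: float = 0.5) -> float:
    """Recall as matched / (matched + missed) over ground-truth boxes.

    Each ground-truth box is greedily paired with the unused detection
    of highest IoU, provided that IoU is at least `threshold`.
    """
    used = set()
    matched = 0
    for gt in ground_truth:
        best_j, best_score = -1, threshold
        for j, det in enumerate(detections):
            if j in used:
                continue
            score = iou(gt, det)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j >= 0:
            used.add(best_j)
            matched += 1
    missed = len(ground_truth) - matched
    return matched / (matched + missed) if ground_truth else 0.0
```

With two ground-truth boxes of which only one overlaps a detection at IoU ≥ 0.5, `recall` returns 0.5. A greedy pairing is the simplest choice; an optimal assignment (for example, the Hungarian algorithm) could give slightly different counts in crowded tiles.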

Key Findings

- Identified 117,240 craters across mid- and low-latitudes (D ≈ 0.9–532 km), nearly 15× more than the recognized craters; 88.14% have D < 10 km. Forty-six large craters (D 200–550 km) were identified.
- New craters not in the recognized sets: 109,956. Ages (formation systems) were assigned to 18,996 newly detected craters with D ≥ 8 km.
- Detection performance: Stage 1 (CE-1) recall 94.71%; Stage 2 (CE-2, no training) recall 93.35%. False positive rates (FPR) from manual assessment: 4.49 ± 0.70% for D = 1–100 km (sample of 10,979); 4.67 ± 2.10% for D = 100–550 km (all 166). By comparison, Silburt et al. (deep learning only) report an FPR of 11 ± 7% and fewer detections, indicating improved reliability and breadth here.
- Cross-database matching: Overall 85.30% agreement with manual databases (Head et al., Povilaitis et al., Robbins) for D ≥ 1 km; strong consistency for D ≥ 50 km; more small and medium craters found than in several automated catalogues. Matching percentages by diameter range are generally stable for the manual sets; automated catalogues show lower densities at small D or truncation at large D.
- Age estimation accuracy: CE-1 overall accuracy (OA) 85.44 ± 1.94% (best 88.97%); CE-2 89.04% correct with zero-shot transfer. Confusions occur mainly between adjacent systems; pre-Nectarian and Copernican are classified with high accuracy in CE-1. An ablation using stratigraphy-only features reduced OA to ~72.21 ± 3.79% (CE-1) and 77.09% (CE-2), showing the benefit of integrating morphology with stratigraphy (about 13 percentage points of OA on CE-1).
- Size-frequency distributions (CSFDs): For each system, TL-derived CSFDs align with or exceed recognized-crater CSFDs at small D and extend up to ~532 km. Regional R-plots show the expected contrast between nearside mare (younger, resurfaced) and farside highlands (older, heavily cratered), consistent with known resurfacing and basalt flooding histories.
- Consistency with external chronologies: OMAT-based categories, CSFD absolute ages, and thermophysical ejecta ages for selected craters broadly agree with the assigned systems (examples listed), supporting the validity of the classification.
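
The size-frequency comparisons above rest on cumulative crater counts. As a minimal sketch, the function below computes a cumulative CSFD, that is, the number of craters with diameter ≥ D per unit area, for a list of diameters. The function name, the diameter thresholds, and the sample values are hypothetical; the summary does not describe the binning or counting areas used for the paper's CSFDs or R-plots.

```python
from typing import Dict, Sequence


def cumulative_csfd(diameters_km: Sequence[float],
                    thresholds_km: Sequence[float],
                    area_km2: float) -> Dict[float, float]:
    """Return N(>=D) per km^2 for each diameter threshold D."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return {
        d: sum(1 for x in diameters_km if x >= d) / area_km2
        for d in thresholds_km
    }


# Hypothetical diameters (km) inside a hypothetical 1000 km^2 counting area.
sample = [1.2, 3.5, 8.0, 12.0, 25.0, 60.0]
csfd = cumulative_csfd(sample, thresholds_km=[1, 10, 50], area_km2=1000.0)
```

Plotting such values against D on log-log axes for each formation system gives the kind of curve that the TL-derived and recognized-crater populations are compared on.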

Discussion

The study addresses two core challenges—comprehensive crater detection across scales and robust age (formation system) estimation with limited labels—by leveraging transfer learning on fused CE-1/CE-2 datasets. The two-stage TL detection recovers most recognized craters and discovers many faint, degraded, and large craters missed by prior automated methods, with low false positives. The dual-channel, semi-supervised age classifier, integrating morphological and stratigraphic features, accurately assigns craters to geologic systems and generalizes across sensors/resolutions via transductive TL. Agreement with established manual databases, OMAT, CSFD-derived ages, and thermophysical estimates indicates that the learned representations capture salient morpho-stratigraphic cues. The expanded dated-crater catalogue enables refined spatial and size-frequency analyses across lunar terrains, revealing distributions consistent with expected resurfacing, Late Heavy Bombardment signatures, and mare basalt evolution. The approach demonstrates scalability and adaptability to planetary datasets with sparse labels and heterogeneous resolutions.

Conclusion

The authors deliver a new lunar crater database containing 117,240 craters (D ≥ 1 km) and formation system ages for 18,996 craters (D ≥ 8 km) across mid- and low-latitudes, substantially expanding existing catalogues while maintaining low false-positive rates. A progressive transfer learning framework—R-FCN detection and a dual-channel, semi-supervised ensemble classifier—achieves high recall and ~89% age classification accuracy, generalizing from CE-1 to CE-2 without additional training. The results align with established crater chronologies and reveal consistent regional and system-specific crater population patterns. Future work includes extending to higher-resolution CE-2 products (20 m and 7 m) to capture smaller craters, improving and expanding stratigraphic datasets to reduce system-specific biases (e.g., Eratosthenian), and adapting the framework to other planetary bodies (Mars, Mercury, Venus, Vesta, Ceres) for rapid, consistent geomorphological mapping.

Limitations

- Under-detection of small/medium craters (D < 50 km) relative to Robbins due to limited spatial resolution (CE-1 120 m; CE-2 50 m) and CNN receptive-field scaling; rectangular window detection may scale poorly to very small diameters.
- Training labels are incomplete and unevenly distributed (recognized and dated craters), potentially biasing learned features and age-class distributions (e.g., possible underestimation in the Eratosthenian due to fewer dated samples).
- Age estimation relies on stratigraphic information from the USGS 1:5M map and the LPI database; uncertainties and incompleteness in these sources propagate to the classification.
- Zero-shot transfer to CE-2 assumes domain similarity; residual domain shifts may affect edge cases (e.g., confusion between adjacent systems).
- The study area excludes high latitudes (|lat| > 65°), limiting global generalization pending polar coverage.
