DeepDive: estimating global biodiversity patterns through time using deep learning

R. B. Cooper, J. T. Flannery-Sutherland, et al.

DeepDive, developed by Rebecca B. Cooper, Joseph T. Flannery-Sutherland, and Daniele Silvestro, uses deep learning to estimate global biodiversity changes through time. The approach addresses the incompleteness and biases of the fossil record, yielding palaeodiversity estimates and new insights into mass extinctions and diversity fluctuations.

Introduction

The study addresses how to estimate biodiversity trajectories through deep time despite the incompleteness and biases of the fossil record. The fossil record is essential for inferring extinction, recovery, expansion, and turnover, and for testing hypotheses about global limits to biodiversity and the drivers of biodiversity change. Yet it is affected by temporal, spatial, and taxonomic heterogeneity arising from preservation differences, accessibility of sites, and sampling effort, causing a mismatch between true and observed diversity. Existing methods largely adjust for temporal variation in sampling intensity but do not adequately account for variation in geographic scope, temporal duration, environmental representation, or taxonomic biases, which can strongly distort global estimates. The research question is whether a simulation-informed deep learning framework can accurately infer global and regional biodiversity through time while explicitly incorporating spatial, temporal, and taxonomic sampling biases.

Literature Review

Prior approaches include rarefaction and coverage-based methods, maximum likelihood and Bayesian models grounded in Poisson sampling processes, and lower-bound richness extrapolators. Shareholder Quorum Subsampling (SQS) is widely used to standardize for sampling intensity, but it estimates relative rather than absolute diversity and is not designed to correct for changing spatial scope or taxonomic biases. Analyses show that spatial sampling heterogeneity can account for 50–60% of changes in standardized richness estimates in the shallow marine record, underscoring the need for spatially explicit methods. Recent work has moved toward spatially explicit diversity inference and theoretical models that simulate plausible biodiversity patterns, yet these do not directly solve global diversity estimation and typically neglect taxonomic sampling biases. Some models accommodate lineage-specific preservation rates but do not explicitly propagate their effects on observed lineages.

Methodology

DeepDive comprises two modules: (1) a biodiversity simulator and (2) a deep learning inference model. The simulator generates global and regional diversity trajectories via stochastic birth–death processes and assigns species to discrete biogeographic regions, then degrades the complete record into a biased fossil record by imposing spatial, temporal, and taxonomic heterogeneity in fossilization and sampling. Biases include region- and time-specific variation in locality numbers, gaps in the record, and heterogeneous preservation rates, with options for temporal trends or piecewise shifts and for strong spatial biases. Simulations can include mass extinction/speciation events and diversity-dependent dynamics and can be customized with empirical constraints (e.g., known biogeographic connectivity changes, timing of clade origins, mass extinction timings).

The simulated fossil record is summarized as features per time bin (e.g., counts of sampled taxa, occurrences, localities, singletons, endemics, range-through diversity, and per-region counts). The inference module is a recurrent neural network with bidirectional LSTM units plus optional fully connected layers, trained to map fossil features to absolute diversity through time (log-transformed output). Training uses mean squared error loss and Adam optimization, with early stopping on validation MSE. Multiple architectures were tested; performance was consistent (validation MSE ≈ 0.114–0.132; test MSE ≈ 0.197–0.229). Uncertainty is quantified via Monte Carlo dropout, aggregating predictions across dropout passes and trained models to compute 95% confidence intervals. Performance metrics include MSE, rescaled MSE on [0,1] diversity (rMSE), and R² on untransformed diversity. Comparisons were made to SQS (relative diversity) using rMSE and R².

Additional test datasets introduced strong temporal, taxonomic, and spatial biases and scenarios rare in initial training (e.g., multiple mass extinctions/speciations, diversity dependence followed by mass extinction). Retraining with these scenarios improved performance around abrupt changes.

Two empirical applications were run with customized simulations reflecting binning and regionalization: (i) Late Permian–Early Jurassic marine genera across five regions, with the known Permian–Triassic (PTME) and Triassic–Jurassic (TJME) mass extinctions, and (ii) Cenozoic Proboscidea species with constrained continental colonization timings and a minimum number of extant species. Models were trained on up to 150,000 simulations with held-out test sets (1,000 simulations) for each empirical case.
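
The simulate-then-degrade idea described above can be illustrated with a small standard-library Python sketch. It is not the DeepDive simulator: the discrete-time birth–death step, the single region per species, the rate values, and all function names (`simulate_ranges`, `degrade`, `features`) are assumptions made for illustration, and only three of the listed features are computed.

```python
import random

def simulate_ranges(n_bins, birth, death, n_start, rng):
    """Discrete-time birth-death process; returns [first_bin, last_bin] per species."""
    ranges = [[0, 0] for _ in range(n_start)]
    alive = list(range(n_start))
    for t in range(1, n_bins):
        next_alive = []
        for sp in alive:
            if rng.random() < birth:       # speciation: daughter first appears in bin t
                ranges.append([t, t])
                next_alive.append(len(ranges) - 1)
            if rng.random() >= death:      # survival: parent persists into bin t
                ranges[sp][1] = t
                next_alive.append(sp)
        alive = next_alive
    return ranges

def true_diversity(ranges, n_bins):
    """Number of species whose true range includes each bin."""
    return [sum(first <= t <= last for first, last in ranges) for t in range(n_bins)]

def degrade(ranges, n_bins, n_regions, rng):
    """Sample a biased record; returns {species: ascending list of bins with a fossil}."""
    # Region- and time-specific sampling rates; a rate of 0.0 creates a gap.
    region_rate = [[rng.choice([0.0, 0.1, 0.3, 0.6]) for _ in range(n_bins)]
                   for _ in range(n_regions)]
    record = {}
    for sp, (first, last) in enumerate(ranges):
        region = rng.randrange(n_regions)   # simplification: one region per species
        taxon_rate = rng.uniform(0.2, 1.0)  # taxonomic preservation bias
        bins = [t for t in range(first, last + 1)
                if rng.random() < region_rate[region][t] * taxon_rate]
        if bins:
            record[sp] = bins
    return record

def features(record, n_bins):
    """Per-bin summary features computed from the sampled record only."""
    rows = []
    for t in range(n_bins):
        rows.append({
            "sampled": sum(t in bins for bins in record.values()),
            "singletons": sum(bins == [t] for bins in record.values()),
            "range_through": sum(bins[0] <= t <= bins[-1] for bins in record.values()),
        })
    return rows

rng = random.Random(42)
N_BINS = 20
ranges = simulate_ranges(N_BINS, birth=0.25, death=0.2, n_start=30, rng=rng)
truth = true_diversity(ranges, N_BINS)
record = degrade(ranges, N_BINS, n_regions=3, rng=rng)
feats = features(record, N_BINS)
for t in range(N_BINS):
    print(t, truth[t], feats[t])
```

In this toy setting the feature rows play the role of the network input and `truth` the training target; each simulated pair would be one training example.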

Key Findings

- Across simulations, DeepDive substantially outperformed SQS: median R² ≈ 0.958 vs 0.432 and over an order of magnitude lower median MSE on test sets. Predictions were robust to gaps and captured both gradual and abrupt diversity changes better than SQS.
- Performance remained higher than SQS under low completeness and preservation rates; strong spatial and taxonomic biases widened the performance gap in favor of DeepDive. Under strong spatial bias, median R² for DeepDive exceeded 0.79. Retraining to include mass extinction/speciation and diversity dependence further improved estimates near abrupt shifts.
- Model architectures yielded similar accuracy (validation MSE ~0.114–0.132; test MSE ~0.197–0.229), indicating stability across parameterizations. Monte Carlo dropout CIs failed to enclose the true simulated values in a non-negligible fraction of bins.
- Empirical marine genera (Late Permian–Early Jurassic): estimated up to 58% genus loss across the PTME (mean ≈24% loss), recovery in the Early Triassic surpassing pre-PTME levels by the Middle Triassic; later decline through the Late Triassic and sharper losses around the Triassic–Jurassic boundary with up to 66% genus loss (total 42% loss across the boundary).
- Empirical Proboscidea (Cenozoic species): gradual Paleogene increase to an estimated 100–200 species by the early Miocene; Early Miocene step increase yielding roughly 35–78 contemporaneous species in the Middle–Late Miocene; high, variable diversity through the Pliocene and early Pleistocene followed by a Pleistocene crash to 10–27 species, implying on average ~65% (up to 87%) loss within the Pleistocene and more than 70% reduction since ~2.5 Ma (mean ~85% since the Pleistocene).
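
The comparison above relies on the metrics named in the Methodology (MSE, rescaled MSE, and R²). A minimal Python sketch of how such metrics could be computed follows. The exact definitions are assumptions here: min-max rescaling is used for rMSE, the coefficient of determination for R², and the trajectories are made-up numbers, not results from the study.

```python
import math

def mse(true, pred):
    """Mean squared error between two equal-length trajectories."""
    return sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)

def log_mse(true, pred):
    """MSE on log-transformed diversity (values must be positive)."""
    return mse([math.log(t) for t in true], [math.log(p) for p in pred])

def rescale01(xs):
    """Min-max rescale to [0, 1]; a flat trajectory maps to all zeros."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]

def rescaled_mse(true, pred):
    """rMSE: compares the shape of two trajectories, ignoring absolute scale."""
    return mse(rescale01(true), rescale01(pred))

def r_squared(true, pred):
    """Coefficient of determination on untransformed diversity."""
    mean_t = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

# Hypothetical trajectories: `doubled` has the right shape but the wrong scale.
true_div = [100, 120, 150, 90, 60, 80]
doubled = [2 * x for x in true_div]

print("rMSE:", rescaled_mse(true_div, doubled))
print("R2:", r_squared(true_div, doubled))
print("log MSE:", log_mse(true_div, doubled))
```

The example shows why a rescaled metric is needed when one method returns relative diversity: a trajectory with the correct shape but the wrong absolute scale scores perfectly on rMSE while scoring poorly on R² and log MSE.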

Discussion

The results demonstrate that combining mechanistic simulations with supervised deep learning can recover absolute diversity trajectories from biased fossil data more accurately than commonly used standardization methods such as SQS, particularly under spatial heterogeneity. By explicitly encoding spatial, temporal, and taxonomic sampling variation in training data, the RNN learns how fossil features relate to true diversity across diverse scenarios, capturing both smooth trends and abrupt mass extinction signals. Empirical applications revise the magnitudes and timing of diversity changes across the PTME and TJME and suggest higher-than-observed species-level diversity in Proboscidea, consistent with elevated fossil record completeness but still indicating substantial undersampling. The framework is modular and extensible, allowing inclusion of prior geological and biogeographic knowledge and alternative predictive models. However, predictive accuracy depends on the training set spanning the scenarios present in the empirical data; model performance decreases for dynamics absent or rare in training. Care is needed when interpreting estimates around mass extinctions due to potential errors at abrupt changes. Overall, the approach provides a powerful path to reassessing macroevolutionary dynamics and diversity limits at regional to global scales.

Conclusion

DeepDive introduces a simulation-informed deep learning framework that infers absolute biodiversity through time from fossil occurrence data while accounting for spatial, temporal, and taxonomic biases. It consistently outperforms SQS across simulations, remains robust under severe sampling heterogeneity, and produces biologically coherent empirical estimates for marine genera around the PTME/TJME and for Proboscidea across the Cenozoic, including a pronounced Pleistocene diversity crash. The method opens avenues to reevaluate major transitions in the history of life. Future work can expand the generative models (e.g., explicit biotic interactions, individual-based and spatially explicit population processes), integrate alternative time-series predictive models, refine uncertainty quantification, and systematically test the effects of differing prior assumptions and bias structures across clades and time intervals.

Limitations

- Ground truth diversity is unknowable for real data; validation relies on simulations, limiting external verification.
- Predictive accuracy decreases when empirical scenarios (e.g., particular mass extinctions or unique spatial biases) are absent or rare in training simulations, risking erroneous predictions.
- Monte Carlo dropout intervals did not contain true simulated values in a non-trivial number of time bins, indicating underestimation of predictive uncertainty in some cases.
- Simulations simplify biogeography to discrete regions and do not explicitly model biotic interactions; such simplifications may miss dynamics relevant to real systems.
- Strong abrupt events can increase error; careful interpretation is required around mass extinction intervals.
- Results can be sensitive to prior choices in simulation parameterization; comprehensive coverage is computationally demanding.
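
The interval limitation can be checked on simulated data by measuring empirical coverage: the fraction of time bins whose true value falls inside the predicted interval. The Python sketch below is illustrative only. The stochastic passes are Gaussian draws standing in for Monte Carlo dropout passes of a trained network, and the percentile-based interval, the bias and spread values, and all function names are assumptions.

```python
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of a pre-sorted list, with q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def interval_per_bin(passes, level=0.95):
    """Per-bin (lower, upper) bounds across stochastic prediction passes."""
    tail = (1 - level) / 2
    intervals = []
    for t in range(len(passes[0])):
        vals = sorted(p[t] for p in passes)
        intervals.append((percentile(vals, tail), percentile(vals, 1 - tail)))
    return intervals

def coverage(truth, intervals):
    """Fraction of bins whose true value lies inside its interval."""
    inside = sum(lo <= y <= hi for y, (lo, hi) in zip(truth, intervals))
    return inside / len(truth)

rng = random.Random(0)
# Hypothetical log-diversity trajectory over 30 time bins.
truth = [4.0 + 0.1 * t for t in range(30)]
# Stand-in for dropout passes: predictions are biased (+0.15) and too confident (sd 0.05).
passes = [[y + 0.15 + rng.gauss(0, 0.05) for y in truth] for _ in range(200)]

intervals = interval_per_bin(passes)
cov = coverage(truth, intervals)
print("empirical coverage of nominal 95% intervals:", cov)
```

A coverage well below the nominal 95% level, as in this deliberately overconfident example, is the pattern the limitation describes: the intervals are narrower than the actual prediction error.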
