Improving biodiversity protection through artificial intelligence

Environmental Studies and Forestry


D. Silvestro, S. Goria, et al.

Over a million species face extinction, underscoring the need for innovative conservation solutions. This paper presents CAPTAIN, a reinforcement learning framework that outperforms existing methods in prioritizing conservation efforts, protecting more species under budget constraints. It is the collaborative work of Daniele Silvestro, Stefano Goria, Thomas Sterner, and Alexandre Antonelli.
Introduction

The study addresses how to optimize biodiversity protection under limited budgets in a rapidly changing world with increasing anthropogenic pressures and climate change. Despite biodiversity’s intrinsic and utilitarian values, global conservation targets (for example, the Aichi Biodiversity Targets) were not fully met in 2011–2020, and the need for more realistic and effective policies is urgent. Traditional spatial conservation planning tools often optimize static, one-off solutions and do not directly incorporate temporal dynamics, changing threats, or species-specific sensitivities. The research introduces CAPTAIN, a reinforcement learning-based framework designed to optimize dynamic conservation policies informed by static or recurrent biodiversity monitoring, to minimize species loss or meet user-defined objectives while accounting for costs, climate change, and anthropogenic disturbance. The work evaluates trade-offs among policy objectives, the influence of monitoring strategies, and performance relative to established tools, and demonstrates applicability with an empirical case from Madagascar.

Literature Review

Since the 1960s, conservation has evolved from preserving nature for its own sake to integrating human–nature linkages and ecosystem services. Spatial conservation prioritization and systematic conservation planning emerged to identify priority areas considering trade-offs among biodiversity, costs, and other factors. Widely used tools like Marxan use simulated annealing to identify cost-efficient protected area networks meeting representation targets, sometimes incorporating rarity, threat, phylogenetic diversity, or area constraints. However, these approaches generally optimize static, one-time policies, do not inherently model temporal dynamics, and require manual re-runs to explore changes. They also have limited direct integration of climate change, variable anthropogenic pressures, and species-specific sensitivities (though some extensions handle variable threats). Previous AI applications in conservation exist, but reinforcement learning has largely been proposed rather than practically implemented for spatial planning. CAPTAIN addresses this gap by optimizing dynamic policies under changing threats and evolving biodiversity states.

Methodology

The authors developed CAPTAIN (Conservation Area Prioritization Through Artificial Intelligence), a reinforcement learning framework coupled with a spatially explicit, individual-based biodiversity simulation. The simulation models thousands of species and millions of individuals across spatial cells, incorporating natural processes (mortality, replacement, dispersal), anthropogenic processes (habitat modification, selective removal, increasing disturbance), and climate change effects (altered mortality based on species-specific climatic tolerances), enabling range shifts and changes in population abundances.

Policy actions include: (1) monitoring, which extracts features per protection unit (for example, species presence/absence, abundances under full monitoring, rarity, complementarity metrics, local disturbance, costs, climate) at predefined temporal frequencies (initial-only or recurrent); and (2) protecting, which selects protection units (aggregations of adjacent cells) to reduce anthropogenic disturbance to low levels (with possible edge effects), subject to a finite budget and spatially varying protection costs (often proportional to disturbance). Protection does not mitigate climate change within units.

The optimization objective (reward) can be configured to minimize species extinctions (equal value per species), minimize loss of cumulative species value (for example, economic or phylogenetic), or maximize protected area (irrespective of biodiversity). Under budget constraints, the RL agent learns to select units over time to maximize the reward, using monitoring-derived features. Species-specific sensitivities to disturbance are not assumed known; instead, they are inferred indirectly from observed changes in relative abundances.

Policy architecture: a feed-forward neural network maps unit-level features to probabilities of protection actions. The network uses a hidden layer with ReLU activation and a softmax output to produce action probabilities per unit, with parameter sharing across units to reduce dimensionality. Spatial context is encoded via engineered features; convolutional layers are identified as a future enhancement.

Training algorithm: a parallelized evolution-strategies approach optimizes network parameters without explicit gradient computation. Across epochs, policy parameters are perturbed, full episodes (time-forward simulations) are run to obtain rewards, and parameters are updated using a stochastic gradient estimate weighted by an advantage function (the improvement over an exponential moving average of rewards). Monitoring cost is set to zero in the experiments; protection cost depends on unit cost and the remaining budget. Policies can place one protected unit per time step (dynamic) or spend the entire budget in a single step (static, one-time policy).

Experiments included: (i) assessing monitoring strategies (full recurrent monitoring of presence and abundance; citizen science-like recurrent presence/absence with error; full initial-only monitoring) under a species-loss minimization objective; (ii) exploring trade-offs by optimizing for economic value retention or for maximum total protected area; and (iii) benchmarking against Marxan under static and dynamic setups with explicit budget constraints.

An empirical application to Madagascar endemic trees used presence/absence data across 22,394 units, with costs proportional to disturbance; because abundances were unavailable and species' disturbance sensitivities unknown, presence was resampled accounting for unit disturbance and randomly drawn species sensitivities. Static, one-time protection was used with a target of at least 10% of each species' range protected, comparing CAPTAIN and Marxan (with and without a boundary length multiplier).
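The shared-parameter policy network and the evolution-strategies update described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the layer sizes, perturbation scale, learning rate, and population size are all invented for the toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration, not from the paper).
N_UNITS, N_FEATURES, N_HIDDEN = 20, 6, 8

def init_params():
    # One small network, shared across all protection units.
    return {
        "W1": rng.normal(0, 0.1, (N_FEATURES, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.normal(0, 0.1, (N_HIDDEN, 1)),
        "b2": np.zeros(1),
    }

def policy(params, features):
    """Map per-unit monitoring features to protection probabilities.

    features: (N_UNITS, N_FEATURES). The same weights are applied to every
    unit (parameter sharing), and a softmax across units converts per-unit
    logits into a probability of protecting each unit.
    """
    h = np.maximum(features @ params["W1"] + params["b1"], 0.0)  # ReLU hidden layer
    logits = (h @ params["W2"] + params["b2"]).ravel()
    e = np.exp(logits - logits.max())
    return e / e.sum()

def flatten(params):
    return np.concatenate([params[k].ravel() for k in ("W1", "b1", "W2", "b2")])

def unflatten(vec):
    out, i = {}, 0
    for k, shape in (("W1", (N_FEATURES, N_HIDDEN)), ("b1", (N_HIDDEN,)),
                     ("W2", (N_HIDDEN, 1)), ("b2", (1,))):
        n = int(np.prod(shape))
        out[k] = vec[i:i + n].reshape(shape)
        i += n
    return out

def es_epoch(theta, episode_reward, pop=32, sigma=0.05, lr=0.02,
             baseline=0.0, decay=0.9):
    """One evolution-strategies epoch: perturb, evaluate, advantage-weight.

    episode_reward(params) stands in for running a full time-forward
    simulation episode and returning its reward (e.g. negative species loss).
    The advantage is the reward minus an exponential moving average baseline,
    as described in the text; no explicit gradients are computed.
    """
    noise = rng.normal(0, 1, (pop, theta.size))
    rewards = np.array([episode_reward(unflatten(theta + sigma * n)) for n in noise])
    baseline = decay * baseline + (1 - decay) * rewards.mean()
    advantages = rewards - baseline
    theta = theta + lr / (pop * sigma) * noise.T @ advantages
    return theta, baseline
```

In a training run, `es_epoch` would be called once per epoch with a reward function that simulates an episode; the perturbed evaluations are independent, which is what makes the approach straightforward to parallelize.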

Key Findings
  • Monitoring strategies: Full recurrent monitoring protected on average 26% more species than a random policy; citizen science-like recurrent monitoring (presence/absence with typical error) achieved a similar 24.9% improvement; full initial-only monitoring achieved 20% improvement over random. Policies with recurrent monitoring (full or citizen science) outperformed random in 97.2% of simulations, versus 91.2% for full initial-only monitoring.
  • Optimization trade-offs: Optimizing to minimize loss of cumulative economic value prioritized fewer, high-value species, yielding only a 10.9% reduction in species losses relative to random and reducing species richness, phylogenetic diversity, and protected area compared to minimizing species loss. Optimizing to maximize protected area increased protected cells by 27.6% by selecting cheaper units but caused worse biodiversity outcomes, with 13.6% more species losses on average than random and substantially worse than minimizing species loss.
  • Winners and losers: Under policies minimizing species loss, extinct species tended to have small initial ranges, small populations, and low to intermediate disturbance resilience. Survivors either had high resilience with small ranges/populations or low resilience but large ranges and high populations. CAPTAIN learned to protect complementary species assemblages, resulting in higher cumulative species coverage than random or naive richness-based selection; protected units were not restricted to the highest-richness cells.
  • Benchmarking with Marxan (simulations): With static, one-time protection (full initial monitoring), CAPTAIN outperformed Marxan in 64% of 250 simulations, with an average 9.2% relative improvement (lower species loss). Under dynamic protection (full recurrent monitoring, one unit per time step), CAPTAIN outperformed Marxan in 77.2% of simulations, with an average 18.5% reduction in species loss.
  • Empirical case (Madagascar endemic trees): Under a budget allowing protection of up to 10% of units and a target of at least 10% of each species’ range protected, CAPTAIN met the target for all species in 68% of replicates, versus up to 2% for Marxan. Median fraction of species’ range protected was 22% with CAPTAIN versus 14% with Marxan. CAPTAIN produced higher-resolution, more interpretable priority maps. Regular monitoring and dynamic deployment further improve outcomes in simulations.
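The complementarity effect noted above, where covering distinct species assemblages beats simply picking the most species-rich cells, can be illustrated with a toy greedy comparison. The presence/absence matrix below is invented for illustration; this is not the authors' algorithm, only a sketch of why complementarity-based selection covers more species.

```python
import numpy as np

# Toy presence/absence matrix: rows = protection units, columns = species.
# Invented data: units 0-1 are species-rich but redundant; units 2-4 hold
# fewer but distinct species.
P = np.array([
    [1, 1, 1, 1, 0, 0, 0],   # unit 0: 4 species
    [1, 1, 1, 1, 0, 0, 0],   # unit 1: the same 4 species (redundant)
    [0, 0, 0, 0, 1, 1, 0],   # unit 2: 2 species
    [0, 0, 0, 0, 0, 1, 1],   # unit 3: 2 species
    [1, 0, 0, 0, 1, 0, 1],   # unit 4: 3 species
])

def richness_pick(P, k):
    """Naive: take the k units with the highest species richness."""
    return list(np.argsort(P.sum(axis=1))[::-1][:k])

def complementarity_pick(P, k):
    """Greedy: at each step take the unit adding the most uncovered species."""
    chosen, covered = [], np.zeros(P.shape[1], dtype=bool)
    for _ in range(k):
        gains = (P.astype(bool) & ~covered).sum(axis=1)
        gains[chosen] = -1               # never pick a unit twice
        best = int(np.argmax(gains))
        chosen.append(best)
        covered |= P[best].astype(bool)
    return chosen

def n_species_covered(P, units):
    return int(P[units].any(axis=0).sum())
```

With a budget of two units, the richness ranking protects units 0 and 1 and covers only four species, while the greedy complementarity rule picks unit 0 and then unit 2, covering six of the seven toy species.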

Discussion

The findings show that reinforcement learning can optimize conservation actions under budget constraints and dynamic threats more effectively than traditional static planning. Recurrent monitoring, even using presence/absence data with citizen science-like error, markedly improves conservation outcomes and policy reliability by capturing temporal dynamics in disturbance, population changes, and range shifts. Trade-off analyses caution against optimizing for surrogates such as total area protected or economic value when the objective is biodiversity persistence; these objectives can substantially increase species losses compared to policies minimizing extinctions. CAPTAIN’s learned complementarity indicates that prioritizing a diverse set of units across gradients, rather than only the most species-rich, better safeguards more species. Benchmarking demonstrates superior performance over Marxan in both static and dynamic scenarios, and the empirical Madagascar case illustrates practical applicability, higher frequency of meeting species-range targets, and more interpretable prioritization. Overall, integrating AI-driven dynamic planning with regular monitoring offers a robust path to meet biodiversity targets in a changing world.

Conclusion

This study introduces CAPTAIN, a reinforcement learning-based framework that integrates biodiversity monitoring, dynamic threats, and budget constraints to optimize spatial conservation policies. Across simulations and an empirical Madagascar dataset, CAPTAIN consistently outperforms random, naive richness-based, and Marxan-optimized solutions, more reliably meeting species protection targets and minimizing extinctions. The work underscores the importance of recurrent monitoring and of optimizing directly for biodiversity outcomes rather than surrogates like area or economic value. CAPTAIN’s flexible design allows incorporation of diverse biodiversity metrics (for example, phylogenetic or functional diversity) and custom objectives (for example, carbon storage), and is well suited to leverage expanding high-resolution datasets and remote sensing. Future research should extend spatial modeling (for example, convolutional architectures, explicit contiguity constraints), integrate additional socio-ecological constraints, apply transfer learning to empirical time series and projections, and evaluate long-term management effectiveness to further enhance real-world conservation planning.

Limitations
  • Static versus dynamic analyses: Many existing tools and the Madagascar case were analyzed with one-time (static) placement, whereas optimal dynamic deployment over time yields better outcomes; thus, static results may underestimate benefits of recurrent monitoring and dynamic action.
  • Species sensitivity and data limitations: Species-specific sensitivities to disturbance and climate were not known a priori and were inferred indirectly from abundance changes or approximated by random draws in the Madagascar case. Presence/absence data derived from species distribution models may not reflect true occupancy under high disturbance.
  • Approximate disturbance and climate effects: Protection reduces anthropogenic disturbance but does not mitigate climate change within units, so climate-driven losses can persist despite protection. Edge effects are simplified, and restoration dynamics after protection are modeled but may not match real recovery trajectories.
  • Spatial modeling constraints: The neural network shares parameters across units and encodes spatial information via engineered features; spatial contiguity and shape (for example, boundary length penalties) are not explicitly modeled, which may affect reserve design in some contexts.
  • Generalizability and socio-political factors: Results are based on simulations with stochastic variability and a single empirical taxonomic group and region; broader validation across taxa, regions, and socio-economic contexts is needed. Real-world constraints (governance, land tenure, enforcement capacity) are not explicitly modeled but can affect implementability and outcomes.