Tracking lake drainage events and drained lake basin vegetation dynamics across the Arctic

Y. Chen, X. Cheng, et al.

Widespread lake drainage in Arctic permafrost regions is reshaping hydrology, ecosystems, and carbon dynamics. This study finds that thermokarst lakes are particularly vulnerable to drainage events and that their drained basins have high potential for rapid vegetation colonization. Authors: Yating Chen, Xiao Cheng, Aobo Liu, Qingfeng Chen, and Chengxin Wang.

Introduction

The Arctic is warming almost four times faster than the global average, challenging ecosystem stability and the livelihoods of indigenous communities and wildlife. Lakes are integral to Arctic ecosystem functioning, affecting carbon cycling and energy balance across their life cycle of initiation, expansion, drainage and re-initiation. Satellite observations indicate that Arctic lake-rich regions have lost lake area over the past two decades, implying that lake drainage currently exceeds lake initiation/expansion and increasing the prevalence of drained lake basins (DLBs). Lake-to-DLB transitions reduce methane emissions, increase potential for permafrost aggradation and vegetation colonization, and may shift net carbon balance toward sinks. However, the spatio-temporal distribution of specific lake drainage events, their drivers, and post-drainage vegetation dynamics across the circum-Arctic remain insufficiently resolved. Thermokarst lakes, which form in ice-rich permafrost, are considered more susceptible to drainage than non-thermokarst lakes, but both types may contribute to observed drying trends. This study addresses three questions: (1) How many lakes in the northern permafrost region are draining and where/when do these events occur? (2) What are the key environmental drivers of drainage? (3) How does vegetation in DLBs evolve after drainage and what controls greenness levels? The authors analyze drainage events from 1984–2020 and track vegetation dynamics following drainage, distinguishing thermokarst from non-thermokarst contexts and quantifying environmental controls using machine learning.

Literature Review

Prior remote sensing studies of Arctic lake dynamics often focused on limited subregions, constraining generalization due to spatial heterogeneity. Pixel-based analyses at coarse resolution linked regional drying to rising air temperatures and autumn rainfall but did not resolve individual drainage events. Thermokarst processes have been emphasized, with lakes in ice-rich permafrost exhibiting rapid lateral erosion and talik development, making them prone to drainage; vegetation in thermokarst DLBs has been reported as luxuriant relative to surroundings. Yet non-thermokarst lakes remain understudied despite their potential contribution to drying trends. Existing global surface water datasets (e.g., JRC, GLAD) enable detection of water loss, but differences in methods and limited pre-2000 observation density have hindered comprehensive event-level analyses. There is also recognition that permafrost extent, ground ice, active layer dynamics, and climate variability jointly influence lake stability, and that DLB succession can reduce methane emissions and alter regional carbon feedbacks.

Methodology

Study area and data: The northern permafrost zone was analyzed using Landsat archives (TM, ETM+, OLI) from 1984–2020 (complete coverage from 2000 onward). Two 30 m surface water products were used: JRC Global Surface Water (1984–2019) and GLAD Global Surface Water Dynamics (1999–2018). Additional datasets included: thermokarst lake likelihood (reclassified into very likely, likely, unlikely), permafrost extent (continuous, discontinuous, sporadic, isolated), Yedoma domain, ground ice content (high, medium, low), soil carbon and nitrogen, and an ecoregion map. ERA5-Land monthly reanalysis (0.1°) provided climate variables (air and soil temperature, soil moisture, precipitation/rainfall, evaporation, snowfall, snowmelt, wind, solar radiation); annual and summer means and Sen’s slopes were computed.
Lake object delineation and drainage detection: An object-based approach in Google Earth Engine (GEE) delineated lake objects from the JRC water extent mask. Morphological operations (erosion then dilation, 1-pixel radius, 2 iterations) separated connected water bodies. Water loss pixels from JRC ("lost permanent") and GLAD ("water loss") were unioned. For each lake object, the drainage proportion (area of water-loss pixels relative to initial lake area) was computed. A lake drainage event was defined as a lake >1 ha with >50% loss of surface area at any time between 1984–2020. Approximately 2.3×10^5 lake objects >1 ha exhibited some water loss; 35,337 satisfied the event criterion.
Drainage year estimation: Landsat-based LandTrendr temporal segmentation was applied to spectral index time series (including AWEI for water/non-water) to detect disturbance timing. Pixel-level breakpoints were identified, and the event year was assigned as the year with the maximum count of newly drained pixels within each lake. Due to sparse pre-2000 observations (especially in Siberia), temporal frequency analyses focused on 2001–2020.
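The event criterion and year assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' GEE implementation: the `Lake` record, its field names, the tie-breaking rule, and the sample values are all hypothetical. Only the thresholds (>1 ha, >50% loss) and the "year with the most newly drained pixels" rule come from the text. A 30 m Landsat pixel covers 0.09 ha.

```python
from collections import Counter
from dataclasses import dataclass

PIXEL_AREA_HA = 0.09  # one 30 m x 30 m Landsat pixel = 900 m^2 = 0.09 ha


@dataclass
class Lake:
    """Hypothetical per-lake record; field names are illustrative only."""
    initial_pixels: int    # pixels in the delineated lake object
    loss_years: list[int]  # breakpoint year of each water-loss pixel


def drainage_proportion(lake: Lake) -> float:
    """Area of water-loss pixels relative to the initial lake area."""
    return len(lake.loss_years) / lake.initial_pixels


def is_drainage_event(lake: Lake, min_area_ha: float = 1.0,
                      min_loss: float = 0.5) -> bool:
    """Apply the stated criterion: lake > 1 ha with > 50% surface-area loss."""
    area_ha = lake.initial_pixels * PIXEL_AREA_HA
    return area_ha > min_area_ha and drainage_proportion(lake) > min_loss


def drainage_year(lake: Lake) -> int:
    """Year with the maximum count of newly drained pixels.

    Ties are broken by taking the earliest year (an assumption; the
    source does not say how ties are handled).
    """
    counts = Counter(lake.loss_years)
    top = max(counts.values())
    return min(year for year, n in counts.items() if n == top)


# Hypothetical lake: 200 pixels (18 ha), 130 of which lost water.
example = Lake(initial_pixels=200,
               loss_years=[2006] * 20 + [2007] * 95 + [2008] * 15)
print(drainage_proportion(example))  # 0.65
print(is_drainage_event(example))    # True
print(drainage_year(example))        # 2007
```

Counting pixels rather than summing areas keeps the proportion exact for a fixed pixel size; a real workflow would operate on raster masks rather than Python lists.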
Vegetation dynamics: Landsat preprocessing (via EE-LCB toolkit) included filtering (cloud cover <50%), a June–September seasonal window, cloud/snow/shadow masking (pixel_qa), sensor harmonization, and calculation of indices prior to compositing: NDVI, Tasseled Cap Greenness (TCG), and AWEI. Annual 90th-percentile composites were generated to form robust NDVI and TCG time series. For each drained lake, annual median NDVI was extracted within the DLB after masking water (values <0 removed). Surrounding vegetation greenness was computed from a circular buffer (radius = 2× lake diameter), excluding lakes and other water. Time series were transformed to years since drainage (0–15) based on each lake’s detected drainage year. TCG was similarly extracted to corroborate NDVI patterns.
Modeling environmental controls: A CatBoost binary classifier predicted drainage occurrence using drained lakes (≈28,000 events in 2001–2020) and a 1% random sample of undrained lakes (≈58,000), with a 70/30 train/test split and hyperparameters tuned by random search with 10-fold CV. Explanatory variables included climate (levels and trends), topography (elevation), permafrost attributes, thermokarst likelihood, ground ice, an active layer depth proxy, solar radiation, and others (see feature selection). Collinear variables (Pearson r>0.5) and negligible contributors were removed iteratively using permutation importance and SHAP (Shapley) diagnostics. A CatBoost regression model (target NDVI) was trained on annual DLB NDVI (0–15 years post-drainage) with corresponding climate/environmental variables, using a 70/30 split, RMSE loss, and similar feature selection and CV procedures.
Validation and uncertainty: Visual validation via TimeSync on a 10% random sample was reported separately for small and medium-large lakes. Spatial detection accuracy was ≈82.6% (small) and 96.1% (medium-large). Temporal accuracy of drainage year was ≈63.8% (small) and 89.2% (medium-large). Known error sources included spectral anomalies causing LandTrendr misassignment (edge-year bias in 2001 and 2020), lake object merging in dense lake areas (underdetection), and coarse climate resolution (ERA5-Land) adding uncertainty in lake-level driver attribution. NDVI water effects were mitigated by 90th percentile compositing and water masking; TCG comparisons supported robustness.
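The vegetation workflow described above (90th-percentile annual composite, re-indexing by years since drainage, comparison with the surrounding buffer) can be illustrated with plain Python. This is a sketch under stated assumptions, not the authors' Earth Engine code: the function names, the nearest-rank percentile rule, and all sample NDVI values are hypothetical.

```python
import math


def percentile_composite(values, pct=90):
    """Nearest-rank percentile of one season's NDVI observations.

    The nearest-rank rule is an assumption; other percentile
    implementations may interpolate between observations.
    """
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def align_to_drainage(annual_ndvi, drainage_year, max_years=15):
    """Re-index {calendar year: NDVI} to {years since drainage: NDVI}."""
    return {
        year - drainage_year: ndvi
        for year, ndvi in sorted(annual_ndvi.items())
        if 0 <= year - drainage_year <= max_years
    }


def years_to_reach(dlb_series, surrounding_series):
    """First year since drainage at which basin NDVI >= surrounding NDVI.

    Returns None if the basin never catches up within the series.
    """
    for t in sorted(dlb_series):
        if t in surrounding_series and dlb_series[t] >= surrounding_series[t]:
            return t
    return None


# Hypothetical June-September NDVI observations for one pixel in one year.
season = [0.21, 0.35, 0.48, 0.52, 0.55, 0.57, 0.58, 0.60, 0.61, 0.12]
print(percentile_composite(season))  # 0.6

# Hypothetical annual medians for a basin drained in 2005 and its buffer.
basin = {2004: 0.05, 2005: 0.20, 2006: 0.45, 2007: 0.62, 2008: 0.66}
buffer_ring = {2005: 0.60, 2006: 0.60, 2007: 0.61, 2008: 0.61}
print(years_to_reach(align_to_drainage(basin, 2005),
                     align_to_drainage(buffer_ring, 2005)))  # 2
```

Taking a high percentile rather than the mean favors peak-season, cloud-free observations, which is why it also reduces the influence of low NDVI values from residual water.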

Key Findings
  • Detected 35,337 lake drainage events across the northern permafrost zone (1984–2020); about half occurred in thermokarst landscapes and half in non-thermokarst contexts.
  • Size distribution: small (1–10 ha) 83.5% of events, medium (10–100 ha) 15.1%, large (>100 ha) 1.4%. Despite their rarity, medium and large events accounted for 40.9% and 35.6% of total drained area, respectively (small: 23.5%). The largest event area was ~6000 ha.
  • Spatial clustering: high concentrations in coastal lowlands, river deltas, and previously underreported clusters (e.g., St. Lawrence Island: 655 drained lakes over 4640 km²; spatial density ~80× regional average). Lowland areas (0–150 m a.s.l.) cover ~29.6% of the permafrost zone but host ~57.1% of drained lakes.
  • Lake-wise density (drained/total lakes): overall 0.61% (small 0.64%, medium 0.50%, large 0.38%). Discontinuous permafrost exhibits the highest lake-wise density (>2× average; small 1.33%, medium 1.28%, large 0.78%). In the Yedoma and very likely thermokarst regions, lake-wise density exceeds averages for all sizes, with disproportionately more large drained lakes.
  • Temporal trends (2001–2020): mean 1424 drained lakes per year (range 767–2073), with a slight upward trend (slope 23 yr⁻¹; p=0.08). Significant increases in discontinuous permafrost zones and very likely thermokarst lakes (slopes 21 and 14 yr⁻¹; p<0.001), implying +420 and +280 drained lakes over 20 years, respectively.
  • Environmental drivers of drainage (CatBoost classifier): strong performance (AUC 0.92; precision 0.84; recall 0.72; AP 0.88). Key predictors: slope of annual air temperature, mean annual air temperature (higher values increase drainage risk), elevation (higher elevations reduce risk), and active layer depth (greater depth increases risk). Lowlands and warming trends promote drainage via thermal erosion, talik development, and enhanced hydrologic connectivity.
  • Post-drainage vegetation dynamics: Thermokarst DLBs green faster and more than non-thermokarst DLBs. By year 10 post-drainage, median NDVI is ~0.72 in thermokarst DLBs vs ~0.42 in non-thermokarst DLBs. Relative to surroundings, NDVI in very likely thermokarst DLBs is higher by ~0.06 (~10%), whereas unlikely thermokarst DLBs are lower by ~0.09 (~15%). Very likely thermokarst DLBs reach surrounding NDVI in ~2 years; likely thermokarst in ~6 years; unlikely thermokarst remain below surrounding levels after 15 years.
  • Greenness controls within DLBs: larger DLBs and higher drainage area ratios show higher NDVI than surroundings; discontinuous permafrost regions show higher DLB greenness; floodplain status is associated with lower NDVI. Regional medians in year 10: Alaska 0.73, Russia 0.69 (both ~0.02–0.03 higher than surroundings), Canada 0.47 (~0.07 lower), influenced by soil/landscape conditions and size distribution.
  • NDVI prediction model (CatBoost regression): R² 0.83, RMSE 0.08, MAE 0.06. Most important factors: ecoregion and years since drainage; climatic drivers include summer air temperature and annual air temperature trend (generally positive effects). Floodplain status and non-thermokarst classification exert negative influences on NDVI. DLBs act as greening hotspots with NDVI growth rates 1–2 orders of magnitude higher than the average Arctic greening trend in early years post-drainage.
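
Two of the figures in the list above follow from simple arithmetic, which the short Python check below reproduces using only numbers quoted there. The variable names are illustrative, and the area-to-count ratio is a derived quantity shown for illustration, not a statistic reported by the authors.

```python
# Linear trend slopes quoted for 2001-2020 (drained lakes per year, per year).
slopes = {"discontinuous permafrost": 21, "very likely thermokarst": 14}
YEARS = 20

for label, slope in slopes.items():
    # A constant slope sustained over the period implies an increase of
    # slope * years, matching the +420 and +280 quoted in the list.
    print(f"{label}: +{slope * YEARS}")

# Share of events (by count) and of total drained area, in percent.
count_share = {"small": 83.5, "medium": 15.1, "large": 1.4}
area_share = {"small": 23.5, "medium": 40.9, "large": 35.6}

# Ratio > 1 means the class contributes more area than its count suggests.
ratio = {k: round(area_share[k] / count_share[k], 1) for k in count_share}
print(ratio)  # {'small': 0.3, 'medium': 2.7, 'large': 25.4}
```

The ratios make the "despite their rarity" point concrete: large events contribute roughly 25 times more drained area than their share of the event count would suggest.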

Discussion

The study resolves the spatio-temporal distribution of individual lake drainage events across the northern permafrost zone, quantifies their drivers, and tracks post-drainage vegetation trajectories. Results show that both thermokarst and non-thermokarst lakes substantially contribute to drainage totals, with small lakes being most susceptible but larger events contributing disproportionately to area loss. Warming (higher mean temperatures and positive trends), deeper active layers, and low elevation amplify drainage risk, particularly in discontinuous permafrost where hydrological connectivity to groundwater facilitates internal drainage. These findings explain observed regional drying and highlight increasing vulnerability under continued Arctic warming. The vegetation analysis demonstrates rapid colonization and sustained greening in thermokarst DLBs relative to surroundings, while non-thermokarst DLBs often lag, emphasizing the importance of lake origin and landscape context for succession and carbon dynamics. Machine learning attribution underscores the roles of regional species pools (ecoregions), years since disturbance, and temperature regimes in governing DLB greenness. The implications are multifaceted: more frequent drainage will alter hydrology, reduce lake methane fluxes, and create greening hotspots that can sequester carbon and promote permafrost aggradation, yet also pose risks such as outburst floods, infrastructure impacts, and habitat changes. The comprehensive event database and derived relationships provide a foundation for improving permafrost-hydrology representations in Earth system models and for identifying areas at heightened hazard and ecological change.

Conclusion

By integrating multi-decadal Landsat observations with global surface water products and permafrost landscape data in an object-based framework, the study detects 35,337 Arctic lake drainage events, dates their occurrence, and characterizes subsequent vegetation dynamics. Smaller lakes, thermokarst settings, low elevations, and discontinuous permafrost zones show elevated drainage susceptibility, driven chiefly by warming temperatures and deeper seasonal thaw. Following drainage, thermokarst DLBs rapidly green and often surpass surrounding vegetation, while non-thermokarst DLBs remain less green for longer. These findings identify DLBs as widespread greening hotspots with implications for hydrology, carbon cycling, and ecosystem composition. The released dataset and models can inform hazard assessment (e.g., outburst flooding), water resource management, and permafrost-carbon feedback projections. Future work should: update circumpolar ground ice datasets; improve fine-scale climate and hydrology drivers; expand validation for false negatives; examine mega-lake and clustered drainage triggers; and further investigate non-thermokarst lake processes and long-term successional trajectories beyond 15 years post-drainage.

Limitations
  • Limited pre-2000 Landsat coverage (especially Siberia) constrained event timing; temporal trend analyses centered on 2001–2020.
  • Object delineation may merge adjacent lakes in dense lowland regions, underestimating drainage ratios and underdetecting events.
  • LandTrendr can misassign disturbance years (edge-year bias in 2001/2020; spectral anomalies), yielding lower temporal accuracy for small lakes.
  • ERA5-Land climate data are coarse (0.1°), introducing uncertainty in lake-scale driver attribution despite grid-level differentiation.
  • Ground ice content dataset is outdated (>20 years), potentially biasing relationships with drainage susceptibility.
  • No comprehensive ground-truth dataset exists across the Arctic, so false negatives are difficult to quantify. Thermokarst likelihood is taken from a map rather than determined from each drained lake's actual origin. NDVI can be influenced by residual water despite mitigation steps.