Environmental Studies and Forestry
Under the karst: detecting hidden subterranean assemblages using eDNA metabarcoding in the caves of Christmas Island, Australia
K. M. West, Z. T. Richards, et al.
Subterranean environments are notoriously underexplored. It has been estimated that more than 80% of Australia’s subterranean fauna have yet to be discovered. Subterranean environments can be found above and below the water table (e.g., caves, cavities, aquifers and anchialine systems) and host a high diversity of largely invertebrate taxa adapted to darkness and variable temperature, nutrients, salinity, dissolved oxygen and, in some cases, stratified waters. Many species are short-range endemics due to fragmentation limiting gene flow. Commonly surveyed aquatic stygofauna include freshwater and saltwater fishes, eels, gastropods, salamanders, flatworms, beetles, water mites and crustaceans (amphipods, decapods, isopods, ostracods, copepods, syncarids). Genetic techniques combined with traditional biospeleological sampling yield insights into stygofauna diversity, lineages, and speciation. Most genetic studies rely on single-source “barcoding” or genome building from captured specimens, but capture-based sampling is difficult in subterranean environments with inaccessible voids and networks. There is a need for complementary, non-invasive tools to gauge stygofauna diversity and distribution. Environmental DNA (eDNA) metabarcoding has rapidly developed for marine, freshwater and terrestrial environments as an efficient, non-invasive, sensitive detection tool, but its application to subterranean habitats is less explored. Preliminary work shows underground water samples can reveal multi-species communities, though often focusing on microbes. Eukaryotic stygofauna detection may be hindered by incomplete reference databases due to limited surveying and barcoding. Nonetheless, eDNA holds promise for detecting and monitoring both described and undescribed eukaryotes, investigating community assemblages across trophic levels and haloclines, stygofauna evolution and population diversity, and underground interconnectivity. 
Christmas Island (CI) in the Indian Ocean is a re-emerged seamount with a developed karst landscape and ~30 accessible caves, spanning plateau/freshwater stream caves, fissure caves, collapsed caves and sea/coastal caves. Rainfall percolates through limestone and discharges at coastal/offshore springs and some major inland springs. Cave fauna form a significant component of the island’s unique ecosystem, with at least 17 endemic cave species documented, yet the diversity and distribution of stygofauna, especially anchialine fauna, remain under-studied. The karst is used for phosphate mining and as a water supply, and interconnectivity among caves is largely unknown, though it is suspected among nearby systems (e.g., Whip and Runaway Caves). Objectives: (1) identify putative new occurrence records and extend distribution records for CI’s subterranean stygofauna using eDNA metabarcoding; (2) assess variation in community composition among cave and spring sites; and (3) investigate potential underground interconnectivity by combining eDNA-derived community composition with environmental (water quality) data, thereby evaluating eDNA metabarcoding as a non-invasive tool for biospeleological assessment.
Prior work indicates subterranean ecosystems have truncated food webs and high endemism, with many taxa adapted to darkness and variable abiotic conditions. Traditional biospeleological surveys on Christmas Island using traps and visual methods have documented fauna but face challenges accessing voids and detecting elusive species, and previous surveys reported 13–54 species across various efforts. Genetic approaches have mostly focused on specimen-based barcoding/genomics, limiting scalability in inaccessible systems. eDNA metabarcoding has proven effective across marine, freshwater and terrestrial systems for taxon detection and richness estimation, and preliminary subterranean applications have largely targeted microbial communities. Incomplete reference databases for subterranean eukaryotes (due to limited voucher collections and formalin-preserved specimens) hinder species-level assignments. Given CI’s karst hydrology and suspected cave interconnections, integrating eDNA with environmental data offers opportunities to infer biotic community structure, salinity-driven assemblage turnover, and potential underground connectivity.
Study area and sampling: In October 2018 (late dry season), six 1 L water replicates and one 50 mL sediment sample were collected at each of 23 sites (caves and springs) across Christmas Island (~110 km²). Sediment was not collected at Jedda and Jane-up Caves due to water-supply restrictions; water was obtained via WaterCorp testing taps. This gave 159 samples in total (138 water replicates and 21 sediment samples). Water was collected in bleach-sterilised Nalgene bottles, stored on ice, and filtered within 2 hours onto 0.2 µm PES membranes using a Pall Sentino pump; filtration equipment was bleach-cleaned between samples, and a daily 1 L bleach filtration served as a filtration control. Filters and sediments were frozen at −20 °C and transported to the TrEnD Laboratory (Curtin University). Environmental parameters (pH, temperature, conductivity, salinity, air saturation, dissolved oxygen) were measured on site using handheld meters.
Laboratory processing: DNA was extracted from half filters and 250 mg of sediment using the DNeasy PowerLyzer PowerSoil Kit, with filtration controls and extraction blanks processed alongside. Three PCR metabarcoding assays were used: mitochondrial 16S Fish (short) targeting bony fishes, 16S Crustacean targeting crustaceans, and nuclear 18S Universal targeting eukaryotes. qPCR used fusion-tagged primers with Illumina adapters and 8 bp indices; each eDNA sample was amplified in duplicate. Amplicons were pooled in equimolar ratios based on qPCR ΔRn values, size-selected (16S: 160–450 bp; 18S: 200–600 bp), purified, quantified, and sequenced on an Illumina MiSeq (V2 300-cycle kit for 16S; 500-cycle paired-end kit for 18S).
Bioinformatics: Reads were demultiplexed (OBITools/insect), quality-filtered and denoised with DADA2 to produce amplicon sequence variants (ASVs). ASVs were queried with BLASTn against GenBank (2019) and a curated WA 16S fish database (via Pawsey). Taxonomy was assigned by a lowest common ancestor approach and curated with environment/biogeography metadata (CI surveys, WoRMS).
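As an illustration of the lowest common ancestor step, the sketch below takes the lineages of the reference sequences matched by one ASV and returns the finest rank on which they all agree. It is a minimal sketch, not the authors' pipeline: the function name, the rank list and the example hits are assumptions made here, and any identity or coverage thresholds applied to BLASTn hits before this step are not described in the text, so none are modelled.

```python
from typing import Dict, Optional, Sequence, Tuple

# Ranks ordered from broadest to finest (an assumed, simplified hierarchy).
RANKS = ("class", "order", "family", "genus", "species")


def lowest_common_ancestor(
    lineages: Sequence[Dict[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return the finest (rank, name) shared by every lineage, or None."""
    shared = None
    for rank in RANKS:
        names = {lineage.get(rank) for lineage in lineages}
        # Stop at the first rank where hits disagree or a name is missing.
        if len(names) != 1 or None in names:
            break
        shared = (rank, names.pop())
    return shared


# Hypothetical reference hits for a single ASV (illustrative only).
hits = [
    {"class": "Actinopterygii", "order": "Anguilliformes",
     "family": "Anguillidae", "genus": "Anguilla",
     "species": "Anguilla bicolor"},
    {"class": "Actinopterygii", "order": "Anguilliformes",
     "family": "Anguillidae", "genus": "Anguilla",
     "species": "Anguilla marmorata"},
]

# The two hits disagree at species level, so the ASV is assigned to genus.
assignment = lowest_common_ancestor(hits)
print(assignment)
```

Assigning at the finest shared rank is one reason a detection may be reported only to genus, family or order when several reference sequences match equally well.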
Putative new occurrence records were evaluated for congeneric barcode completeness. ASVs found in blanks/controls were removed. ASVs with identical Linnaean assignments were merged (phyloseq tax_glom), and read abundances were converted to presence/absence for analyses.
Statistical analyses: Differences between sample types (water vs sediment) and sites were tested with a two-way crossed PERMANOVA on Jaccard similarity matrices computed from presence/absence data, and variation was visualised by principal coordinates ordination (PCO). Species accumulation curves (vegan specaccum) assessed replicate sufficiency, the number of taxa per replicate was compared using ggplot2, and SIMPER identified taxa contributing to dissimilarity between sample types. Replicates were merged per site to assess between-site composition (Jaccard), and a taxa accumulation curve across sites was generated. DistLM (PERMANOVA+) tested environmental/spatial predictors (pH, temperature, salinity, dissolved oxygen, latitude, longitude; conductivity and air saturation were omitted due to collinearity) across all sites, caves only, and springs only; significant predictors were overlaid on PCO plots. SIMPER also identified taxa characterising site type (cave/spring) and salinity categories (freshwater ≤0.49 ppt; oligohaline 0.5–4.9 ppt; mesohaline 5.0–17.9 ppt; polyhaline 18.0–29.9 ppt; euhaline ≥30 ppt). Hierarchical clustering with SIMPROF was applied to biotic (community) and abiotic (environmental, including latitude/longitude) data to infer potential underground interconnectivity.
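To make the presence/absence conversion and the Jaccard measure concrete, the following minimal sketch computes the similarity between two samples. The function names and read counts are invented for illustration and are not values from the study; the analyses themselves were run in dedicated packages (PERMANOVA+, vegan), not with this code.

```python
from typing import Dict, Set


def to_presence(read_counts: Dict[str, int]) -> Set[str]:
    """Reduce read counts to the set of taxa with at least one read."""
    return {taxon for taxon, reads in read_counts.items() if reads > 0}


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Shared taxa divided by all taxa observed in either sample."""
    union = a | b
    if not union:
        # Convention chosen here: two empty samples count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical read counts for one water and one sediment sample.
water = {"Caranx ignobilis": 120, "Artemia franciscana": 40,
         "Geograpsus crinipes": 0}
sediment = {"Geograpsus crinipes": 15, "Artemia franciscana": 3,
            "Penaeus vannamei": 8}

# One shared taxon out of four observed in total.
similarity = jaccard_similarity(to_presence(water), to_presence(sediment))
print(similarity)  # 0.25
```

Because the measure ignores read abundance, a taxon with three reads counts the same as one with thousands. This suits eDNA data, where read numbers are a poor proxy for organism abundance.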
- Sequencing output: 35,698,221 reads across 159 samples. Mean filtered reads per replicate: 16S Fish (short) 74,014 ± 73,211; 16S Crustacean 66,152 ± 115,417; 18S Universal 22,968 ± 16,704. The 16S Crustacean assay showed unbalanced reads at some sites, consistent with low crustacean eDNA template.
- Contaminant/omitted ASVs removed included Phoxinus, Fredericella sultana, Homo sapiens, Gallus gallus, Felis catus, Sus scrofa, Bos taurus, Meleagris gallopavo, bacteria, Gastrotricha, fungi, ciliates, nematodes, microscopic flatworms, plants and cryptophyte algae.
- Sampling sufficiency: Six 1 L water replicates were insufficient to maximise richness; extrapolation suggested that, on average, 18.6 ± 20.4 one-litre water replicates would be required per site. Adding a sediment replicate increased detected diversity.
- Water vs sediment: Community composition differed significantly between water and sediment (PERMANOVA P<0.001). On average, a single water sample detected more taxa than a single sediment sample, although the difference was not statistically significant. Sediment yielded higher detection for Geograpsus crinipes and Penaeus vannamei.
- Diversity detected: 115 identifiable taxa across Chordata, Cnidaria, Porifera, Arthropoda, Mollusca, Annelida, Bryozoa, representing 71 families in 60 orders. Taxonomic resolution: 37.4% species, 20.8% genus, 17.4% family, 22.6% order, 1.7% class. Environmental associations: marine 53.9%, terrestrial 37.4%, freshwater 20.9%, brackish 14.8%. Biogeography: 64.3% circumglobal; 6.1% Indo-West Pacific; 5.2% Indo-Pacific.
- Fish: 13 actinopterygian taxa from 11 families in 8 orders, predominantly marine taxa in anchialine caves (e.g., Caranx ignobilis, Sillago aeolus, Melichthys niger, Oxyporhamphus micropterus). Freshwater detections included Anguilla bicolor bicolor, Eleotris sp., Cyprinidae. Putative new fish records: Cyprinidae, Cottus, Gobio gobio (species-level unverified due to incomplete congeneric barcodes).
- Arthropods: 47 taxa (insects, arachnids, crustaceans, collembolans, millipedes). Land crabs (Tuerkayana magnum, Geograpsus crinipes, Geograpsus grayi, Gecarcoidea lalandii) were detected at cave entrances. Aquatic taxa included Artemia franciscana, Nitokra (copepod), Atyidae (shrimp), ostracods (Darwinula stevensoni, Sclerochilus) and Penaeus vannamei.
- Notable records: Putative new occurrence records total 21 across taxa; detection of Willowsia (Collembola) at five sites; Litopenaeus/Penaeidae at three caves; extended distributions for Anguilla bicolor bicolor and Eleotris (cave-adapted lineage noted).
- Community drivers (DistLM): Across all sites, site type (cave vs spring) explained the highest fitted variance (9%; P=0.002), followed by salinity (6.1%; P=0.041); longitude and dissolved oxygen were also identified (P=0.029 and P=0.052, respectively). Taxa richness per site did not differ between caves and springs (P=0.840).
- Within caves: Community dissimilarity driven by longitude and dissolved oxygen; cumulatively explained 21.7% of fitted variance.
- Within springs: Latitude and dissolved oxygen influenced composition, but not significantly (P=0.103 and P=0.298).
- Salinity associations: Freshwater/oligohaline sites characterised by Darwinulidae (Darwinula stevensoni), land crabs, Eleotris, Naididae; mesohaline by Artemia franciscana, Iophon sp., Caranx ignobilis; polyhaline (Thundercliff Cave) by Callyspongia, Melichthys niger, Agelas schmidti.
- Connectivity inference: Biotic and abiotic clustering indicated high possibility of interconnection for three pairs/groups: Whip Cave–The Grotto; Jones Spring–Waterfall Spring; Lost Lake Cave Sites 1–2. Additional medium-possibility clusters suggested among Jane-up, Jedda, WiFi Cave, Grants Well; CI-079–Hugh Dale Waterfall; Sepulchral Soil Sink–19th Hole.
- Environmental observations: Many cave sites showed tidal influence; detection of marine taxa at some springs near the coast suggests possible ocean connections with haloclines. Freshwater Spring had notably low dissolved oxygen (2.5 mg/L) yet fish taxa were detected.
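The salinity classes used to group sites in the salinity associations above can be written as a small lookup. This is a sketch under one stated assumption: the published bands leave small gaps (for example between 0.49 and 0.5 ppt), and here any reading that falls in such a gap is placed in the lower class. The function and constant names are hypothetical.

```python
from bisect import bisect_right

# Lower bounds (ppt) of the oligohaline, mesohaline, polyhaline and
# euhaline classes; anything below the first bound is freshwater.
LOWER_BOUNDS = (0.5, 5.0, 18.0, 30.0)
LABELS = ("freshwater", "oligohaline", "mesohaline", "polyhaline", "euhaline")


def salinity_class(ppt: float) -> str:
    """Map a salinity reading in ppt to its category label."""
    if ppt < 0:
        raise ValueError("salinity cannot be negative")
    # bisect_right counts how many lower bounds are <= ppt,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(LOWER_BOUNDS, ppt)]


# Illustrative readings, not measurements from the study.
for reading in (0.2, 3.1, 12.0, 25.4, 34.8):
    print(reading, salinity_class(reading))
```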
The study demonstrates that eDNA metabarcoding applied to water and sediment from caves and springs can effectively detect multi-trophic eukaryotic subterranean assemblages, including elusive and troglofaunal taxa. Community composition differed strongly between cave and spring systems and was influenced by salinity gradients, consistent with anchialine and freshwater habitat differences. Within caves, longitudinal position across the island and dissolved oxygen further structured assemblages, indicating spatial turnover and local environmental filtering. Detection of marine taxa in coastal caves and even some springs highlights tidal/oceanic influence and potential haloclines, supporting hydrological connectivity to the sea. The results extend distribution records for key taxa (e.g., Anguilla bicolor bicolor, Eleotris), and suggest putative new occurrence records (e.g., Willowsia, various fishes), while acknowledging that incomplete reference databases limit species-level confirmation. Combining eDNA-derived community data with environmental parameters enabled inference of probable underground interconnections among sites (e.g., Whip Cave–The Grotto), corroborating or suggesting hydrological linkages that are otherwise difficult to assess with conventional methods. eDNA metabarcoding thus complements traditional biospeleological surveys by broadening detection across taxa and locations where access is challenging.
The use of eDNA sampling as a bioassessment tool in caves, where populations are sensitive and access is difficult, reduces impacts relative to traditional specimen-based surveys and enhances safety and logistics. Karst groundwater fluctuations across seasons should be considered when designing temporal surveys. eDNA metabarcoding can amplify target groups without taxonomic expertise, benefiting assessments in systems with many endemics and cave-adapted morphs. In this study, multi-marker eDNA metabarcoding of water and sediment from Christmas Island caves and springs revealed broad eukaryotic subterranean diversity, with community composition varying by site type and salinity, and cave assemblages further influenced by dissolved oxygen and longitudinal gradients. The study updated distribution information for taxa of biodiversity interest and potentially resolved previously unidentified specimens. Integrating biotic and abiotic data identified three site groups with a high likelihood of underground interconnection. Ongoing development of subterranean reference databases will further enable eDNA metabarcoding as a biospeleological survey tool for stygofauna and facilitate future research into food webs and ecosystem functioning in subterranean environments.
- Reference database incompleteness: Many subterranean eukaryotes lack barcodes (and formalin preservation of vouchers hinders sequencing), limiting species-level assignments and verification of putative new occurrence records.
- Sampling design: Six 1 L water replicates per site were insufficient to maximise observed richness; extrapolations suggest substantially more replicates are needed. Sediment was unavailable at two sites (Jedda, Jane-up) due to water-supply restrictions.
- Temporal scope: Sampling occurred during a single late-dry-season campaign (October 2018); seasonal hydrology and water-level fluctuations in karst systems may alter community detections over time.
- eDNA interpretation: Detections may include DNA transported via percolating rainwater or tidal exchange, potentially detecting non-resident taxa; eDNA does not confirm live presence or abundance.
- Technical variation: Unbalanced reads in the 16S Crustacean assay likely reflect variable template concentrations across sites.
- Spatial coverage: Sampling covered 23 sites across the island; taxa accumulation across sites did not plateau, indicating that additional sites would reveal more diversity.
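The replicate and site limitations above both rest on accumulation curves that fail to level off. The sketch below shows the general idea with a simple permutation average: shuffle the sampling units many times and record the mean number of distinct taxa seen after each added unit. It is a simplified stand-in and not vegan's specaccum implementation; the function name, the permutation count and the placeholder taxa are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Set


def accumulation_curve(
    samples: Sequence[Set[str]], permutations: int = 200, seed: int = 0
) -> List[float]:
    """Mean cumulative taxon count after 1, 2, ..., n samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    order = list(samples)
    totals = [0.0] * len(order)
    for _ in range(permutations):
        rng.shuffle(order)
        seen: Set[str] = set()
        for i, taxa in enumerate(order):
            seen |= taxa
            totals[i] += len(seen)
    return [total / permutations for total in totals]


# Placeholder taxa in four hypothetical replicates from one site.
replicates = [{"A", "B"}, {"B", "C"}, {"A"}, {"D"}]
curve = accumulation_curve(replicates)
print(curve)
```

A curve whose final steps still rise, as reported for both replicates and sites in this study, indicates that further sampling would be expected to detect additional taxa.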