Inadvertent human genomic bycatch and intentional capture raise beneficial applications and ethical concerns with environmental DNA
L. Whitmore, M. McCauley, et al.
Environmental DNA (eDNA) enables non-invasive, cost-effective monitoring across ecosystems and is increasingly used for biodiversity assessment and pathogen surveillance, including successful wastewater monitoring for SARS-CoV-2 and other pathogens. Historically, eDNA studies have relied on targeted qPCR and metabarcoding, but advances in sequencing and bioinformatics now make shotgun metagenomics feasible, capturing a wider spectrum of genetic diversity. The authors hypothesize that an unintended consequence of shotgun eDNA approaches is the recovery of human genomic sequences—human genetic bycatch (HGB)—with potential to include identifiable and phenotypically informative variants. This raises ethical and legal issues around consent, privacy, data ownership and regulatory oversight, especially as human DNA is not typically the target of eDNA studies. The study investigates whether human genomic DNA can be recovered from wildlife-focused eDNA datasets, quantifies human eDNA in environmental samples from sites with varying human presence (water, sand footprints, room air), and evaluates intentional recovery and enrichment approaches to reconstruct informative human haplotypes and variants from environmental substrates.
The paper situates its work within rapid growth of eDNA applications for biodiversity monitoring, invasive species detection, and pathogen surveillance from diverse media (air, soil, sediments, water, permafrost, snow, ice cores). Traditional targeted methods (qPCR, metabarcoding) provide limited, biased snapshots, whereas untargeted shotgun sequencing offers broader, less biased recovery across taxa and is becoming cost-effective. Wastewater genomic surveillance developed rapidly during COVID-19 and has been applied to other pathogens (monkeypox, polio, tuberculosis). The authors note that as shotgun sequencing becomes widespread, significant volumes of human genomic data may be captured inadvertently. Ethical frameworks for human genomic data (informed consent, privacy, data ownership) exist but may be insufficient in the eDNA context, where sampling is often non-invasive and does not target humans. The literature also highlights contamination issues in sequencing data and emerging individual-level eDNA applications in wildlife, underscoring the plausibility and implications of human identifiability from eDNA. The paper also references discussions in conservation regarding inadvertent capture of non-genetic human data (images, audio) and proposals like codes of conduct and data filtering, drawing parallels to HGB challenges.
Study design comprised three components: (1) retrospective analysis of wildlife/pathogen-focused eDNA shotgun datasets for human-aligning reads; (2) intentional collection and qPCR quantification of human eDNA from sites with low/high human presence across substrates (water, sand footprints, room air) in Florida (USA) and Ireland; and (3) sequencing (long-read nanopore shotgun and short-read Illumina with human exome enrichment) to assess genomic content, structural variants, and mitochondrial haplogroups.
- Sample collection: HGB samples included seawater (rehabilitation tank and oceanic) and beach sand collected 2017–2021 in Florida for sea turtle/pathogen studies. Intentional human-focused samples (May–July 2022) included river, estuarine, and seawater samples in Florida and Ireland, beach sand from human footprints versus a restricted-access site with no human presence, and room air from a clinical setting. Negative field controls (NFCs) were collected and processed alongside study samples for water, sand, and air (filters left in-room without vacuum, and air from rooms with no humans in the previous 24 h). Sampling avoided direct contact with substrates; strict contamination controls included PPE, bleach/ethanol decontamination, UV treatment, and processing in laboratories not used for human samples. Human-related sampling had IRB approval (IRB202201336) with informed consent.
- Filtration and extraction: Water and air were filtered through 0.22 µm Sterivex-GP units; sand was washed with TE buffer and the supernatant was filtered in the same way. Samples were lysed with Qiagen DNeasy Blood and Tissue Kit buffers (ATL + Proteinase K), with overnight incubation at 56 °C for water and sand and 1 h for air. DNA was purified following a modified DNeasy protocol and eluted in AE buffer.
- qPCR assays: Human-specific TaqMan assays targeted LILRB2 (Hs01629548_s1) and ZNF285 (Hs00603276_s1); both showed no cross-reactivity across 27+ species and were validated to give no amplification from green or loggerhead sea turtle DNA. A pan-eukaryotic 18S rRNA assay was used for Irish samples, and a sea turtle-specific qPCR (16S rRNA) was used to quantify turtle eDNA from air. qPCR was run on QuantStudio platforms with standard cycling and technical replicates per sample; results were visualized as boxplots with Tukey whiskers.
- Sequencing: Illumina shotgun (HiSeq 3000 for four 2017 water samples; NovaSeq 6000 for subsequent water and sand) performed at UF ICBR. Oxford Nanopore MinION long-read shotgun sequencing for selected 2022 intentional human-positive samples and negative controls, using ONT ligation kits; run metrics in Supplementary Table 3. Human exome enrichment: Illumina DNA Prep with Enrichment Exome Panel (45 Mb) on five eDNA samples (three human-positive, two NFCs), sequenced on NovaSeq 6000 S4 (2x150 bp), targeting ~50M reads per sample.
- Bioinformatics: Quality control (FastQC), trimming (Trim Galore! for Illumina; Porechop for ONT). Alignments to human genomes: Bowtie2 (hg38) for Illumina; minimap2 to T2T-CHM13v1.1 and GRCh38/GRCh37 for ONT. Y-chromosome read quantification via StringTie and samtools idxstats. Structural variant calling with ONT EPI2ME Sniffles pipeline (minimap2 alignment; Sniffles; minimum 20 bp SV length; multisample support). Additional high-sensitivity SV calling using Sniffles v2 with non-germline and minsupport=1 on GRCh37 alignments; CNV matching to gnomAD SV v2.1 within ±5% breakpoint windows; manual IGV verification. Mitochondrial haplogrouping with MitoMaster/HaploGrep (Phylotree 17); pathogenicity via MitoTip/ClinGen. Metagenomics with EPI2ME What's in My Pot (Centrifuge + Dustmasker). Visualization in R (ggplot, chromoMap, webr). Data deposition: Illumina (PRJNA449022) and ONT (PRJNA874696).
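As a rough illustration of the Y-chromosome read quantification step, the sketch below parses the tab-separated output of `samtools idxstats` (reference name, sequence length, mapped reads, unmapped reads) and scales the chrY count to a fixed library size. The function names, the `chrY` contig name, the example numbers, and the ten-million scale factor are assumptions made for illustration; they are not values or code from the study.

```python
def parse_idxstats(text):
    """Return {contig: mapped_reads} from `samtools idxstats` output."""
    mapped = {}
    for line in text.strip().splitlines():
        # Columns: reference name, sequence length, mapped reads, unmapped reads
        name, _length, n_mapped, _n_unmapped = line.split("\t")
        mapped[name] = int(n_mapped)
    return mapped


def scaled_count(reads, total_reads, scale=10_000_000):
    """Normalize a read count to reads per `scale` sequenced reads (assumed scale)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads * scale / total_reads


# Made-up idxstats output, for illustration only.
EXAMPLE = (
    "chr1\t248956422\t5200\t0\n"
    "chrY\t57227415\t130\t0\n"
    "*\t0\t0\t994670\n"
)

counts = parse_idxstats(EXAMPLE)
total = 1_000_000  # hypothetical total number of reads in the library
y_scaled = scaled_count(counts["chrY"], total)
print(counts["chrY"], y_scaled)
```

Normalizing by library size in this way is what makes counts comparable between samples sequenced to different depths, such as the estuarine and oceanic samples.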
- Human genomic reads were present in all wildlife/pathogen-focused shotgun eDNA samples from water and sand, confirming pervasive human genetic bycatch (HGB). In total, ~1.8 million paired-end human-aligning reads (300 bp per read pair) were recovered across datasets.
- In several wild water samples, human-aligning read abundance approached that of the target green sea turtle: average green turtle RPTM to human RPTM ratio 1.39 (min 1.03, max 2.54). Rehabilitation tank and sand samples were turtle-dominated (ratios 18.50 and 63.50, respectively).
- Human Y-chromosome reads were detected in all samples despite the chromosome's small size and its absence from the target species, supporting the conclusion that the human signal is genuine; estuarine samples showed higher Y-chromosome RPTM than oceanic samples.
- Intentional sampling and qPCR: Human eDNA was readily quantified in water from populated areas in Florida and Ireland; minimal/none upstream of habitation (Goldmines River tributary, private well) and off-shore ocean beyond inlet. Within Arklow town (Ireland), human eDNA increased through town and diluted at river mouth, despite lower filtered volumes due to turbidity. Pan-eukaryotic DNA was present across Irish samples.
- Sand: No human eDNA detected in restricted-access Rattlesnake Island sand, while human footprints yielded robust human eDNA by qPCR.
- Air: No human eDNA in negative field controls or rooms without humans; robust detection in rooms with humans engaged in normal activities, even in a sterile veterinary hospital.
- Nanopore shotgun sequencing of seven samples confirmed genomic human reads in intentional human-positive eDNA from water, sand footprints, and room air (thousands to hundreds of thousands of human-aligning reads), while the negative controls and the no-human sand site had only 2–26 reads. Single human reads of up to 148,969 bp (air), 120,998 bp (water), and 39,229 bp (sand) were recovered; the longest mitochondrial read was 16,535 bp (~full mitogenome). Average read length across human-positive ONT samples was 1,514 bp.
- Structural variation and disease-associated loci: Known gnomAD deletions were detected across substrates, especially water; longest single-read deletion 40,738 bp (chr2, common CNV in European/Latino populations). Deletions intersected genes involved in neuronal differentiation, DNA repair, proteolysis, and cancer-associated genes (ALK, LIN28B, PDGFD, WNT7B). Other SV types (insertions, duplications) were also detected. Seven mitochondrial variants with established pathogenic associations were found (six water, one air) related to conditions including autism, diabetes, ocular and cardiac diseases.
- Exome enrichment markedly increased human content in human-positive samples: post-capture, water eDNA had 48% human-aligning reads (20.72 billion human bases), achieving ~473× coverage of targeted human exome regions; negative controls remained near-background.
- Population genomics: Mitochondrial haplogroups/haplotypes could be inferred from all intentional human-positive samples and bycatch samples with higher human content. European-Indian haplotypes, notably H2a2a1, were common. Footprint and room-air haplotypes matched known participant profiles; water-sample haplotypes matched local demographics.
- Metagenomic breadth: An air eDNA sample from a sea turtle hospital simultaneously yielded human DNA, green turtle DNA, and reads from pathogenic turtle herpesvirus/papillomavirus, illustrating multi-taxa surveillance potential.
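Two of the figures above are simple ratios and can be sanity-checked by hand. The sketch below recomputes a naive exome coverage estimate from the reported 20.72 billion human bases and the 45 Mb panel size, and shows how a target-to-human abundance ratio is formed. The function names and the example RPTM values are hypothetical. The naive estimate (about 460×) is close to, but not identical to, the reported ~473×; the difference presumably reflects how on-target bases and the targeted interval size were defined, which this summary does not specify.

```python
def naive_coverage(total_bases, target_size_bp):
    """Mean depth if all bases were spread evenly over the target region."""
    if target_size_bp <= 0:
        raise ValueError("target_size_bp must be positive")
    return total_bases / target_size_bp


def abundance_ratio(target_rptm, human_rptm):
    """Ratio of target-species to human normalized read counts."""
    if human_rptm <= 0:
        raise ValueError("human_rptm must be positive")
    return target_rptm / human_rptm


# Reported values: 20.72 billion human bases, 45 Mb exome panel.
cov = naive_coverage(20.72e9, 45e6)
print(f"naive exome coverage: {cov:.0f}x")

# Hypothetical RPTM values chosen to reproduce the reported mean ratio of 1.39.
ratio = abundance_ratio(139.0, 100.0)
print(f"turtle:human ratio: {ratio:.2f}")
```

A ratio near 1 means the non-target human signal is about as abundant as the target species, which is the situation reported for the wild water samples.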
The study demonstrates that shotgun eDNA workflows inherently co-capture human genomic material (HGB) at levels sufficient for downstream analyses including structural variants, mitochondrial haplotypes, and disease-associated alleles. This validates the hypothesis that as eDNA moves toward untargeted deep sequencing, human genomic data will be retrieved alongside non-human biodiversity and pathogen signals. The findings underscore both opportunity and risk: intentional human eDNA capture from water, sand, and air can enable beneficial applications such as public health surveillance, environmental monitoring of effluent, forensics, search and rescue, archaeology, and healthcare-adjacent diagnostics; yet the same capabilities raise significant ethical concerns involving consent, privacy, surveillance, data ownership, public data deposition, and inadvertent location tracking or genome harvesting. The detection of Y-chromosome reads, long contiguous human reads, and successful exome enrichment indicate feasibility for high-resolution human genomic inference from environmental samples, including ancestry and potentially individual-level identification with targeted enrichment. Consequently, eDNA projects—even when not targeting humans—may require human-subjects ethical oversight and possibly data filtering or blocking strategies prior to open data deposition. The work calls for proactive development of governance frameworks, codes of conduct, and technical mitigations to manage HGB while enabling beneficial applications.
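One possible form of the data filtering mentioned above is to align reads to a human reference and keep only those that fail to align. The sketch below applies that idea to SAM text records using the standard FLAG bits (0x1 paired, 0x4 read unmapped, 0x8 mate unmapped). It is a minimal illustration with hypothetical function names and made-up records, not the authors' method, and alignment-based filtering of this kind would not by itself guarantee that every human-derived read is removed.

```python
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # read did not align to the reference
MATE_UNMAPPED = 0x8  # mate did not align to the reference


def is_human_free(flag):
    """True if the read (and its mate, when paired) did not align to the human reference."""
    if not flag & UNMAPPED:
        return False
    if flag & PAIRED and not flag & MATE_UNMAPPED:
        return False
    return True


def filter_sam_lines(lines):
    """Yield header lines plus alignment records that pass is_human_free."""
    for line in lines:
        if line.startswith("@"):
            yield line  # keep SAM header lines unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if is_human_free(flag):
            yield line


# Made-up SAM records, for illustration only.
EXAMPLE_SAM = [
    "@SQ\tSN:chr1\tLN:248956422",
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",              # pair, both unmapped
    "r2\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",   # mapped to human
    "r3\t69\tchr1\t200\t0\t*\t=\t200\t0\tACGT\tIIII",       # unmapped, mate mapped
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",               # single-end, unmapped
]

kept = list(filter_sam_lines(EXAMPLE_SAM))
kept_names = [l.split("\t")[0] for l in kept if not l.startswith("@")]
print(kept_names)
```

Dropping a read whenever its mate aligns to the human reference is the stricter of the two obvious choices; it discards more data but avoids retaining half of a human-derived fragment.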
This paper establishes that human genomic bycatch is ubiquitous in shotgun eDNA datasets and that intentional capture of high-quality human eDNA from water, sand, and air is feasible, enabling recovery of long reads, structural variants, disease-associated mitochondrial variants, and population haplotypes. Exome enrichment further amplifies human genomic signal from environmental samples. These results open avenues for beneficial human-focused eDNA applications in public health, forensics, environmental protection, archaeology, and continuous health monitoring. However, they also expose ethical challenges around consent, privacy, surveillance, data ownership, and data sharing. The authors urge immediate engagement among regulators, researchers, journals, repositories, and communities to develop policies, ethical guidelines, and technical practices (for example, human-read filtering or targeted blocking) that balance innovation with rights protection as eDNA technologies scale. Future work should refine capture and enrichment methods, assess individual identifiability thresholds, expand diverse geographic and environmental sampling, and co-develop ethical frameworks with stakeholders, including Indigenous communities.
- Although stringent contamination controls were implemented, low levels of human-aligning reads were present in some negative field controls, indicating background signal cannot be entirely eliminated.
- Sampling sites for wildlife-focused bycatch analyses were not adjacent to dense urban centers; findings may differ in heavily urbanized or distinct environmental contexts.
- Individual-level human identification was not attempted; while long reads and exome enrichment suggest feasibility, confirmatory studies are needed to determine identifiability thresholds from pooled environmental samples.
- Exome enrichment and nanopore analyses were performed on a limited number of intentional samples; broader replication across substrates, seasons, and geographies would strengthen generalizability.
- The study focuses on mitochondrial haplotyping for population inference; nuclear population structure inference from eDNA warrants further development.
- Ethical, legal, and social implications discussed are forward-looking; the study does not experimentally evaluate mitigation strategies (for example, human read-blocking efficacy, automated filtering) or policy interventions.