Generating high-quality plant and fish reference genomes from field-collected specimens by optimizing preservation

Biology

J. J. Minich, M. L. Moore, et al.

Discover how researchers Jeremiah J. Minich and colleagues explore innovative sample preservation methods that enhance DNA quality for Oxford Nanopore long-read sequencing. Their findings could revolutionize your approach to genomic studies of plants and fish, making it more accessible than ever.

Introduction
Long-read sequencing enables high-contiguity genome assemblies across taxa and underpins efforts in evolution, conservation, natural product discovery, and improvement of crops and livestock. Global initiatives such as the Earth BioGenome Project and Darwin Tree of Life aim to produce reference genomes for all named eukaryotes, but practical bottlenecks remain for collecting and preserving high molecular weight DNA, especially outside well-resourced labs. Standard cryogenic methods (liquid nitrogen or dry ice followed by −80 °C storage) are often impractical in remote and low-resource settings. Alternative preservation solutions have shown promise for short-read sequencing and in theory for long reads, but have not been benchmarked on major long-read platforms over realistic field-relevant storage times. This study tests the hypothesis that readily available, low-cost preservatives and non-cryogenic conditions can maintain DNA integrity sufficient for high-quality ONT long-read sequencing and assembly in representative plant and fish tissues. The objective is to quantify the effects of storage solution, temperature, and duration on DNA/sequence quality and to demonstrate feasibility by assembling reference-quality genomes from the most extreme storage conditions.
Literature Review
Prior work suggested solvent, buffer, or desiccation-based preservation methods may maintain DNA quality for sequencing, but benchmarking for long-read platforms (ONT or PacBio) was limited and generally assessed only at short storage times (hours), which are not representative of remote field scenarios. High standards for reference genomes (e.g., VGP) emphasize long reads and high base accuracy, yet field-compatible preservation protocols that consistently deliver such data quality have not been validated across diverse tissues and taxa. This study addresses these gaps by systematically comparing ethanol and RNAlater across temperatures and multi-week time courses and by directly evaluating outcomes on ONT sequencing and assembly metrics.
Methodology
Study design: The study evaluated the effects of storage solution (95% ethanol and RNAlater), storage temperature (4 °C and 22 °C), and storage duration (fish: 0, 1, 3, 6 weeks; plants: 0, 4 hours, 2 days, 7 days, 21 days) on DNA quality and ONT sequencing/assembly outcomes. Blood from fish (n=9 species; 90 samples) and leaf tissue from plants (n=4 species; 36 samples) were collected, with liquid nitrogen (LN2, −80 °C) controls included. Fish were preserved in 95% EtOH at 4 °C, 95% EtOH at 22 °C, and RNAlater at 22 °C, plus LN2 controls. Plants were preserved in 95% EtOH at 4 °C and RNAlater at 22 °C, plus LN2 controls; limited tissue precluded testing all combinations.
Sampling: Marine fish were collected under IACUC protocol S12219; blood was drawn from the caudal vein, placed into K2 EDTA tubes on ice, and aliquoted into preservation solutions (95% EtOH or RNAlater) or frozen for controls. Plants (Manihot esculenta, Sorghum bicolor, Zostera marina, Phyllospadix torreyi) were collected locally; young leaves were cut into 1 cm segments and allocated to storage treatments (LN2, EtOH on ice/refrigerated, or RNAlater at ambient temperature). EtOH samples were stored at 4 °C and RNAlater samples at room temperature, in the dark.
DNA extraction: Fish HMW DNA was extracted using the NEB Monarch HMW DNA kit for cells and blood, following the 'fresh nucleated blood' protocol for buffer-stored samples and the 'frozen nucleated blood' protocol for controls, with lysis at 2000 RPM, 100 µL elution, and minimal pipetting to avoid shearing. Plant HMW DNA was extracted using the ONT plant CTAB-based protocol (Qiagen Blood and Cell Culture DNA Midi kit components), adapted to half reactions (20 mL lysis buffer), with a 3 h isopropanol precipitation and 100 µL elution.
DNA QC: After at least one week at 4 °C to allow solubilization, DNA yield was quantified by Qubit (BR, with HS for low-concentration samples) and purity by Nanodrop (A260/280, A260/230) using top/middle/bottom sampling. Fragment size was assessed by (i) the coefficient of variation of the Qubit measurements and (ii) Femto Pulse (mean length and % DNA >50 kb via smear analysis).
Shallow ONT sequencing: Libraries (Kit 14, Native Barcoding Kit 96 V14 SQK-NBD114.96) were prepared with 800 ng (fish) or 400 ng (plant) input. Multiplexed pools (90 fish or 29 plant) were sequenced on PromethION R10.4.1 flow cells with high-accuracy basecalling (MinKNOW), and NanoPlot metrics (read N50, read quality, yield) were collected. Plant samples with insufficient DNA, mostly marine plants at later time points, were excluded.
Size selection test: For fish libraries (n=90), the ONT Short Fragment Eliminator (SFE) kit (miniaturized protocol) was tested to assess improvement in read N50.
Deep ONT sequencing and assembly: Selected extreme-condition samples were deeply sequenced using ONT SQK-LSK114 with SFE size selection for fragments >25 kb, R10.4 flow cells, and SUP basecalling. Assemblies were generated with Flye 2.9 and polished with ONT reads via Racon and with Illumina reads via Pilon. Assembly metrics (contig N50, contig counts), completeness (BUSCO), and base accuracy (Merqury QV with and without Illumina polishing) were computed. Species deeply sequenced were, for fish: Kyphosus azureus (LN2 0 wk; EtOH 22 °C 6 wks; RNAlater 22 °C 6 wks), Medialuna californiensis (EtOH 22 °C 6 wks), and Girella nigricans (EtOH 22 °C 6 wks); and for plants: Manihot esculenta (LN2 0 wk; EtOH 4 °C 3 wks; RNAlater 22 °C 3 wks) and Sorghum bicolor (EtOH 4 °C 3 wks).
Methylation calling: For K. azureus and M. esculenta, raw FAST5 files were converted to POD5 and processed with Dorado v0.2.4 (SUP model for R10.4.1, 260 bps) to call CpG methylation against sample-specific assemblies; modkit was then used to pile up the calls and compute genome-wide methylation frequencies.
Statistics: Non-parametric tests were used because the data failed normality testing (Shapiro–Wilk). For fish, one-way Friedman tests with repeated measures across time points were run per storage condition, with Benjamini–Krieger–Yekutieli correction (alpha 0.05). For plants, treatments were compared across the five time points. SFE effects were evaluated by the Wilcoxon matched-pairs signed-rank test. Correlations between DNA fragment size, shallow read N50, deep read N50, assembly N50, and coverage were assessed by linear models.
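Two of the summary metrics named in the methods, N50 and the coefficient of variation of repeated Qubit readings, are simple enough to sketch directly. The Python below is a minimal illustration, not the authors' pipeline (the study reports NanoPlot for read metrics). The function names and the toy inputs are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def n50(lengths):
    """Return the N50 of a collection of read or contig lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100 * stdev(values) / mean(values)


# Hypothetical toy inputs for illustration only (not data from the study).
toy_read_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_qubit_readings = [90.0, 100.0, 110.0]  # e.g. top / middle / bottom of a tube

print(n50(toy_read_lengths))                         # 8
print(coefficient_of_variation(toy_qubit_readings))  # about 10 (percent)
```

In the toy example the two 8-unit reads already hold half of the 32 total units, so the N50 is 8 even though most reads are much shorter. This is why a few very long reads can lift N50 substantially.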
Key Findings
- Preservation efficacy:
  - 95% ethanol preserved fish blood DNA integrity at 22 °C for up to 6 weeks and plant leaf tissue at 4 °C for up to 3 weeks, enabling high-quality ONT sequencing and assemblies.
  - RNAlater at 22 °C showed declining performance over time, particularly for plants; fish were more robust, but extended RNAlater storage reduced read N50.
- DNA yield and purity:
  - Fish blood yields remained stable at, or higher than, LN2 control levels across conditions through 6 weeks.
  - Plant tissue showed more variability; EtOH 4 °C yields remained consistent until week 3; RNAlater 22 °C yields decreased significantly over later time points (P=0.0071, F_S=11.40). A260/230 was suboptimal at weeks 1 and 3 in plants.
- Fragment size and sequencing quality:
  - Fish: EtOH 4 °C matched LN2 controls through 6 weeks. EtOH 22 °C had more variation, with average length differing from control (P=0.0216, F_S=9.667); weeks 3 and 6 trended lower.
  - Plants: EtOH 4 °C retained size up to 3 weeks; RNAlater 22 °C degraded at later time points, though 4-hour RNAlater sometimes exceeded controls.
  - Shallow ONT read N50 for fish was stable in EtOH (4 °C, 22 °C) across time; RNAlater 22 °C decreased with time. In plants, read N50 decreased over time in both EtOH 4 °C and RNAlater 22 °C (both P=0.0417, F_S=7.600). Read quality was stable or improved in EtOH; plant RNAlater quality declined by day 2.
  - Read N50 correlated with library read quality (both fish and plants) and, in plants, with DNA yield and fragment size.
- Size selection (ONT SFE): Increased read N50 in 63.3% of fish libraries by an average of 27% (P=0.0443); maximum increase 318%; 15.5% of libraries at least doubled N50. Benefits were greatest for highly fragmented libraries; size selection could reduce N50 for libraries that already had long reads.
- Deep sequencing and assemblies (Table 1 excerpts):
  - Kyphosus azureus, LN2 control: contig N50 21.8 Mb, BUSCO C 98.8%, QV.r 40.2, QV.hy 42.0, methylation 79.14%.
  - Kyphosus azureus, EtOH 22 °C 6 wks: N50 13.8 Mb, C 98.6%, QV.r 40.3, QV.hy 42.4, methylation 78.66%.
  - Kyphosus azureus, RNAlater 22 °C 6 wks: N50 5.1 Mb, C 98.7%.
  - Medialuna californiensis, EtOH 22 °C 6 wks: N50 6.5 Mb, C 98.7%, QV.r 36.2, QV.hy 37.9.
  - Girella nigricans, EtOH 22 °C 6 wks: N50 9.7 Mb, C 98.5%, QV.r 38.0, QV.hy 40.1.
  - Manihot esculenta, LN2: N50 16.5 Mb, C 99.0%, QV.r 47.5, QV.hy 49.3, methylation 77.63%.
  - Manihot esculenta, EtOH 4 °C 3 wks: N50 10.4 Mb, C 99.2%, QV.r 43.8, QV.hy 44.6.
  - Manihot esculenta, RNAlater 22 °C 3 wks: N50 0.4 Mb, C 99.0%, QV.r 32.1, QV.hy 32.5.
  - Sorghum bicolor, EtOH 4 °C 3 wks: N50 5.9 Mb, C 94.6%, QV.r 44.3, QV.hy 46.4.
- Base accuracy (QV): With ONT-only assembly/polishing, 5/9 genomes met the VGP threshold of QV>40; Illumina polishing increased QV by 0.4–2.3 (mean increase 4.15%), yielding 7/9 with QV>40. Lower QVs corresponded to samples with the shortest read N50s (e.g., M. californiensis EtOH 6 wks; M. esculenta RNAlater 3 wks).
- Methylation: Genome-wide CpG methylation frequencies were similar between solvent-stored samples and LN2 controls. K. azureus: EtOH slightly lower than LN2; RNAlater slightly higher. M. esculenta: EtOH and RNAlater slightly lower than LN2. The coefficient of variation was higher in RNAlater than in EtOH, but overall differences were small, indicating minimal impact of storage on methylation profiling.
- Predictors of assembly quality: DNA fragment size positively predicted deep read N50 (P=0.0141, R^2=0.6031) and assembly contig N50 (P=0.0058, R^2=0.6860). Shallow read N50 predicted deep read N50 (P=0.0174, R^2≈0.5783) and assembly contig N50 (P=0.0220, R^2≈0.5507). Deep read N50 most strongly predicted assembly N50 (P=0.0002, R^2=0.8709). Coverage within the tested range did not predict assembly N50.
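The QV figures above are easier to compare once converted to error rates. The sketch below assumes the reported values are Phred-scaled consensus quality, QV = −10·log10(error probability), which is how Merqury defines QV. It also assumes that QV.r and QV.hy denote the ONT-only and Illumina-polished values respectively. The function names are hypothetical.

```python
import math


def qv_to_error_rate(qv):
    """Phred-scaled QV -> expected per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Expected per-base error probability -> Phred-scaled QV."""
    return -10 * math.log10(error_rate)


def relative_error_reduction(qv_before, qv_after):
    """Fraction of expected errors removed when QV rises from before to after."""
    return 1 - qv_to_error_rate(qv_after) / qv_to_error_rate(qv_before)


# QV 40 corresponds to roughly one expected error per 10,000 bases.
print(round(1 / qv_to_error_rate(40)))  # 10000

# K. azureus LN2 control, values quoted above: QV.r 40.2 -> QV.hy 42.0.
# Interpreting these as before/after polishing is an assumption (see lead-in).
reduction = relative_error_reduction(40.2, 42.0)
print(f"{reduction:.1%} fewer expected errors")
```

Because the scale is logarithmic, a gain of about 2 QV units removes roughly a third of the remaining errors, whereas the gap between QV 32 and QV 42 is a tenfold difference in error rate.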
Discussion
The study demonstrates that non-cryogenic preservation, particularly in 95% ethanol, can maintain high molecular weight DNA suitable for ONT long-read sequencing and reference-quality assemblies from field-collected fish blood and plant leaf tissue over realistic logistical timeframes. The results directly address the key obstacle of DNA degradation during transport and storage: EtOH at ambient temperature for fish (up to 6 weeks) and refrigerated EtOH for plants (up to 3 weeks) preserved DNA well enough to achieve high contiguity (contig N50 of 5.9–13.8 Mb in the ethanol-stored assemblies) and high completeness (BUSCO 94.6–99.2%), meeting or approaching VGP standards. RNAlater is less robust over extended times, especially in plants, but can still yield usable fish assemblies at up to six weeks where alcohol is restricted. The strong correlations between DNA fragment size, shallow sequencing read N50, deep read N50, and assembly N50 provide a practical QC framework: shallow multiplexed ONT runs can predict assembly outcomes (R^2≈0.55 for assembly contig N50) and guide sample triage without specialized sizing instruments. The SFE kit can substantially improve read length distributions for fragmented samples, offering a salvage pathway. Methylation analyses indicate that preservation method minimally affects CpG methylation estimates, supporting epigenetic applications from solvent-preserved samples. Collectively, these findings expand feasible sampling strategies in remote or resource-limited settings, reducing dependence on cryogenics while maintaining data quality for genome assembly and methylome profiling.
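The QC framework described here rests on simple linear models, such as regressing assembly contig N50 on shallow read N50. As a minimal sketch of that kind of fit, the Python below implements ordinary least squares with R^2 from scratch. It is not the authors' analysis code, the function names are hypothetical, and the inputs are synthetic numbers chosen to lie exactly on a line, not measurements from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def predict(slope, intercept, x):
    """Predicted y for a new x under the fitted line."""
    return slope * x + intercept


# Synthetic, perfectly linear toy data (y = 2x + 1); not study data.
toy_x = [1, 2, 3, 4]
toy_y = [3, 5, 7, 9]

slope, intercept, r_squared = fit_line(toy_x, toy_y)
print(slope, intercept, r_squared)
print(predict(slope, intercept, 5))
```

With real samples, the x values would be a cheap early measurement (for example shallow read N50) and the y values the outcome of interest. An R^2 near 0.55, as reported for shallow read N50 against assembly N50, means the fit explains about half of the variance, so predictions are useful for triage but carry substantial uncertainty.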
Conclusion
Practical, field-friendly preservation using 95% ethanol enables high-quality ONT sequencing and assemblies from fish blood at room temperature for up to six weeks and from plant leaves under refrigeration for up to three weeks. Multiple assemblies from extreme storage conditions achieved high contiguity, completeness, and base accuracy, and methylation profiling was largely unaffected. The work provides validated protocols and a QC workflow where shallow ONT sequencing predicts assembly success, alongside evidence that ONT-only assembly can meet QV benchmarks in favorable cases, with further improvements from Illumina polishing. Future research should assess broader taxonomic coverage (other vertebrate blood sources), additional plant tissues (stems, roots, lignified organs), marine/aquatic plants with anatomical barriers to solvent penetration, and optimization of storage durations and temperatures across diverse metabolites and tissue matrices. Expanded replication within species and increased sequencing depth for challenging samples may further improve contiguity and accuracy.
Limitations
Biological replicates were at the species level (fish n=9, plants n=4) without additional within-species replication due to cost, limiting inference about intraspecies variability. Some plant samples (particularly marine plants) had insufficient DNA or poor quality at later time points and were excluded, indicating variability across taxa and tissue types. RNAlater showed reduced performance for extended storage, especially in plants. Storage recommendations may not generalize to all tissues (e.g., lignified stem/root) or taxonomic groups not tested. Differences in field collection and transit prior to preservation (e.g., M. californiensis) may influence outcomes. The assembly contiguity and QV for some long-stored samples were lower, suggesting that additional sequencing and fragment filtering might be required to reach top-tier standards.