Introduction
Long-read sequencing is revolutionizing genomics, enabling the creation of high-quality reference genomes for a vast range of organisms. However, the standard preservation workflow, snap-freezing in liquid nitrogen or on dry ice followed by storage at -80°C, is often impractical or impossible in field settings, particularly in remote areas or in low- and middle-income countries. This poses a significant challenge to large-scale genomic initiatives such as the Darwin Tree of Life Project and the Earth BioGenome Project, the latter of which aims to sequence the genomes of all known eukaryotic species. Alternative storage methods using solvents, buffers, or desiccation have shown promise for short-read sequencing, but their effectiveness for long-read sequencing remains largely untested, especially over extended storage times. This study addresses that gap by evaluating how several readily available, cost-effective field storage methods affect the quality of DNA for long-read sequencing with Oxford Nanopore Technologies (ONT). Better sample preservation matters both for crop improvement through pangenomic studies and for the broader goal of creating high-quality reference genomes for all life on Earth, because it lets scientists collect and store samples from remote and challenging locations.
Literature Review
Existing literature highlights the critical role of high-molecular-weight (HMW) DNA in long-read sequencing. Standard methods, such as snap-freezing in liquid nitrogen or on dry ice followed by storage at -80°C, are frequently employed but present significant logistical challenges in many field settings. Prior research has explored alternative preservation methods for DNA, and some have shown potential for long-read sequencing. However, most of these studies used storage times too short for many field scenarios. The authors cite Dahn et al. (2022), who benchmarked ultra-high-molecular-weight DNA preservation methods for long-read and long-range sequencing, but note that these methods have not been comprehensively evaluated on major long-read sequencing platforms such as ONT or PacBio, or assessed over realistic, field-relevant storage times.
Methodology
This study used a controlled experimental design to evaluate the effects of preservation methods on DNA quality and sequencing results. Nine fish species (90 samples) and four plant species (36 samples) were collected and subjected to various storage conditions. Fish blood samples were preserved in three ways: 95% ethanol (EtOH) at 4°C, 95% EtOH at 22°C, and RNAlater at 22°C, with liquid nitrogen (LN2) as the control. Plant samples were preserved in 95% EtOH at 4°C, in RNAlater at 22°C, and in LN2. Fish samples were taken at six storage time points (0 days, 4 hours, 2 days, 1 week, 3 weeks, and 6 weeks) and plant samples at five (0 days, 4 hours, 2 days, 1 week, and 3 weeks). DNA was extracted using optimized protocols, and its quality was assessed by yield (Qubit), purity (Nanodrop), and fragment size (Femto Pulse). Shallow sequencing was conducted on an ONT platform using the Native Barcoding Kit 96 V14 (SQK-NBD114.96) to assess the impact of preservation on read quality and read length (read N50). A subset of samples, representing the extreme storage time points, was selected for deep sequencing and genome assembly, and the resulting genomes were evaluated using contig N50, BUSCO completeness, and quality values (QV). Methylation frequency was assessed for a subset of samples using ONT's Dorado basecaller. Statistical analyses accounted for non-normality and for the experimental design. The authors also evaluated the effect of ONT's short fragment eliminator (SFE) kit on the quality of sequencing data.
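Because read N50 and contig N50 are the length metrics used throughout, a minimal sketch of how N50 is commonly computed may help. It uses the usual definition: the length L such that sequences of length L or longer together contain at least half of all bases. The function name `n50` and the example lengths are illustrative assumptions, not taken from the study's pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together account for at least half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bases (placeholders, not study data).
example_lengths = [2_000, 3_000, 4_000, 5_000, 6_000]
print(n50(example_lengths))  # prints 5000
```

The same function applies to read lengths (read N50) and to assembled contig lengths (contig N50); only the input differs.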
Key Findings
The results demonstrated that 95% EtOH is a highly effective preservation method for both fish blood (at 22°C for up to 6 weeks) and plant tissue (at 4°C for up to 3 weeks). DNA extracted from samples preserved in 95% EtOH showed consistently high yields, purity, and fragment lengths, comparable to or even exceeding those from liquid nitrogen controls. Shallow sequencing confirmed that samples preserved in 95% EtOH yielded long reads with high quality scores, even after extended storage. Deep sequencing and genome assembly of samples stored in 95% EtOH for the maximum durations consistently produced high-quality genomes, meeting or exceeding Vertebrate Genomes Project benchmarking standards for multiple species. For example, *K. azureus* (EtOH 22°C, 6 weeks) achieved a contig N50 of 13.82 Mb and 98.6% BUSCO completeness. Similarly, *M. esculenta* (EtOH 4°C, 3 weeks) yielded a contig N50 of 10.4 Mb and 99.2% BUSCO completeness. These results compare favorably with liquid nitrogen controls. The ONT short fragment eliminator (SFE) kit had a significant positive impact on read N50. Methylation analysis revealed negligible differences in methylation frequency between samples stored in 95% EtOH or RNAlater and those stored in liquid nitrogen. Furthermore, DNA fragment length and read N50 from shallow sequencing were strong predictors of assembly quality from deep sequencing.
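One way to check whether a shallow-sequencing metric tracks assembly quality without assuming normally distributed data is a rank correlation. The sketch below implements Spearman's rank correlation in plain Python. This summary does not state which statistical test the authors used, so the choice of Spearman's coefficient is an assumption made for illustration, and the function names and input values are hypothetical placeholders rather than results from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT data from the study:
read_n50_kb = [12.0, 18.5, 9.3, 25.1, 15.7]   # shallow-run read N50
contig_n50_mb = [4.2, 8.8, 5.0, 13.1, 6.0]    # deep-run contig N50
rho = spearman(read_n50_kb, contig_n50_mb)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near 1 would indicate that samples ranking high on the shallow metric also rank high on assembly contiguity, which is the kind of relationship that would justify using shallow runs to screen samples before committing to deep sequencing.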
Discussion
This study provides compelling evidence that 95% EtOH is a highly effective and practical alternative to liquid nitrogen for preserving samples intended for long-read sequencing. The use of 95% EtOH significantly simplifies sample preservation in field settings, reducing logistical challenges and costs. The ability to generate high-quality reference genomes from samples stored in 95% EtOH for extended periods opens up exciting possibilities for large-scale genomic projects focusing on diverse species in remote locations. The finding that shallow sequencing can accurately predict the outcome of deep sequencing is valuable for optimizing resource allocation in large-scale studies. The negligible impact of EtOH storage on methylation profiles expands the applicability of this method to epigenetic studies. The success achieved across different plant and fish species underscores the potential for broader application across various taxa.
Conclusion
This study demonstrates the effectiveness of 95% ethanol as a cost-effective and logistically convenient alternative to cryogenic storage for long-read sequencing of plant and fish samples. The high-quality genome assemblies obtained from samples stored for extended periods in 95% ethanol validate this approach for large-scale genomic projects. Future studies should validate the method for other tissue types and organisms and explore the optimal parameters for different taxa and environments. Optimizing the method for particular circumstances may involve adjusting the storage temperature.
Limitations
While this study demonstrates the efficacy of 95% ethanol for preserving plant and fish samples for long-read sequencing, there are limitations to consider. The study did not include replicates within each species, which would have strengthened the findings. The results are based on a specific set of plant and fish species and may not generalize to all taxa. Additional research is needed to determine the optimal preservation conditions for other tissue types, particularly plant tissues other than leaves. Long transit times may still affect some samples. The study focused primarily on DNA quality and did not evaluate the impact of storage conditions on RNA quality.