DNA storage in thermoresponsive microcapsules for repeated random multiplexed data access

B. W. A. Bögels, B. H. Nguyen, et al.

This work presents a method for multiplexed, repeated random access to DNA files based on thermoconfined PCR inside thermoresponsive microcapsules. Compared with bulk PCR, the approach reduces amplification bias and improves the scalability of DNA data storage.

Introduction

The study addresses a key challenge in DNA-based archival data storage: scalable, parallel, and repeated random access to specific files without degrading pool integrity. While DNA offers exceptional information density and longevity, PCR-based random access consumes material and introduces amplification bias and chimeric artifacts, especially during multiplex retrieval. Prior approaches such as careful sequence design and added redundancy mitigate errors but increase cost; physical separation via multiple singleplex PCRs scales poorly; and emulsion PCR reduces crosstalk but is cumbersome, non-reusable, and solvent-intensive. The authors hypothesize that compartmentalizing DNA files within thermoresponsive, semipermeable microcapsules and exploiting temperature-controlled permeability can confine PCR reactions, reduce molecular crosstalk and bias, and enable repeated random access by retaining original DNA files within reusable compartments.

Literature Review

The paper builds on advances in DNA synthesis (chemical and enzymatic) and high-throughput sequencing that make large-scale DNA data storage feasible. Coding schemes have achieved high density (up to 17 exabytes per gram). However, PCR-based random access suffers from bias due to sequence properties (length, GC content, secondary structure) and from chimera formation when similar regions recombine, degrading file fidelity. Prior mitigations include sequence design and redundancy, but at increased cost. Emulsion PCR effectively segregates templates to suppress artifacts and has been used for data retrieval but requires complex workflows and is non-reusable. Alternative retrieval strategies include physical separation, hybridization-based selection, nested PCR, and fluorescence-assisted sorting with barcoded microcapsules. Proteinosomes—protein–polymer microcompartments—have previously been used to localize biotinylated DNA via streptavidin but needed improved thermal stability for PCR conditions.

Methodology

The authors introduce thermoconfined PCR using proteinosomes, semipermeable microcapsules formed by crosslinking BSA–PNIPAm conjugates at water-in-oil interfaces and phase-transferring them into water. The membranes exhibit temperature-dependent permeability due to PNIPAm's lower critical solution temperature (LCST) behavior: high permeability below the LCST and reduced permeability above it. To stably capture DNA, the lumen is loaded with Tamavidin 2-HOT, a thermostable streptavidin analogue that maintains biotin binding at PCR temperatures. Biotinylated DNA files (fixed-length sequences with primers) are localized inside proteinosomes via biotin–Tamavidin interactions. Magnetic particles are co-encapsulated to enable magnetic recovery of proteinosomes after reactions.

Temperature-dependent permeability is characterized by confocal microscopy using fluorophore-labeled ssDNA and dsDNA constructs, revealing reduced permeability to longer ssDNA at 95 °C and reversible reopening at room temperature. Enzymatic accessibility of localized templates is validated by qPCR and strand displacement amplification (SDA) with EvaGreen fluorescence readout.

To assess reduction of molecular crosstalk, two templates sharing a 31-nt region are amplified in multiplex; chimera formation is quantified by native PAGE and band intensity analysis.

For scalability, twenty-five 1 MB files (≈66,000 sequences per file; 110-nt constructs with a common primer scheme) are localized in separate proteinosome populations, pooled, and amplified via multiplex PCR under three conditions: bulk, water-in-oil emulsion droplets, and proteinosomes-in-water. Per-file abundances are quantified by qPCR and Illumina sequencing (TruSeq Nano prep, NextSeq), with coverage analysis using BWA and SAMtools.
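
The per-file coverage comparison described above can be illustrated with a minimal sketch. It assumes that "spread" means the ratio of the highest to the lowest per-file coverage and that the coefficient of variation (CV) is the population standard deviation divided by the mean; the summary does not state which estimator the authors used. The function name `coverage_spread` and the example values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def coverage_spread(coverage):
    """Return (max/min fold spread, CV in percent) for per-file coverage.

    Assumption: CV uses the population standard deviation; the source
    does not say whether a sample or population estimator was used.
    """
    values = list(coverage.values())
    if not values or min(values) <= 0:
        raise ValueError("need at least one file, all with positive coverage")
    fold = max(values) / min(values)
    cv_percent = 100.0 * pstdev(values) / mean(values)
    return fold, cv_percent


# Hypothetical mean coverage per file after multiplex PCR (illustrative only).
example_coverage = {
    "file_01": 120.0,
    "file_02": 80.0,
    "file_03": 100.0,
    "file_04": 100.0,
}

fold, cv_percent = coverage_spread(example_coverage)
print(f"fold spread = {fold:.2f}, CV = {cv_percent:.1f}%")
```

In practice the per-file values would come from the alignment step (for example, depth summaries produced after BWA mapping); that parsing is omitted here.
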

Repeated random access is tested using three files across four sequential PCR rounds, comparing bulk, emulsion, and proteinosome workflows; proteinosome libraries are magnetically recovered between rounds, whereas emulsions are broken and purified, and bulk reactions are treated to inactivate primers/dNTPs. Sequence dropout and coverage CV are quantified from Illumina data.

Fluorescence-based retrieval is implemented by barcoding proteinosomes using membrane dyes (FITC, DyLight 405) and localized short biotinylated fluorescent ssDNAs (Cy3, Cy5), enabling 2^N barcode combinations. Pooled barcoded proteinosomes are sorted by FACS (BD Aria III) using sequential gating (FSC-A/H, membrane dyes, then Cy3/Cy5), followed by qPCR to assess file enrichment.

Stability for archival storage is probed by lyophilization of DNA-containing proteinosomes with trehalose as lyoprotectant; rehydration integrity is evaluated by microscopy and qPCR.

Statistical analyses use triplicates with Welch’s t-tests and ANOVA/Tukey as appropriate.
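
The 2^N barcode count mentioned above can be made concrete with a short sketch. It assumes N is the number of distinct fluorescent labels and that each label is simply present or absent on a proteinosome population; whether the all-absent combination is usable as a barcode in practice is not stated in the summary. The function name `enumerate_barcodes` is hypothetical.

```python
from itertools import product


def enumerate_barcodes(labels):
    """Return every presence/absence combination of the given labels.

    Assumption: each label is binary (present or absent), so N labels
    give 2**N combinations, including the empty one.
    """
    return [
        frozenset(label for label, present in zip(labels, bits) if present)
        for bits in product((False, True), repeat=len(labels))
    ]


# The four fluorophores named in the text.
labels = ["FITC", "DyLight 405", "Cy3", "Cy5"]
barcodes = enumerate_barcodes(labels)
print(len(barcodes))  # 2**4 = 16
```

Under these assumptions, the four fluorophores named in the text would distinguish at most 16 proteinosome populations.
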

Key Findings

• Thermoresponsive retention and permeability: Tamavidin 2-HOT proteinosomes stably retain biotinylated dsDNA after heating to 95 °C, unlike streptavidin controls. Membrane permeability decreases at high temperature, limiting diffusion of longer ssDNA at 95 °C while shorter ssDNA (31 nt) diffuses rapidly; permeability is restored upon cooling, enabling recovery of amplicons at room temperature. Proteinosome size was 57 ± 17 μm.
• Enzymatic access: qPCR of localized templates showed a 4.4-cycle lower threshold for biotinylated dsDNA versus non-biotinylated control (p=0.049), implying ~21× greater accessible template. SDA of localized templates yielded an 8.6× higher production rate for biotinylated versus unlabelled DNA (p=0.004).
• Reduced chimera formation: In a two-template multiplex PCR designed to form a 71-bp chimera, proteinosome confinement significantly lowered chimera-to-target ratios compared with bulk (1.8-fold reduction; p=0.016), while producing comparable amounts of 178-bp targets.
• Multiplex scaling (25 files, 25 MB total): After multiplex PCR, per-file coverage spread (max/min) was 60-fold for bulk, 5-fold for emulsion, and 7-fold for proteinosomes; the original pool had a 3-fold spread. Coverage CVs: original 24%, bulk 139%, emulsion 35%, proteinosomes 52%.
• Repeated random access: Over four rounds accessing three files, proteinosomes showed the lowest sequence dropout in the final file, followed by emulsion and then bulk. An example change from 36.10% dropout (bulk) to 0.91% (proteinosomes) increased coding density by 1.56×. Coverage CVs across rounds: mean bulk 219%, emulsion 96%, proteinosomes 69%.
• Fluorescence-assisted retrieval: FACS of barcoded proteinosomes achieved enrichment of intended files to an average 75.0% of DNA in sorted samples, with non-target files averaging 8.4% each; 8.4-fold selectivity (p=0.00096). The barcoding scheme supports up to 2^N combinations.
• Lyophilization: Trehalose-assisted freeze-drying preserved DNA integrity and localization upon rehydration, with no observed proteinosome coalescence.
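
The step from a 4.4-cycle threshold difference to roughly 21× more accessible template follows from the usual qPCR approximation that template doubles every cycle. The sketch below works through that arithmetic; the perfect-doubling assumption (efficiency of 1.0) and the function name `fold_from_delta_ct` are illustrative choices, not details confirmed by the source.

```python
def fold_from_delta_ct(delta_ct, efficiency=1.0):
    """Fold difference in starting template implied by a Ct difference.

    Assumption: each cycle multiplies the template by (1 + efficiency),
    so efficiency=1.0 corresponds to perfect doubling per cycle.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (1.0 + efficiency) ** delta_ct


# Ct difference reported for biotinylated versus non-biotinylated DNA.
fold_change = fold_from_delta_ct(4.4)
print(round(fold_change, 1))  # 21.1
```

With a lower amplification efficiency the implied fold difference would be smaller, so the ~21× figure is best read as an estimate under ideal doubling.
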

Discussion

The thermoconfined PCR approach addresses PCR-induced bias and chimera formation by physically segregating DNA templates within microcompartments that dynamically restrict permeability at high temperatures. This reduces molecular crosstalk during amplification, leading to more proportional multiplex file retrieval and lower dropout during repeated access compared with bulk PCR. Performance approaches that of emulsion PCR while enabling a key advantage: the original file-encoding molecules remain localized and recoverable, allowing repeated reads without re-encapsulation or solvent-intensive workflows. Fluorescent barcoding demonstrates orthogonal, metadata-based retrieval compatible with PCR access, enabling pooled libraries to be sorted by content labels. Together, these results advance practical DNA data storage by improving fidelity and enabling scalable, repeated random access in a reusable, sequence-agnostic format. The method also integrates with archival needs via successful lyophilization.

Conclusion

Thermoresponsive proteinosomes with internal Tamavidin 2-HOT provide a reusable platform for DNA data storage that enables multiplex and repeated random access via thermoconfined PCR. Compared with bulk PCR, the system significantly reduces chimera formation and amplification bias, keeps the per-file distribution close to that obtained with emulsion PCR, and enables low-dropout repeated reads. Fluorescence-based barcoding coupled with FACS offers an additional search and retrieval modality. The platform is compatible with lyophilization for archival storage. Future work includes improving data density by accessing single proteinosomes, minimizing loss during fluorescence-based sorting, accelerating and automating initial localization, and assessing long-term stability of dried proteinosomes through accelerated aging studies.

Limitations

• Chimera suppression, while significantly improved, is not complete; residual chimeras may arise from incomplete removal of unlocalized DNA or limited release of amplicons.
• Fluorescence-assisted sorting (FACS) incurs sample loss, cannot retrieve single compartments, and achieved ~75% target enrichment with non-zero mis-sorting.
• Current demonstrations access data from many proteinosomes at once, lowering practical data density; single-compartment retrieval was not implemented.
• Initial localization into proteinosomes is relatively time-consuming compared with bulk workflows, though reusability mitigates this for repeated access.
• Dependence on specialized components (Tamavidin 2-HOT, PNIPAm-based proteinosomes) and instrumentation (FACS, confocal microscopy) may limit immediate adoption.
• Proteinosome size heterogeneity may influence diffusion and reaction kinetics.
