Biology
Molecular robotic agents that survey molecular landscapes for information retrieval
S. Woo, S. K. Saka, et al.
The study addresses how to autonomously survey and record spatial information within complex molecular environments at nanoscale resolution without destructively consuming probes. Building upon advances in DNA-based molecular motors and programmable DNA systems, the authors propose molecular robotic ‘agents’ (crawlers) that traverse DNA-labeled targets in situ, copying identifiers from nearby probes to generate records that reflect trajectories and proximities. The purpose is to enable quantitative analyses such as counting subunits within complexes and detecting multivalent protein colocalizations, overcoming limitations of prior methods that were destructive, required thermocycling or manual handling, or were limited to pairwise interactions. This capability is important for refining spatial transcriptomic/proteomic studies and for mapping spatiotemporal molecular interactions inside cells.
Prior DNA-based motors progressed from manually operated and autonomous walkers to task-performing robots (cargo transport, assembly, sorting) and to systems enabling synthesis, signal amplification, and drug release. DNA-based spatial reconstruction approaches via amplicon diffusion or colonies demonstrated encoding of spatial information but required thermocycling or manual handling and achieved cellular-to-micrometer resolutions. Proximity-based assays (ligation, extension, proximity imaging, HCR, stamping templates, nicking) provided proximity detection but were largely destructive (single-use probes), risking dead spots and incomplete analyses. Some clever designs activated surrounding spots or rescued unligated sites but remained destructive. Non-destructive proximity recording methods enabled repeated measurements but typically only captured pairwise interactions, limiting analysis of multicomponent biology. DNA and enzymes have also enabled programmable computation, reaction networks, directional/rotational motors, and concatenation of sequences; microscopic DNA-enzyme agents enabled reaction networks but not molecular-resolution information transfer beyond fluorescence signals. These gaps motivate an autonomous, non-destructive, and scalable approach that can record multivalent proximities and quantitative features directly in situ.
System design: Each surface- or target-bound DNA probe comprises a primer-binding (PB) domain and a copy-and-release (CR) domain. A primer binds the PB domain and is extended by a polymerase through the CR domain. The CR domain is a double-stranded motif enabling isothermal copying of an embedded sequence and spontaneous release of the copied segment upon extension to a ‘stopper’ (e.g., iso-dC/iso-dG or a chemical linker). The CR domain contains a barcode segment (identifier/spacer) and a primer-encoding (PE) segment that encodes the next primer. Upon extension, the newly generated primer can bind a neighboring probe’s PB domain, enabling the crawler to step probe-to-probe. A separate ‘release primer’ binds the exposed complement and is extended to release the double-stranded record into solution and reset probes to their initial state, enabling non-destructive, catalytic, repeated recording and record retrieval from the supernatant.
Demonstrations on DNA origami: Tracks with three and ten prescribed probe positions were built on DNA origami deposited on mica for AFM imaging. Crawlers were initiated, grown, and imaged before and after reactions; records were retrieved and analyzed by PAGE, qPCR, and Sanger sequencing to confirm lengths and sequences.
Random crawling: A universal probe with a tandem PB domain (e.g., a and b″) and repeating primers was designed so initiation (with primer a), stepping (with primer b), and release (with b″) can occur at any site, enabling true random crawling among proximal probes.
Counting subunits: Counting relies on the longest record, which reflects the maximum number of steps a crawler can take within a complex. (i) Streptavidin model: Biotinylated universal probes (a–b–b architecture) were placed at each of the four streptavidin subunits. Initiation at any site, random stepping among proximal sites, and release produced distinct-length records corresponding to visiting one to four subunits, resolved by PAGE. Controls lacking streptavidin generated only single-site records. (ii) Programmed complexes on DNA origami: Start (a–b) and finish (b–d) probes enabled PCR amplification; between them, up to four variable ‘b–b’ units were included to build complexes of size 2–6. Crawlers produced distinct-length records (the number of bands equals complex size minus one), read by PAGE.
Path deconvolution: On a square track, two repeating ‘b–b’ probes carried distinct barcodes, so high-throughput sequencing of the records could resolve different trajectories that produce the same record length and reveal path preferences.
Multivalent proximity detection in fixed cells: Antibodies (directly or via DNA-conjugated secondaries) targeted alpha-tubulin, beta-tubulin, and EB1 in BS-C-1 cells. Three probe types (‘a–b’, ‘b–c’, ‘c–d’) enabled generation of monovalent records at each target and a trivalent record when all three species were colocalized at microtubule growing ends. Three experimental groups were prepared: (1) growth medium (baseline active growth), (2) nocodazole treatment (microtubule disassembly), and (3) recovery after nocodazole (regrowth). After in situ recording at room temperature with Bst polymerase and dNTPs, supernatants were collected, treated with exonuclease, heat-inactivated, PCR-amplified, and analyzed by gels. To visualize in situ trivalent records, a fluorescent imager strand complementary to the terminal primer (‘d’) was added and samples were imaged by fluorescence microscopy.
Core experimental conditions: Isothermal reactions were run in ThermoPol buffer with Bst (or Bsm for solution assays), with dNTPs typically at 100 µM (2 µM in solution assays), initiating primer at ~100 nM, and release primer at ~10–200 nM; reaction durations were typically 1–2 h (origami and cells) or 2 h (solution assays). AFM imaging was performed in fluid tapping mode on mica. PCR used Vent (exo-) or Q5 polymerase; products were analyzed by denaturing PAGE, Sanger sequencing, and NGS (Illumina NovaSeq) for path analysis.
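The counting logic described above can be illustrated with a toy simulation. This is a minimal sketch under assumptions that are not taken from the study: every probe in a complex is reachable from every other, a probe is visited at most once per record, and release happens with a fixed per-step probability (the hypothetical parameter `p_release`). The function names are illustrative only.

```python
import random
from collections import Counter


def crawl_once(n_sites, p_release, rng):
    """Simulate one crawler; return the ordered list of site indices it copied.

    Assumptions (illustrative, not from the study): all sites are mutually
    proximal, each site is visited at most once, and after every copy the
    crawler is released with probability p_release.
    """
    unvisited = list(range(n_sites))
    rng.shuffle(unvisited)           # random initiation site and visiting order
    record = [unvisited.pop()]       # initiation can occur at any site
    while unvisited and rng.random() >= p_release:
        record.append(unvisited.pop())  # step to a not-yet-visited neighbor
    return record


def length_histogram(records):
    """Count records per length, analogous to bands on a gel."""
    return Counter(len(r) for r in records)


def estimate_subunits(records):
    """The longest record reports the number of subunits in the complex."""
    return max(len(r) for r in records)


if __name__ == "__main__":
    rng = random.Random(0)
    tetramer = [crawl_once(4, 0.3, rng) for _ in range(2000)]
    print(sorted(length_histogram(tetramer).items()))
    print("estimated subunits:", estimate_subunits(tetramer))
```

With a four-site complex the simulated records fall into four length classes, while a single isolated probe (`n_sites=1`) can only yield length-one records, mirroring the streptavidin experiment and its no-streptavidin control.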
Non-destructive, repeated recording: AFM images showed three-probe sites transitioning from mobile individual probes (pre-reaction) to connected structures after crawling (without release primer). Denaturing PAGE confirmed full record length (~100 nt). qPCR demonstrated rapid, catalytic record production: within ~30 min, record counts exceeded the number of origami templates, reaching a ~100-fold excess over templates after 3 h under optimized conditions.
Scalability: A ten-probe track with repeated probe arrangements supported extended crawling across multiple steps, confirmed by AFM.
Random crawling and counting: Universal probe design enabled initiation, stepping, and release at any site, producing distinct-length records. Streptavidin tetramer produced four discrete record lengths corresponding to visits to 1–4 subunits; in absence of streptavidin, only single-site records formed. Programmed DNA-origami complexes of sizes 2–6 yielded distinct bands where the number of bands equaled complex size minus one, demonstrating tunable counting and facile readout from low-concentration surface samples via PCR amplification.
Path deconvolution: In a square track, length-3 and length-4 products arose from two distinct paths each; NGS revealed distributions favoring shorter (non-diagonal) paths, indicating kinetic preference.
Multivalent interactions in cells: Trivalent records (α–β–EB1) decreased upon microtubule disassembly and recovered upon regrowth, while monovalent records for alpha-tubulin and EB1 remained comparable across treatments. In a representative gel, trivalent band intensities were 28.4% (nocodazole) and 93.9% (recovered) relative to group (1); pre-amplification record quantities were estimated to differ by −68.9% and −4.7%, respectively, relative to group (1) based on calibration. Across n = 8 experiments, normalized trivalent intensities showed significant group differences (****P ≤ 0.0001, ***P ≤ 0.001). Fluorescence imaging using a ‘d’-complementary imager localized trivalent records near cell peripheries in growing states; group (2) showed flat, low signals (74% below the basal level of group (1)), which recovered in group (3).
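The path deconvolution step amounts to grouping sequenced records by length and comparing how often each barcode order occurs within a length class. The sketch below shows that tally. The barcode labels `S`, `X`, `Y`, `F` (start, two barcoded ‘b–b’ probes, finish) and the read counts are hypothetical placeholders, not data from the study.

```python
from collections import Counter, defaultdict


def path_fractions(records):
    """Return {record_length: {path: share of reads within that length}}."""
    counts = Counter(tuple(r) for r in records)
    totals = defaultdict(int)
    for path, n in counts.items():
        totals[len(path)] += n
    fractions = defaultdict(dict)
    for path, n in counts.items():
        fractions[len(path)][path] = n / totals[len(path)]
    return {length: dict(paths) for length, paths in fractions.items()}


# Placeholder reads for illustration only (NOT measured values).
reads = (
    [["S", "X", "F"]] * 3
    + [["S", "Y", "F"]] * 1
    + [["S", "X", "Y", "F"]] * 2
    + [["S", "Y", "X", "F"]] * 2
)

shares = path_fractions(reads)
for length in sorted(shares):
    for path, share in sorted(shares[length].items()):
        print(length, "-".join(path), share)
```

Read shares computed this way are only a proxy for product ratios, since amplification and sequencing can favor some records over others.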
The crawler agents autonomously survey molecular landscapes by sequentially copying barcoded information from proximal DNA-labeled targets and concatenating it into records that reflect paths and proximities. The non-destructive, catalytic release-reset cycle enables repeated sampling from intact targets, amplifying weak signals without depleting probes. Demonstrations show that record lengths and contents yield quantitative insights: (i) maximum record length reports the number of subunits in a complex (streptavidin and DNA-origami-based complexes), and (ii) concatenated records capture multivalent proximities, enabling discrimination of cellular states (microtubule growth vs disassembly vs recovery) through trivalent α–β–EB1 colocalization. High-throughput sequencing further resolves trajectory degeneracy and reveals kinetic preferences. These capabilities address limitations of prior destructive or pairwise-limited methods and establish a foundation for scalable mapping of complex molecular interaction networks in situ with nanoscale resolution. Potentially, agent–agent ‘swarm’ interactions could be engineered so that records from one agent guide others, expanding computational and mapping power.
This work introduces molecular robotic ‘crawlers’ that autonomously traverse DNA-labeled targets, non-destructively generate and release sequence records reflecting trajectories, and thereby support quantitative analyses such as counting subunits and detecting multivalent proximities directly in situ. The approach scales to many steps, amplifies signals from low-abundance samples, and can be read out by gels, PCR, and sequencing, with spatial confirmation by microscopy. Future directions include broader labeling modalities (genetic tags, aptamers, nanobodies), direct chromosomal targeting for protein–DNA and 3D genome analyses, engineering swarm-like multi-agent behaviors, and massively parallel labeling and sequencing to build comprehensive nanoscale interaction maps within cells.
Sequencing read ratios do not directly reflect true product ratios due to potential PCR and sequencing biases, which can affect quantitative interpretation of path distributions. Statistical and reproducibility notes indicate that experiments were not randomized or blinded; some datasets (e.g., certain AFM imaging instances) had limited repetitions, though similar patterns were tested multiple times. The multivalent detection studies were performed in fixed cells, and fluorescence/localization metrics and gel quantifications were based on the stated sample sizes and conditions.
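To see why read ratios can drift from true product ratios, consider the standard idealized model of exponential amplification, N = N0 × (1 + E)^c, where E is the per-cycle efficiency and c the cycle number. The sketch below applies it to two records; the efficiencies and cycle count are hypothetical values chosen for illustration, and the function name is not from the study.

```python
def amplified_ratio(n0_a, n0_b, eff_a, eff_b, cycles):
    """Ratio A/B after PCR under the idealized model N = N0 * (1 + eff)**cycles."""
    if not (0.0 <= eff_a <= 1.0 and 0.0 <= eff_b <= 1.0):
        raise ValueError("efficiencies must lie in [0, 1]")
    a = n0_a * (1.0 + eff_a) ** cycles
    b = n0_b * (1.0 + eff_b) ** cycles
    return a / b


# Equal starting amounts, slightly different (hypothetical) efficiencies.
skew = amplified_ratio(1.0, 1.0, 0.95, 0.90, 20)
print(f"apparent A/B ratio after amplification: {skew:.2f}")
```

Even a small efficiency gap compounds over many cycles, which is one reason path distributions from sequencing are best read as qualitative preferences.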