
Combining machine learning and nanopore construction creates an artificial intelligence nanopore for coronavirus detection
M. Taniguchi, S. Minami, et al.
This study by Masateru Taniguchi and colleagues describes a method for rapid coronavirus detection that combines solid-state nanopores with AI. On an independent set of clinical saliva specimens, it reached 90% sensitivity and 96% specificity in 5 minutes, without RNA extraction.
Introduction
The study addresses the need for rapid, high-throughput viral detection methods that avoid the time-consuming RNA extraction required by RT-PCR. Coronaviruses, including SARS-CoV, MERS-CoV, and SARS-CoV-2, cause severe respiratory diseases and pandemics, underscoring the importance of fast diagnostics. Solid-state nanopores can capture biophysical signatures of particles (e.g., viruses) as transient modulations of ionic current, encoding information about size, structure, and surface charge. Prior laboratory demonstrations showed that AI can classify single-virus waveforms without genome extraction, but translation to reliable diagnostics is limited by constraints in nanopore device manufacturing consistency and dedicated measurement/flow systems. This work proposes and evaluates an AI-assisted nanopore platform designed for robust, accurate coronavirus detection, including direct testing of clinical saliva without RNA extraction.
Literature Review
Prior work established solid-state nanopores as versatile sensors for nucleic acids, proteins, viruses, and bacteria, with signal characteristics influenced by pore geometry, electrical and fluidic properties (e.g., Branton 2008; Dekker 2007; Howorka & Siwy 2009; Wanunu 2012). Studies have demonstrated discrimination of biomolecules and particles, including viruses and bacteria, from ionic current waveforms and the application of machine learning to enhance classification performance (e.g., Arima 2018; Tsutsui 2017; Meyer 2020). However, clinical deployment has been hindered by variability in nanopore fabrication, measurement electronics, and fluidics affecting signal reproducibility. The present study builds on these foundations by mass-fabricating precise nanopore modules, integrating stable electrodes and portable electronics, and employing machine learning to classify coronavirus signals directly from waveforms in cultured samples and clinical saliva.
Methodology
Platform and device: The AI-nanopore platform comprises (i) a nanopore module (25×25×5 mm) integrating a silicon chip (5×5×0.5 mm) with a 50-nm SiN membrane containing pores approximately 300 nm in diameter, (ii) hydrophilic crossbar fluidic channels on both sides (cis/trans) with integrated Ag/AgCl electrodes, and (iii) a portable high-speed current measurement instrument (nanoSCOUTER) interfaced to server-hosted AI software (Aipore-ONE). Silicon chips were mass-produced on 12-inch wafers and diced, giving pore diameters within ±10 nm of target at about 90% yield. The pore diameter can be tuned to the target particle size.
Measurement setup: Specimens (15 µl) are pipetted into the cis channel and buffer into the trans channel. A bias voltage of +0.1 V or −0.1 V is applied. For performance characterization, monodisperse polystyrene nanoparticles (200 and 220 nm) in 1×PBS were measured at 0.1 V. For viruses, cultured HCoV-229E was measured at −0.1 V; translocation was verified by RT-PCR of the trans channel after acquiring ~1200 waveforms. Concentration dependence was examined by varying the HCoV-229E titer; electrophoresis drove virus passage through the pore, whereas diffusion alone did not yield detectable virus in the trans channel.
Clinical specimens: Saliva specimens (PCR-positive and PCR-negative for SARS-CoV-2) were filtered through a 0.45-µm membrane and diluted with 1×PBS. Unlike cultured HCoV-229E, SARS-CoV-2 in saliva produced more events at +0.1 V than −0.1 V, consistent with differing surface charge in clinical matrices; virus passage at +0.1 V was confirmed by RT-PCR of the trans channel after ~3000 waveforms.
Signal processing and feature engineering: An in-house signal extraction tool automatically identifies translocation pulses from ionic current-time traces. From each waveform, features are computed: peak current amplitude (Ip), duration (td/ta), current vectors (I1..I10) and time vectors (t1..t10) obtained by decile segmentation along current and time, and combinations of these features (e.g., Ip+td). Features from all events are merged to train classifiers.
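The feature extraction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' in-house tool: the function name `extract_features`, the baseline-subtracted input, and the exact definitions of the current and time vectors (mean current in ten equal time segments, and fraction of the pulse spent at or above ten equally spaced current levels) are all assumptions made for the example.

```python
from typing import Dict, List


def extract_features(current: List[float], dt: float, n_seg: int = 10) -> Dict[str, object]:
    """Compute simple features from one baseline-subtracted, upward-pointing pulse.

    current: pulse samples with the baseline already removed (hypothetical input).
    dt: sampling interval in seconds.
    """
    n = len(current)
    if n < n_seg:
        raise ValueError("pulse is shorter than the number of segments")

    ip = max(current)   # peak current amplitude (Ip)
    if ip <= 0:
        raise ValueError("pulse has no positive peak")
    td = n * dt         # pulse duration (td)

    # Current vector (assumed definition): mean current in each of n_seg
    # equal-width time segments, normalised by Ip.
    current_vec = []
    for k in range(n_seg):
        lo = k * n // n_seg
        hi = (k + 1) * n // n_seg
        seg = current[lo:hi]
        current_vec.append(sum(seg) / len(seg) / ip)

    # Time vector (assumed definition): fraction of the pulse duration spent
    # at or above each of n_seg equally spaced levels (10%, 20%, ..., 100% of Ip).
    time_vec = []
    for k in range(1, n_seg + 1):
        level = ip * k / n_seg
        above = sum(1 for x in current if x >= level)
        time_vec.append(above / n)

    return {"Ip": ip, "td": td, "I": current_vec, "t": time_vec}


# Toy triangular pulse sampled at 1 MHz (values are illustrative only).
pulse = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0,
         10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
features = extract_features(pulse, dt=1e-6)
```

Normalising both vectors keeps the shape information separate from the absolute amplitude and duration, which are already carried by Ip and td; whether the original tool normalises in this way is not stated in the summary.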
Machine learning: For nanoparticles and cultured viruses, supervised learning is used to maximize the F-value, computed from the confusion matrix. For clinical saliva, a positive–unlabeled (PU) learning scheme is applied, assuming noise waveforms are shared between PCR-negative and PCR-positive samples; the model learns to separate SARS-CoV-2 waveforms from noise using PCR-negative data as noise and PCR-positive data as a mixture of noise and virus waveforms. A random forest classifier provided the best single-waveform performance in clinical data. Confidence scores per waveform are derived from classifier outputs and F-values. A specimen-level positive ratio (fraction of waveforms classified as positive) is computed; comparison with a learned threshold yields the specimen diagnosis. Performance metrics (sensitivity, specificity) are computed from confusion matrices at measurement durations of 1–5 minutes.
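As a rough sketch of the two quantities named above, the following shows an F-value computed from confusion-matrix counts and a specimen-level decision made from the positive waveform ratio. It assumes the F-value is the usual F1 score (harmonic mean of precision and recall); the function names, the per-waveform score cutoff of 0.5, and the ratio threshold of 0.3 are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def f_value(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion-matrix counts (assumed meaning of 'F-value')."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def positive_ratio(waveform_scores: List[float], score_cutoff: float = 0.5) -> float:
    """Fraction of waveforms whose classifier score is at or above the cutoff."""
    if not waveform_scores:
        return 0.0
    n_pos = sum(1 for s in waveform_scores if s >= score_cutoff)
    return n_pos / len(waveform_scores)


def diagnose(waveform_scores: List[float], ratio_threshold: float,
             score_cutoff: float = 0.5) -> Tuple[bool, float]:
    """Call a specimen positive when its positive ratio reaches the threshold."""
    ratio = positive_ratio(waveform_scores, score_cutoff)
    return ratio >= ratio_threshold, ratio


# Illustrative values only: five waveform scores from one hypothetical specimen.
scores = [0.9, 0.8, 0.2, 0.7, 0.1]
is_positive, ratio = diagnose(scores, ratio_threshold=0.3)  # hypothetical threshold
example_f = f_value(tp=8, fp=2, fn=2)                       # hypothetical counts
```

In the study the ratio threshold is learned from training specimens; here it is simply passed in as a parameter.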
Biosafety and instruments: HCoV-229E was handled in BSL-2; SARS-CoV, SARS-CoV-2, and MERS-CoV in BSL-3. NanoSCOUTER settings: bias 0.1 V (unless noted), transimpedance gain 10^7, bandwidth ~260 kHz, sampling 1 MHz.
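To make the instrument settings concrete, the short sketch below converts them into per-sample quantities. The sampling rate, bandwidth, and gain come from the text; treating the gain as volts per ampere and the 1 ms pulse duration used in the example are assumptions.

```python
SAMPLING_RATE_HZ = 1_000_000   # 1 MHz sampling, from the text
BANDWIDTH_HZ = 260_000         # ~260 kHz bandwidth, from the text
GAIN_V_PER_A = 1e7             # 10^7 transimpedance gain; the V/A unit is an assumption


def volts_to_amps(v_out: float) -> float:
    """Convert amplifier output voltage to ionic current."""
    return v_out / GAIN_V_PER_A


def samples_in_pulse(duration_s: float) -> int:
    """Number of samples recorded during a pulse of the given duration."""
    return round(duration_s * SAMPLING_RATE_HZ)


sample_interval_s = 1 / SAMPLING_RATE_HZ   # 1 microsecond between samples
nyquist_hz = SAMPLING_RATE_HZ / 2          # 500 kHz, above the ~260 kHz bandwidth

# Hypothetical example: a 1 ms pulse and a 10 mV output swing.
n_samples = samples_in_pulse(1e-3)         # 1000 samples
current_a = volts_to_amps(0.010)           # about 1 nA under the assumed unit
```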
PCR: Standard RT-qPCR protocols quantified viral RNA (HCoV-229E RdRp-target; SARS-CoV-2 N1/N2 targets) for validation and viral load estimation.
Cohorts: Training used 40 PCR-positive and 40 PCR-negative saliva specimens (stored refrigerated). Independent testing used 50 PCR-positive and 50 PCR-negative specimens. Numbers of nanopore modules and waveforms are detailed in Supplementary Tables.
Key Findings
- Device reproducibility: Mass-fabricated nanopore modules achieved ±10 nm diameter accuracy and ~90% yield; integrated electrodes enabled stable, reproducible current measurements.
- Nanoparticle discrimination: Using single ionic current-time waveforms and combined features, the classifier distinguished 200 nm vs 220 nm polystyrene nanoparticles with 97% accuracy per event; two events would reach ≥99.9% classification accuracy.
- Cultured coronavirus translocation: HCoV-229E at 100 pfu/µl produced ~14.2 pulses/min at −0.1 V; RT-PCR of the trans channel confirmed electrophoresis-driven virus passage through the nanopore. Diffusion without applied voltage for 6 h yielded no detectable virus in the trans channel.
- Multi-virus discrimination (cultured): The platform accurately identified closely sized coronaviruses (HCoV-229E, SARS-CoV, MERS-CoV, SARS-CoV-2) from single-event waveforms using machine learning. Discrimination between MERS-CoV and HCoV-229E was the most challenging among pairs.
- Clinical saliva diagnosis (training evaluation): With 40 PCR-positive and 40 PCR-negative saliva specimens, single-waveform discrimination achieved F=1.00, yielding 100% sensitivity and 100% specificity across 1–5 min measurement durations.
- Clinical saliva diagnosis (independent test): On 50 PCR-positive and 50 PCR-negative specimens not used in training, sensitivity and specificity improved with measurement time, reaching 90% sensitivity and 96% specificity at 5 minutes. True positives showed sustained high-confidence positive events and high positive waveform ratios; true negatives showed low positive confidence and low ratios; false positives/negatives exhibited intermediate patterns.
- Generalizability: Control analyses indicated device-to-device variations within the ±10 nm fabrication tolerance did not impact classification F-values. Preliminary cross-virus extension showed high discrimination between cultured SARS-CoV-2 and influenza A (H1N1) with F=0.90, suggesting adaptability to differential diagnosis.
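The headline numbers above can be checked with simple arithmetic. The sketch below shows one reading that reproduces the two-event figure, namely that per-event errors are independent so that both events are misclassified with probability (1 − 0.97)², and it recovers the specimen counts implied by 90% sensitivity and 96% specificity on 50 positive and 50 negative specimens. The independence assumption and the function names are illustrative, not taken from the study.

```python
from typing import Dict


def prob_not_all_wrong(per_event_accuracy: float, n_events: int) -> float:
    """Probability that at least one of n independent events is classified correctly."""
    return 1.0 - (1.0 - per_event_accuracy) ** n_events


def counts_from_rates(sensitivity: float, specificity: float,
                      n_pos: int, n_neg: int) -> Dict[str, int]:
    """Confusion-matrix counts implied by sensitivity and specificity."""
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    return {"TP": tp, "FN": n_pos - tp, "TN": tn, "FP": n_neg - tn}


two_event = prob_not_all_wrong(0.97, 2)          # about 0.9991
counts = counts_from_rates(0.90, 0.96, 50, 50)   # TP=45, FN=5, TN=48, FP=2
overall_accuracy = (counts["TP"] + counts["TN"]) / 100   # 0.93
```

Under these figures, 5 of the 50 PCR-positive specimens were missed and 2 of the 50 PCR-negative specimens were flagged as positive at 5 minutes.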
Discussion
The AI-nanopore platform directly detects intact viruses from ionic current waveforms, bypassing RNA extraction and enabling rapid, high-throughput diagnostics. Machine learning extracts temporal and amplitude features from single events to differentiate particles and viruses whose simple metrics (Ip, td) overlap. Tight control of nanopore fabrication and integrated measurement electronics mitigate device-induced variability, supporting robust classification across modules. In clinical saliva, a PU-learning-based approach separates SARS-CoV-2 signals from shared noise, enabling specimen-level diagnosis within minutes. Independent testing demonstrated 90% sensitivity and 96% specificity in 5 minutes, which suits screening contexts where high throughput is essential. Because the platform can be retrained on new targets (e.g., influenza A vs SARS-CoV-2), it is a candidate diagnostic tool for respiratory viruses with overlapping clinical presentations. The findings indicate that electrophoretic translocation at the appropriate polarity enhances detection and that waveform-derived confidence dynamics correlate with true infection status.
Conclusion
This work introduces an artificial intelligence nanopore platform that combines mass-fabricated solid-state nanopores, portable precision current measurement, and machine learning to rapidly detect coronaviruses without RNA extraction. It accurately discriminates closely sized nanoparticles and multiple coronavirus species from single-event waveforms and diagnoses SARS-CoV-2 in clinical saliva with 90% sensitivity and 96% specificity in 5 minutes on an independent test set. The approach is scalable, robust to device variations within fabrication tolerance, and adaptable to other respiratory viruses via retraining (e.g., the demonstrated discrimination of SARS-CoV-2 from influenza A). Future work should expand clinical cohorts, broaden pathogen panels for multiplexed differential diagnosis, optimize pore size and surface chemistry for diverse specimens, and integrate on-device analytics for point-of-care deployment.
Limitations
- Overfitting risk: Perfect performance in the training evaluation (100% sensitivity/specificity) decreased to 90%/96% on independent testing, indicating potential overfitting and the need for larger, more diverse training sets.
- Biosafety and experimental constraints: Simultaneous measurement of multiple pathogenic coronaviruses on the same physical pore was not feasible due to BSL-level segregation and cross-contamination concerns, limiting direct assessment of device-specific confounding across all viruses.
- Challenging class separations: Discrimination between certain coronavirus pairs (e.g., MERS-CoV vs HCoV-229E) was more difficult, suggesting overlapping waveform features and the need for enhanced feature sets or tailored pore conditions.
- Cultured-virus detection limit characterization: High-titer culturing limitations (e.g., HCoV-229E threshold at 250 pfu/µl) constrained detailed limit-of-detection assessments across a broad concentration range.
- Matrix effects: Clinical saliva required filtration and dilution; differing surface charges between cultured and clinical virus affected optimal bias polarity, implying that specimen matrix and preparation can influence performance.
- Assumptions in PU learning: The approach assumes noise waveforms are shared between PCR-negative and PCR-positive specimens; deviations from this assumption could impact classifier calibration.