Artificial-intelligence-driven scanning probe microscopy
A. Krull, P. Hirsch, et al.
DeepSPM is an AI framework that enables autonomous operation of scanning probe microscopes. This research by A. Krull, P. Hirsch, C. Rother, A. Schiffrin, and C. Krull shows how machine learning can sustain surface imaging at atomic precision and keep data acquisition running even in demanding conditions.
Introduction
The study addresses the fundamental challenge that, while SPM enables atomic-scale imaging and manipulation, effective operation typically requires continuous expert supervision due to varying probe states and sample conditions that degrade image quality and yield. Human operators currently select regions, assess image quality, and perform heuristic, trial-and-error probe conditioning steps. Prior attempts to improve efficiency via probe characterization or scripted automation remain limited in generality and do not manage tip quality robustly. The research question is whether an artificial intelligence system can autonomously operate an SPM end-to-end—selecting imaging regions, assessing image quality, diagnosing issues, and conditioning the probe—thereby maintaining high-quality data acquisition over long periods without human intervention. The paper proposes DeepSPM, combining algorithmic region selection, a supervised CNN classifier for probe quality assessment, and a deep reinforcement learning agent for intelligent probe conditioning, to achieve reliable autonomous SPM operation.
Literature Review
The paper reviews approaches that link probe morphology to image quality through analytical simulations and inverse imaging, and probe characterization/manipulation techniques (e.g., field ion microscopy) that can mitigate the need for heuristic conditioning but are difficult to generalize or scale for large datasets. Automation efforts include scripted SPM operation and autonomous region selection/measurement in AFM, but these systems are constrained to specific conditions and do not actively manage probe quality. Supervised machine learning has assisted in detecting and repairing specific probe defects in specialized systems (e.g., hydrogen-terminated silicon) using trained CNNs and known conditioning protocols, and separate work estimated imaging quality from partial scan lines. However, fully autonomous operation under variable probe defects and without fixed conditioning protocols has not been demonstrated. Reinforcement learning offers a path to learn decision strategies without explicit labels or rules, motivating its use for adaptive probe conditioning in SPM.
Methodology
DeepSPM is built as a closed-loop autonomous controller for SPM. The workflow has four stages:
(1) Approach: The system brings the probe toward the sample until it detects a measurement signal (e.g., tunneling current). It handles lost contact or crashes by monitoring the fine z-piezo extension and adjusting the coarse approach/retraction accordingly.
(2) Region selection: It maintains a binary map of scanned/forbidden regions, algorithmically identifies and avoids overly rough or contaminated areas based on apparent height distributions, and chooses the next imaging region by minimizing a cost that combines the Manhattan distance to the previous region with the Euclidean distance to the center of the approach area (weight α=1). It also blocks zones around detected rough regions and past conditioning locations, with radii scaled to event frequency or action type.
(3) Image acquisition: For each region, it discards an initial creep-distorted image and records a second image under standard parameters.
(4) Assessment: Preprocessing removes the background plane with RANSAC fitting, normalizes heights, and clips values. The system first checks for roughness, lost contact, or a crash; if none is detected, a classifier CNN estimates the probability of a "good probe", and the image is stored if the probe is judged good. If the probe is judged bad for ten consecutive images, a conditioning episode is initiated.
Probe conditioning via RL: A deep RL agent (double DQN) selects one of 12 expert-defined actions (voltage pulses or dips) to restore tip quality. An action CNN (VGG-like: 12 conv layers with 3×3 kernels and 64–512 feature maps, max-pooling after the first two blocks; two FC layers with 4096 units; ReLU activations; batch norm; dropout 0.5; Xavier initialization) takes a single 64×64 STM image and outputs Q-values for the 12 actions. Weights are initialized from the trained classifier CNN (except the output layer). Actions are executed at the center of the largest clean Ag(100) square area within the image, determined from a binary map and subject to per-action area requirements.
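The region-selection rule and the conditioning trigger described above can be illustrated with a minimal sketch. It is not the authors' implementation: the grid representation, the function names (`region_cost`, `next_region`, `should_condition`), and the choice to add the two distances with weight α are assumptions made for illustration. The 0.9 probability threshold is the classifier threshold reported in this summary, and treating a probability at or above it as "good" is likewise an assumption.

```python
import math

ALPHA = 1.0  # weight of the distance-to-center term (alpha = 1 in the summary)


def region_cost(candidate, previous, center, alpha=ALPHA):
    """Assumed cost: Manhattan distance to the previous region
    plus alpha times the Euclidean distance to the approach-area center."""
    manhattan = abs(candidate[0] - previous[0]) + abs(candidate[1] - previous[1])
    euclidean = math.hypot(candidate[0] - center[0], candidate[1] - center[1])
    return manhattan + alpha * euclidean


def next_region(blocked, previous, center, alpha=ALPHA):
    """Return the (row, col) of the cheapest free cell, or None if all are blocked.

    `blocked` is a 2D list of booleans: True marks a cell that was already
    scanned or is forbidden (rough area, past conditioning site).
    """
    best, best_cost = None, math.inf
    for r, row in enumerate(blocked):
        for c, is_blocked in enumerate(row):
            if is_blocked:
                continue
            cost = region_cost((r, c), previous, center, alpha)
            if cost < best_cost:  # strict '<' keeps the first cell on ties
                best, best_cost = (r, c), cost
    return best


def should_condition(good_probs, threshold=0.9, patience=10):
    """True once `patience` consecutive images fall below the 'good probe' threshold."""
    streak = 0
    for p in good_probs:
        streak = streak + 1 if p < threshold else 0
        if streak >= patience:
            return True
    return False
```

With this sketch, a `None` result from `next_region` would correspond to the case where the fine scan area is filled and a macro move to a new approach area is needed.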
Reward scheme: −1 per executed conditioning action and +10 for terminating an episode (reached when an image is first classified as good), which encourages short sequences. The agent uses ε-greedy exploration with experience replay (buffer size 15,000; prefill with 500 steps; batch size 64; Adam optimizer; learning rate 5×10⁻⁴; discount γ=0.95). ε decays linearly from 1.0 to 0.05 over 500 steps, then remains at 0.05. Training proceeds online during interaction; during testing, the agent continues learning with ε=0.05 and is compared to random action selection in an interleaved scheme.
Classifier CNN: Same backbone, with a single sigmoid output estimating P(good probe) and a decision threshold of 0.9. It is trained on 7589 labeled STM images (64×64) of MgPc/Ag(100) with Adam (lr=10⁻³), cross-entropy loss with L2 weight decay of 5×10⁻⁵, class weighting (good = 25%, weighted by 8), and data augmentation (random flips). The dataset is split 76/24 into training and test sets and is available online.
Experimental setup: Low-temperature UHV STM (Createc) operating at 4.6 K with a Pt/Ir tip; constant-current topography at Vbias=1 V, I=25 pA, scan speed 80 nm s⁻¹; fine scanner range ±425 nm at 4.6 K. Samples: MgPc sublimed at 650 K onto clean Ag(100) prepared by Ar⁺ sputtering and annealing; base pressure <1×10⁻⁹ mbar. Autonomous run procedures include an initial z-extension to −80% of range, a 120 s wait for drift/creep stabilization, and macro moves to new approach areas when the fine scan area is filled or the distance to the next region exceeds 500 nm.
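As a rough illustration of the exploration schedule and the double DQN update named above, the following sketch uses the reported hyperparameters (ε from 1.0 to 0.05 over 500 steps, γ=0.95, 12 actions). It assumes the per-action reward is a penalty of −1, which is what "encouraging short sequences" implies. The function names are hypothetical, plain lists stand in for network outputs, and the summary gives no detail on how or when a target network is synchronized, so that part is left out.

```python
import random

EPS_START, EPS_END, DECAY_STEPS = 1.0, 0.05, 500
GAMMA = 0.95
NUM_ACTIONS = 12
STEP_REWARD = -1.0      # assumed penalty per conditioning action
TERMINAL_REWARD = 10.0  # reward when the probe is first classified as good


def epsilon(step):
    """Linear decay from EPS_START to EPS_END over DECAY_STEPS, then constant."""
    if step >= DECAY_STEPS:
        return EPS_END
    frac = max(step, 0) / DECAY_STEPS
    return EPS_START + frac * (EPS_END - EPS_START)


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def select_action(q_values, step, rng=random):
    """Epsilon-greedy: random action with probability epsilon(step), else greedy."""
    if rng.random() < epsilon(step):
        return rng.randrange(len(q_values))
    return argmax(q_values)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=GAMMA):
    """Double DQN target: the online network picks the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_next = argmax(q_online_next)
    return reward + gamma * q_target_next[best_next]
```

Decoupling action selection (online network) from action evaluation (target network) is the standard double DQN device for reducing overestimated Q-values; whether DeepSPM adds further refinements is not stated in this summary.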
Key Findings
- The RL-based conditioning agent outperforms random action selection: during testing with active post-episode damage, the trained agent required on average about 28% fewer conditioning steps per episode than random selection (189 RL episodes vs 184 random episodes).
- Long-term autonomy: DeepSPM ran for 86 hours continuously, scanning 1.2 µm² of sample area and recording over 16,000 images. It handled 2 lost-contact events, identified and avoided 1075 overly rough regions, and initiated 117 probe-conditioning episodes.
- Data quality and decision validity: Manual inspection showed that 87% of images labeled "good" by DeepSPM were free of defects/artifacts, and approximately 86% of initiated conditioning episodes were judged necessary.
- Conditioning efficiency during autonomous operation: The RL agent achieved a mean episode length of 4.93 actions, about 34% shorter than in the testing scenario, attributed to less severe post-episode damage during normal operation.
Discussion
Findings demonstrate that an AI system can autonomously conduct SPM under varying conditions by selecting imaging regions, assessing image quality, and conditioning the probe as needed. Despite the stochastic nature of conditioning actions and the inability to directly control or infer the atomistic tip structure from single images, the RL agent learns action policies that yield better-than-random improvements, indicating actionable information in the image features and decision history. The process exhibits effective memory since the probe state depends on prior action-image sequences; continuous online training enables the agent to track evolving tip states and maintain performance. The approach eliminates the need for pre-defined conditioning protocols and expert heuristics by deriving strategies from interaction, freeing researcher time and moving SPM towards turnkey, non-expert operation. The framework is broadly applicable across SPM modalities and samples, and can be extended to spectroscopy (STS, KPFM) by incorporating spectroscopic quality criteria. Autonomous SPM also enables high-throughput data collection and scalable atomically precise nanofabrication workflows.
Conclusion
The paper introduces DeepSPM, an AI-driven autonomous SPM framework integrating algorithmic region selection, a supervised CNN for probe quality assessment, and a deep RL agent for intelligent probe conditioning. It demonstrates robust, multi-day autonomous STM operation with high usable data yield and efficient, learned conditioning that outperforms random strategies. The system generalizes in principle to diverse SPM techniques and setups given suitable training datasets, with source code to be released publicly. Future work includes integrating semi-automatic ML methods for early detection of adverse imaging conditions or identifying regions of interest, expanding to spectroscopy-driven criteria, enriching action sets to better control tip states, and broadening training across different samples and probe materials to enhance generalization.
Limitations
- Conditioning actions are probabilistic and do not directly control the atomic-scale tip structure; episode lengths vary and optimality is bounded by available actions.
- Single-image inputs do not reveal full atomistic tip morphology, making the process history-dependent and necessitating continuous online learning.
- Classifier training is specific to MgPc/Ag(100) STM images with a metallic tip; generalization to other materials, probes, and SPM modes requires new labeled datasets.
- The classifier distinguishes only good vs bad probe, not specific defect types, potentially limiting targeted conditioning strategies.
- Autonomous metrics depend on the chosen thresholds (e.g., ten consecutive bad images to trigger conditioning) and action-specific exclusion zones, which may require tuning for different setups.