Introduction
Atom manipulation with a scanning tunneling microscope (STM), first demonstrated in 1990, is a key technique for creating atomically precise structures, both to study exotic quantum states and to miniaturize computational devices. Artificial structures on metal surfaces allow tuning of electronic and spin interactions, enabling designer quantum states of matter. The technique has since been extended to platforms including superconductors, 2D materials, semiconductors, and topological insulators, opening access to novel quantum phenomena. Atom manipulation is also used to build atomic-scale computational devices such as quantum and classical logic gates, memory, and Boltzmann machines.
Precise adatom arrangement requires tuning tip-adatom interactions to overcome energetic barriers. These interactions are controlled via tip position, bias, and tunneling conductance, parameters that are not known a priori. Incorrect parameter selection can lead to imprecise control, tip crashes, and unintentional rearrangement of neighboring adatoms. Spontaneous tip apex changes further complicate the process, requiring human intervention to find new parameters or reshape the tip apex.
Deep reinforcement learning (DRL) has emerged as a powerful method for solving nonlinear stochastic control problems. DRL agents use deep neural networks as function approximators, allowing them to learn through trial and error in high-dimensional, dynamic environments. State-of-the-art DRL algorithms offer improved data efficiency and stability, making them suitable for real-world applications. Machine learning has already been integrated into scanning probe microscopy to address a range of issues, and DRL has been used to automate tip preparation and vertical molecular manipulation. This work explores the use of DRL to achieve atomic precision in atom manipulation.
Literature Review
The literature review surveys existing techniques and challenges in atomic-scale manipulation with STM, discussing the limitations of traditional methods in the face of unpredictable parameters, tip apex changes, and complex tip-adatom interactions. It also covers recent advances in DRL and their potential for addressing these challenges. Studies on machine learning in scanning probe microscopy, and on successful applications of DRL to related automation tasks, are cited to support the rationale for using DRL for precise atom manipulation.
Methodology
The researchers formulated atom manipulation control as a reinforcement learning (RL) problem, specifically a Markov decision process (MDP). The DRL agent's goal is to move an adatom to a target position precisely and efficiently. Each episode starts with a randomly drawn target position, and the agent is allowed a limited number (N = 5) of manipulation attempts. The state s_t at each time step t consists of the XY coordinates of the target and of the current adatom position, extracted from STM images. Based on its current policy π, the agent selects an action a_t: a six-dimensional vector comprising the bias V, the tunneling conductance G, and the start and end XY coordinates of the tip movement.
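To make the MDP concrete, here is a minimal sketch in Python of the state and action layout described above; the arena size and success threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np

N_MAX = 5          # manipulation attempts allowed per episode
EPS = 0.15         # success radius in nm (hypothetical value)

def reset_episode(rng, arena_nm=2.0):
    """Start an episode: draw a random target and read the adatom
    position (randomized here in place of an actual STM scan)."""
    target = rng.uniform(-arena_nm, arena_nm, size=2)
    adatom = rng.uniform(-arena_nm, arena_nm, size=2)
    return np.concatenate([target, adatom])        # state s_t, shape (4,)

def split_action(a):
    """Unpack the six-dimensional action a_t = [V, G, x0, y0, x1, y1]:
    bias, tunneling conductance, and tip-movement start/end points."""
    return a[0], a[1], a[2:4], a[4:6]
```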
A combined convolutional neural network (CNN) and empirical-formula method classifies whether the adatom has moved after each action. If movement is detected, a scan updates the adatom position; otherwise, the scan is often skipped to save time. The agent receives a reward r_t designed to encourage accurate placement and efficient manipulation, and the experience (s_t, a_t, r_t, s_{t+1}) is stored in a replay memory buffer.
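The paper's exact reward shaping is not reproduced here; a minimal sketch of a reward that favors precise, efficient placement, assuming the four-vector state above, could look like:

```python
import numpy as np

def reward(s_next, eps=0.15):
    """Illustrative reward r_t: a bonus for landing within eps of the
    target, plus a shaping term penalizing the remaining error. The
    authors' actual formula may differ."""
    err = np.linalg.norm(s_next[2:4] - s_next[0:2])   # adatom-target distance
    return (1.0 if err < eps else 0.0) - err
```

Because the distance penalty is incurred at every step, a reward of this shape also implicitly pushes the agent toward finishing an episode in fewer manipulations.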
The soft actor-critic (SAC) algorithm, known for its robustness and sample efficiency, is used to train the agent. The policy (actor) and Q-functions (critics) are represented by multilayer perceptrons, trained by stochastic gradient descent on experiences drawn from the replay buffer. Two data-efficiency techniques are incorporated: Hindsight Experience Replay (HER), which relabels episodes with the goals actually achieved so that failed attempts still provide learning signal, and Emphasizing Recent Experience (ERE) sampling, which prioritizes recent experiences so the agent can adapt to environmental changes such as tip apex modifications.
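Minimal sketches of the two techniques follow; the buffer layout, reward shape, and the ERE constants η and c_min are assumptions (the schedule c_k = max(N·η^(k·1000/K), c_min) follows the original ERE proposal):

```python
import numpy as np

def her_relabel(episode, eps=0.15):
    """Hindsight Experience Replay: pretend the adatom's final position
    was the goal all along, so a failed episode still yields signal.
    States are 4-vectors [target_x, target_y, adatom_x, adatom_y]."""
    final_xy = episode[-1][3][2:4].copy()   # adatom position at episode end
    relabeled = []
    for s, a, _, s_next in episode:
        s, s_next = s.copy(), s_next.copy()
        s[0:2] = s_next[0:2] = final_xy     # overwrite the goal coordinates
        err = np.linalg.norm(s_next[2:4] - final_xy)
        relabeled.append((s, a, (1.0 if err < eps else 0.0) - err, s_next))
    return relabeled

def ere_sample(buffer, k, K, rng, eta=0.996, c_min=2500, batch=64):
    """Emphasizing Recent Experience: the k-th of K gradient updates
    after an episode samples only from the most recent c_k transitions,
    biasing learning toward data gathered under the current tip state."""
    n = len(buffer)
    c_k = max(int(n * eta ** (k * 1000 / K)), min(c_min, n))
    idx = rng.integers(n - c_k, n, size=batch)
    return [buffer[i] for i in idx]
```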
The atom movement classification algorithm combines a CNN classifier trained on tunneling current traces with an empirical formula based on spikes in the current trace, reducing the need for frequent STM scans. For estimating adatom placement, the authors numerically computed the probability P(X_adatom = X_nearest | ε), considering either fcc sites only or both fcc and hcp sites. During construction of the artificial lattice, the Hungarian algorithm handles adatom-to-site assignment and the rapidly-exploring random tree (RRT) algorithm plans manipulation paths, as sketched below. The SAC hyperparameters, including the optimizer, learning rate, and replay buffer size, are specified in the paper.
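For the assignment step, SciPy's linear_sum_assignment provides a compact Hungarian-algorithm sketch; the plain Euclidean cost is an assumption, as the actual construction may weight candidate paths differently:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_adatoms(current_xy, target_xy):
    """Match each adatom to a target site so total travel distance is
    minimized. Inputs are (n, 2) arrays of coordinates in nm."""
    cost = np.linalg.norm(current_xy[:, None, :] - target_xy[None, :, :],
                          axis=-1)
    rows, cols = linear_sum_assignment(cost)     # Hungarian algorithm
    return list(zip(rows, cols)), cost[rows, cols].sum()

# Example: assign three adatoms to three lattice sites
pairs, total = assign_adatoms(np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
                              np.array([[1.0, 1.0], [0.0, 2.0], [2.0, 0.0]]))
```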
Key Findings
The DRL agent significantly improved its performance during training, reducing manipulation error and reaching a 100% success rate over 100 episodes after approximately 2,000 training episodes (~6,000 manipulations), a count comparable to previous large-scale atom-assembly experiments. The agent's performance is robust against tip apex changes, recovering within a few hundred episodes after a tip change. The mean manipulation error was 0.089 nm, well below one lattice constant (0.288 nm). Probabilistic estimation based on the geometry of fcc and hcp sites suggests the atoms were placed at the nearest site in 93% of cases when only fcc sites are considered, and 61% when both fcc and hcp sites are considered.
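One way to reproduce this kind of estimate is a Monte Carlo over the Ag(111) site geometry. The isotropic Gaussian error model and its width are modeling assumptions (a per-axis standard deviation of 0.07 nm gives a mean radial error of about 0.088 nm, close to the reported 0.089 nm), whereas the authors integrate over the error distribution directly:

```python
import numpy as np

A = 0.288                                  # Ag(111) lattice constant, nm
rng = np.random.default_rng(0)

def sites(include_hcp, n=3):
    """Triangular fcc lattice around the origin; hcp sites are offset
    from fcc sites by (a1 + a2) / 3, i.e. by a / sqrt(3) in distance."""
    i, j = np.mgrid[-n:n + 1, -n:n + 1].reshape(2, -1)
    fcc = np.stack([A * (i + 0.5 * j), A * (np.sqrt(3) / 2) * j], axis=1)
    if not include_hcp:
        return fcc
    return np.vstack([fcc, fcc + np.array([A / 2, A * np.sqrt(3) / 6])])

def p_nearest(sigma=0.07, include_hcp=True, trials=20_000):
    """Estimate P(nearest site = intended site) when the placement error
    around the intended (origin) site is isotropic Gaussian."""
    pts = sites(include_hcp)
    origin = int(np.argmin(np.linalg.norm(pts, axis=1)))  # intended site
    err = rng.normal(0.0, sigma, size=(trials, 1, 2))
    nearest = np.argmin(np.linalg.norm(err - pts[None, :, :], axis=-1), axis=1)
    return float(np.mean(nearest == origin))

# p_nearest(include_hcp=False) comes out higher, since fcc-only candidate
# sites are farther apart (0.288 nm vs 0.166 nm).
```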
A comparison with a baseline algorithm using fixed manipulation parameters showed that the DRL agent is more robust against tip changes: while the baseline's performance varied markedly under different tip conditions, the DRL agent maintained relatively good performance and, with continued training under new conditions, reached success rates above 95%. The distribution of manipulation-induced adatom movements indicated that Ag adatoms occupy both fcc and hcp sites, and the lattice orientation inferred from these movements was consistent with atomically resolved imaging. Finally, the trained DRL agent autonomously constructed a 42-atom kagome lattice with atomic precision, demonstrating the effectiveness of the integrated system combining the DRL agent with the Hungarian algorithm for adatom assignment and the RRT algorithm for path planning.
Discussion
The successful training of a DRL model to manipulate atoms with atomic precision highlights the potential of AI in tackling atomic-level challenges. The method provides a robust and efficient technique for automating the creation of artificial structures and atomic-scale computational devices. The DRL approach's ability to learn directly from environmental interactions, without needing a pre-existing model, makes it promising for discovering stable manipulation parameters in novel systems. This study represents a significant step toward integrating artificial intelligence into nanofabrication processes.
Conclusion
This research demonstrates the successful application of deep reinforcement learning (DRL) for precise atom manipulation. By combining state-of-the-art RL algorithms and a carefully designed RL framework, the DRL agent achieves atomic precision with high data efficiency and robustness to tip apex changes. This work represents a milestone in using AI for nanofabrication automation and opens new avenues for controlling increasingly complex atomic-scale experiments. Future research could explore the application of DRL to other atom types and substrates, as well as the development of more sophisticated reward functions and path planning algorithms.
Limitations
While the DRL agent demonstrated excellent performance, the study's scope is limited to Ag adatoms on Ag(111) surfaces, and the generalizability of the method to other atom-surface combinations needs further investigation. The success of the autonomous kagome-lattice construction relies on the accuracy of the atom movement classification algorithm and the efficiency of the path planning algorithms; improvements there could further increase the speed and reliability of atomic assembly. Finally, the current reward function might not be optimal, and further exploration of reward function design could improve the efficiency of the learning process.