Introduction
Advances in materials synthesis have enabled well-behaved nanostructures whose atomic structure and deformations dictate their properties and their performance in applications such as energy harvesting and electronics. Techniques such as high-resolution transmission electron microscopy (HRTEM) and annular dark field scanning transmission electron microscopy (ADF-STEM) reveal local atomic structure, but their field of view is limited. Nanobeam electron diffraction, which can map larger sample areas, has therefore gained attention. However, conventional detectors were too slow to record detailed structural information across an entire sample. The development of fast direct electron detectors, particularly the electron microscope pixel array detector (EMPAD), now allows a momentum-resolved nanobeam diffraction pattern to be collected at each scanning position in a STEM experiment, generating four-dimensional (4D) data. While EMPAD-enabled 4D-STEM offers strain mapping with sub-picometer precision across micrometer scales, current approaches rely heavily on prior knowledge of the sample structure. This limits their generalizability to materials with unexpected lattice deformations, which significantly influence material properties and device performance. Machine learning, particularly unsupervised learning, offers a promising way to analyze complex patterns in large datasets without labeled training data. This study uses a divisive hierarchical unsupervised clustering architecture to analyze 4D-STEM data rapidly and semi-automatically, mapping features based on their intrinsic characteristics and similarity.
Literature Review
The introduction extensively reviews existing techniques for analyzing nanomaterial structure and deformations. It highlights the limitations of conventional HRTEM and ADF-STEM in terms of field of view, and discusses the advantages of nanobeam electron diffraction. The limitations of previous nanobeam diffraction approaches relying on prior sample knowledge are pointed out. The emergence of machine learning techniques in microscopy, particularly unsupervised learning for tasks like identifying stacking order and twin boundaries, is mentioned as a motivation for the current work.
Methodology
The methodology is a three-step process applied to the 4D-STEM datasets:
1) **Preprocessing:** Diffraction patterns are aligned by their center of mass to correct for beam tilt during scanning, then masked to remove low-angle scattering and noise. A ring mask is applied, and features are selected based on the standard deviation of their intensities.
2) **Hierarchical Clustering:** A divisive hierarchical clustering approach is used, with K-means clustering applied in each round. The number of clusters (K) in each round is determined using the elbow method. This multi-round approach allows features to be identified at various length scales.
3) **Visualization:** Clustering results are visualized as real-space color-coded maps, with each color representing a structural feature. In addition, the high-dimensional diffraction patterns are projected onto a lower-dimensional manifold (using UMAP) to visualize the distribution and variance of the clusters.
The paper justifies the K-means parameters and the choice of K-means over other clustering methods, based on a comparison presented in the Supplementary Information.
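The preprocessing and clustering steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `keep_fraction` quantile used in place of a user-set standard deviation threshold, the farthest-point K-means initialization, the chord-distance elbow heuristic, and the fixed `n_rounds` stopping rule are all hypothetical choices made for the example. The UMAP projection is omitted.

```python
import numpy as np

def align_center_of_mass(data):
    """Roll each pattern so its intensity center of mass lands on the detector center.
    data: (scan_y, scan_x, k_y, k_x). Whole-pixel shifts only, a simplification."""
    ny, nx, ky, kx = data.shape
    yy, xx = np.mgrid[0:ky, 0:kx]
    out = np.empty_like(data)
    for i in range(ny):
        for j in range(nx):
            dp = data[i, j]
            total = dp.sum()  # assumed positive
            dy = int(round((ky - 1) / 2 - (dp * yy).sum() / total))
            dx = int(round((kx - 1) / 2 - (dp * xx).sum() / total))
            out[i, j] = np.roll(dp, (dy, dx), axis=(0, 1))
    return out

def select_features(data, r_inner, r_outer, keep_fraction=0.1):
    """Apply a ring mask, then keep the detector pixels whose intensity
    varies most across scan positions (assumed quantile-based threshold)."""
    ny, nx, ky, kx = data.shape
    yy, xx = np.mgrid[0:ky, 0:kx]
    r = np.hypot(yy - (ky - 1) / 2, xx - (kx - 1) / 2)
    ring = (r >= r_inner) & (r <= r_outer)
    flat = data.reshape(ny * nx, ky * kx)[:, ring.ravel()]
    std = flat.std(axis=0)
    return flat[:, std >= np.quantile(std, 1 - keep_fraction)]

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iterations with farthest-point initialization.
    Returns (labels, inertia), inertia being the summed squared distance."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1).sum()

def choose_k_elbow(X, k_max=5):
    """Pick K where the normalized inertia curve lies farthest below the
    straight line joining its end points (one common elbow heuristic)."""
    ks = np.arange(1, min(k_max, len(X)) + 1)
    inertia = np.array([kmeans(X, k)[1] for k in ks])
    if len(ks) < 3 or inertia[0] <= inertia[-1]:
        return 1
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (inertia - inertia[-1]) / (inertia[0] - inertia[-1])
    return int(ks[np.argmax(1 - x - y)])

def divisive_clustering(X, n_rounds=2, k_max=5, min_size=10):
    """Split every current cluster again in each round.
    Returns one label array per round; cluster ids are unique within a round."""
    labels = np.zeros(len(X), dtype=int)
    history = []
    for _ in range(n_rounds):
        new_labels = np.zeros_like(labels)
        next_id = 0
        for c in np.unique(labels):
            idx = np.flatnonzero(labels == c)
            k = choose_k_elbow(X[idx], k_max) if len(idx) >= min_size else 1
            sub, _ = kmeans(X[idx], k)
            new_labels[idx] = next_id + sub
            next_id += k
        labels = new_labels
        history.append(labels.copy())
    return history
```

Reshaping a round's labels to `(scan_y, scan_x)` gives the color-coded real-space map described in the visualization step. The sketch splits every cluster in every round, whereas the source describes the analysis as semi-automatic, so deciding which clusters merit another round would in practice involve user judgment.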
Key Findings
The method is applied to three different materials:
1) **WS₂-WSe₂ lateral heterojunction:** The method separates the WS₂ and WSe₂ regions and identifies the interface with high accuracy (99.9%). Subsequent clustering rounds reveal rotational periodicity and graded interfaces in WS₂, and continuous lattice tilts (ripples) in WSe₂, consistent with the literature.
2) **WS₂-WSe₂ superlattices:** The approach identifies different flakes, lattice differences between WS₂ and WSe₂, directional uniaxial strain, and minor ripples in the WSe₂ stripes, demonstrating its sensitivity to subtle deformations.
3) **Silver nanoprisms:** The method reveals bending contours in a silver nanoprism and distinguishes the sample from the amorphous supporting film. Further clustering differentiates the two sides of the bending contour, consistent with the expected structure and confirmed by virtual dark-field (DF) images.
The paper also explores clustering the real-space images generated from diffraction pattern intensities at single momentum-space coordinates, which allows virtual bright-field (BF) and DF imaging. This is demonstrated on WS₂-WSe₂ superlattices and InGaP/GaAs cross-sectional samples, and shows that features can be identified from differences in lattice orientation and strain, and even from twin domains. The accuracy of the clustering approach for discrete and continuous features is discussed, along with the consistency of K-means clustering across multiple runs.
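To make the virtual imaging idea concrete, the sketch below forms a virtual BF or DF image by summing each diffraction pattern over a circular region of momentum space, and rearranges the 4D data so that each row is the real-space image produced by a single detector pixel. The function names, the circular virtual detector, and the `(scan_y, scan_x, k_y, k_x)` array layout are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def virtual_image(data, center, radius):
    """Sum each diffraction pattern over a circular virtual detector.
    data: (scan_y, scan_x, k_y, k_x); center: (row, col) in detector pixels.
    Centering on the direct beam gives a virtual BF image, centering on a
    Bragg spot gives a virtual DF image."""
    ny, nx, ky, kx = data.shape
    yy, xx = np.mgrid[0:ky, 0:kx]
    mask = np.hypot(yy - center[0], xx - center[1]) <= radius
    # Boolean indexing over the two detector axes leaves (scan_y, scan_x, n_pixels).
    return data[:, :, mask].sum(axis=-1)

def momentum_pixel_images(data):
    """Return an array with one row per detector pixel, each row being the
    flattened real-space image formed from that pixel alone."""
    ny, nx, ky, kx = data.shape
    return data.reshape(ny * nx, ky * kx).T
```

The rows returned by `momentum_pixel_images` could then be passed to a clustering routine so that detector pixels producing similar real-space images are grouped together.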
Discussion
The results demonstrate the effectiveness of the proposed data-driven approach for uncovering material deformations in diverse nanomaterials. The hierarchical clustering strategy, combined with visualization techniques, allows for the identification of both discrete and continuous features at multiple length scales, revealing detailed structural information not readily apparent in conventional ADF-STEM images. The use of unsupervised machine learning avoids the need for prior knowledge of the sample structure, making the method broadly applicable to a variety of materials and unexpected deformations. The high accuracy in identifying discrete features (like material composition) and the reasonable accuracy in identifying continuous features (like strain and rotation) are discussed. The method's ability to rapidly analyze large 4D datasets makes it a valuable tool for accelerating materials discovery and characterization.
Conclusion
This work presents a novel, data-driven method for analyzing 4D-STEM datasets using divisive hierarchical unsupervised machine learning. The method reveals various types of material deformations in diverse materials systems and provides a rapid initial analysis of 4D-STEM data, facilitating the discovery of unexpected deformations. While further processing may be needed for precise quantitative mapping of continuous deformations, the method's speed and its ability to highlight areas of interest make it a valuable tool for materials science research. Future work could integrate this method with advanced mapping techniques for fully autonomous analysis of subtle lattice deformations, and extend it to other imaging techniques and broader material systems.
Limitations
The accuracy of the clustering method is higher for discrete features than for continuous features. While the method reveals the presence of continuous deformations, further processing would be required for precise quantitative mapping. The choice of K-means clustering and its parameters may affect the results, although the elbow method for selecting the number of clusters mitigates this and the paper justifies the choice of method. The preprocessing steps, such as the standard deviation threshold for selecting diffraction regions, require some user input and so are not fully automated.