Defect detection in atomic-resolution images via unsupervised learning with translational invariance

Y. Guo, S. V. Kalinin, et al.

Yueming Guo and colleagues present an unsupervised machine learning technique for defect detection in complex materials imaged by scanning transmission electron microscopy (STEM). The method uses one-class support vector machines to flag defects as outliers, eliminating the need for human-labeled data.

Introduction

The study addresses automatic detection of crystallographic defects in atomic-resolution STEM images without requiring labeled training data. The research question is whether an unsupervised method—specifically one-class support vector machines (OCSVM) combined with translationally invariant features—can reliably identify defect-containing image patches as outliers relative to normal unit-cell patches across diverse materials and symmetries. The context is the increasing volume of high-speed, autonomous STEM imaging and the limitations of prior supervised approaches, which require prior knowledge, labels, or simulations and are hard to generalize to complex, low-symmetry crystals. The purpose is to develop a generalizable, label-free workflow that detects defects across 2D and 3D materials and returns their coordinates, thereby enabling efficient analysis of large datasets and facilitating studies of new or complex structures. The importance lies in improving scalability and generality of defect detection, reducing reliance on human labeling, and enabling downstream applications in in situ microscopy and autonomous experimentation.

Literature Review

Prior work predominantly used supervised learning for defect detection and classification in STEM images, trained on human-labeled experimental or simulated datasets. These methods often require substantial prior knowledge, are limited to the structures present in training, and become costly when structures are complex or defect types are diverse. Rule-based methods using graph theory have also been applied, but their efficiency declines as unit-cell complexity increases. One-class SVMs are widely used in industry for outlier detection (e.g., fault diagnosis, cybersecurity) and can model a boundary around the majority class in feature space, making them suitable when defect instances are rare. However, reliable unsupervised ML methods for crystallographic defect detection in STEM images have been lacking. The paper leverages insights from OCSVM literature on parameter sensitivity and introduces a practical, label-free parameter selection strategy, and adopts the Patterson function—historically used in crystallography—as a descriptor to achieve translational invariance in image features.

Methodology

Core approach: Treat defect detection as one-class outlier detection. Segment atomic-resolution STEM images into subimages expected to contain a single unit cell or periodic fragment, map each subimage to a translationally invariant descriptor via the Patterson function, optionally reduce dimensionality with PCA, then train and infer with an OCSVM using a Gaussian kernel. Two segmentation and preprocessing schemes are introduced to ensure a single dominant class (normal unit cells) in the training data.
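The translational invariance at the heart of this approach can be illustrated with a minimal sketch. It assumes the Patterson map is computed as a circular autocorrelation through the FFT (consistent with periodic boundary conditions) and uses a toy two-column patch; the grid size, atom positions, and the helper name patterson_map are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

def patterson_map(patch: np.ndarray) -> np.ndarray:
    """Circular autocorrelation of a 2D patch (periodic boundaries) via FFT."""
    spectrum = np.fft.fft2(patch)
    # Autocorrelation = IFFT(|FFT|^2); a shift only changes the phase, which cancels.
    return np.fft.ifft2(np.abs(spectrum) ** 2).real

# Toy "unit cell": two Gaussian atom columns on a 16x16 grid (illustrative values).
y, x = np.mgrid[0:16, 0:16]
patch = (np.exp(-((x - 5) ** 2 + (y - 5) ** 2) / 4.0)
         + 0.6 * np.exp(-((x - 11) ** 2 + (y - 9) ** 2) / 4.0))

# The same cell seen with a segmentation offset, modeled here as a circular shift.
shifted = np.roll(patch, shift=(3, -2), axis=(0, 1))

# The raw patches differ, but their Patterson maps agree to floating-point precision.
same_descriptor = np.allclose(patterson_map(patch), patterson_map(shifted))
print("raw patches equal:     ", np.allclose(patch, shifted))
print("Patterson maps equal:  ", same_descriptor)
```

Because the descriptor no longer depends on where the unit cell sits inside the crop, small centering errors during segmentation do not show up as spurious variation in the classifier's input.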

Scheme one (unit-cell-centered segmentation): Applicable when the projection of the unit cell/repeat unit has a unique intensity extremum (e.g., central hole of a honeycomb or a brightest column). Steps: (1) localize hole/unique site via peak finding and refine by minimizing local summed intensity; (2) crop square subimages centered at these sites; (3) apply the Patterson function to each subimage to remove dependence on translations; (4) stack as input to ML. This scheme keeps runtime independent of the number of atoms in the unit cell.
Scheme two (Bragg-filtered, atom-centered segmentation): Applicable to any unit cell and multi-domain images. Steps: (1) select a Friedel pair of FFT spots (hkl and its conjugate) and perform Bragg filtering via inverse FFT (IFFT); (2) rotate the IFFT image so fringes align with the image axes; (3) segment into squares of side equal to the interplanar spacing of (hkl), each centered on an atom column (different atoms yield different fringe phases); (4) apply the Patterson function to eliminate phase differences, forming a stack per (hkl). Runtime scales with the number of Bragg-filtered images used.
Translational invariance via Patterson function: For image intensity ρ(x,y), the 2D Patterson map P(u,v) is ρ(x,y) convolved with ρ(−x,−y), i.e., the autocorrelation of the image. It depends only on interatomic vectors, making features invariant to global translations and, in Bragg-filtered data, to phase shifts. Periodic boundary conditions are used when computing Patterson maps. This reduces segmentation errors and covariance due to arbitrary shifts, making inputs suitable for OCSVM.
Dimensionality reduction: Apply PCA to the Patterson maps; in practice, far fewer components are needed after Patterson mapping (e.g., 9 PCs explain 98% of the variance versus 36 without Patterson mapping in the MoWTe2 case).
OCSVM training and parameter selection: Use LIBSVM via scikit-learn with an RBF (Gaussian) kernel. There are two hyperparameters: gamma (γ), which controls kernel width, and nu (ν), an upper bound on the outlier fraction. The default γ is 1 / (number of features × variance of X), which is scikit-learn's gamma="scale" setting. ν is chosen at the knee point of the D(ν) curve, where D(ν) is the difference between the medians of the decision function for normal data and for outliers. Validation includes checking decision function histograms for a Gaussian-like distribution of normal data with an abrupt boundary, and testing robustness over γ from 0.5× to 3× the default (and in some cases 5–6×) while re-optimizing ν.
Demonstration 1: Bilayer Mo0.91W0.09Te2 (2D).
Pre-step: Use a Gaussian Mixture Model (unsupervised) on integrated column intensities to classify columns into Mo, W, and vacancies (1–2 Te missing). Replace each column by a standard 2D Gaussian at its coordinates to form an atom-coordinate image, mitigating Z-contrast impact from dopants for one-class modeling. Then apply Scheme one: segment, Patterson-map, PCA, OCSVM parameter selection via D(ν), and detection. Output: mark centers of segments predicted as outliers.
Demonstration 2: ZrO2 nanoparticle (3D) with a twin boundary. Apply Scheme two: select a shortest-spacing Friedel pair in FFT; Bragg filter via IFFT; rotate fringes; segment with spacing-sized windows centered on columns; Patterson-map; OCSVM with ν via D(ν). Use the decision function histogram for clustering into two classes corresponding to two domains; map classes back to real space to reveal the twin boundary.
Data and code: Datasets at Zenodo (doi:10.5281/zenodo.5520169). Code at GitHub (DrYGuo/Defect-detection-in-atomic-resolution-image-via-unsupervised-learning-with-translational-invariance).
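The ν selection via D(ν) described in this section can be sketched with scikit-learn's OneClassSVM. This is a hedged illustration and not the authors' implementation: the feature matrix is synthetic (standing in for PCA scores of Patterson maps), the helper names median_gap and knee_index are hypothetical, and the knee is located with a simple farthest-from-chord heuristic, which is an assumption because the summary does not say how the knee is found.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def median_gap(features, nu, gamma="scale"):
    """D(nu): median decision value of inliers minus that of outliers."""
    # gamma="scale" is scikit-learn's default: 1 / (n_features * X.var()).
    model = OneClassSVM(kernel="rbf", gamma=gamma, nu=nu).fit(features)
    scores = model.decision_function(features)
    inliers, outliers = scores[scores >= 0], scores[scores < 0]
    if inliers.size == 0 or outliers.size == 0:
        return np.nan
    return float(np.median(inliers) - np.median(outliers))

def knee_index(xs, ys):
    """Index of the point farthest from the chord joining the curve's endpoints
    (an assumed heuristic for locating the knee)."""
    xs = (xs - xs.min()) / ((xs.max() - xs.min()) or 1.0)
    ys = (ys - ys.min()) / ((ys.max() - ys.min()) or 1.0)
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    dist = np.abs(dy * (xs - xs[0]) - dx * (ys - ys[0])) / (np.hypot(dx, dy) or 1.0)
    return int(np.argmax(dist))

# Synthetic stand-in data: 400 "normal" cells and 12 "defect" cells, 9 features each.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(400, 9))
defects = rng.normal(6.0, 1.0, size=(12, 9))
features = np.vstack([normal, defects])

# Sweep nu, build the D(nu) curve, and pick nu at the knee.
nus = np.linspace(0.02, 0.30, 15)
gaps = np.array([median_gap(features, nu) for nu in nus])
valid = ~np.isnan(gaps)
chosen_nu = float(nus[valid][knee_index(nus[valid], gaps[valid])])

labels = OneClassSVM(kernel="rbf", gamma="scale", nu=chosen_nu).fit_predict(features)
print(f"nu at knee: {chosen_nu:.2f}, flagged outliers: {(labels == -1).sum()}")
```

No labels enter the selection: D(ν) is computed purely from the decision function on the training data, which is what makes the strategy usable when no ground truth exists. To test robustness in the spirit of the text, the same sweep could be repeated with a numeric gamma set to a multiple of the default.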

Key Findings

The proposed unsupervised OCSVM workflow with Patterson function features detects diverse defects without labeled training data, returning defect coordinates.
Translation-invariant preprocessing via Patterson mapping substantially reduces segmentation-induced variability: in the MoWTe2 case, 9 principal components explain 98% variance post-Patterson mapping versus 36 PCs without.
Demonstration in bilayer Mo0.91W0.09Te2: After GMM-based separation of dopants and vacancies, OCSVM detects absent or displaced columns (point and line defects). Confusion matrix from Fig. 1d: TP=68, FN=3, FP=44, TN=2346; balanced accuracy = 98.1%. The fraction of defect sites is ~3%. Results are robust across a range of γ (0.5–3× default), and replicate on a larger image by re-optimizing ν via the D(ν) knee.
Demonstration in ZrO2 nanoparticle: Using Bragg-filtered Patterson maps and OCSVM, the method separates two clusters corresponding to twin-related domains and maps a twin boundary consistent with visual inspection.
Parameter selection: Choosing ν at the D(ν) knee and using LIBSVM's default γ after normalization worked well across examples; in some cases γ up to 5–6× default was optimal.
Materials dependence: Separation between normal and outliers is stronger in rigid 3D crystals (clear histogram separation) than in softer 2D materials with widespread lattice distortions, where the boundary is more ambiguous.
Generality and speed: Scheme one runtime is independent of atoms per unit cell; Scheme two runtime scales with the number of Bragg-filtered images. The approach can process single images or stacks and generalizes to multiple domains and different defect types (point, line, planar).
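As a check on the metrics quoted for the Mo0.91W0.09Te2 demonstration, the short calculation below derives sensitivity, specificity, balanced accuracy, overall accuracy, and defect fraction from the listed confusion-matrix counts. It assumes the usual definition of balanced accuracy as the mean of sensitivity and specificity. Under that assumption the counts give a balanced accuracy of about 97.0% and an overall accuracy of about 98.1%, so the quoted 98.1% may follow a different definition or reflect how the figure was summarized; this is an observation from the arithmetic alone.

```python
# Confusion-matrix counts as quoted above (Fig. 1d of the paper).
tp, fn, fp, tn = 68, 3, 44, 2346
total = tp + fn + fp + tn

sensitivity = tp / (tp + fn)                      # recall on defect sites
specificity = tn / (tn + fp)                      # recall on normal sites
balanced_accuracy = (sensitivity + specificity) / 2
accuracy = (tp + tn) / total                      # overall fraction correct
defect_fraction = (tp + fn) / total               # share of sites that are defects

print(f"sensitivity       = {sensitivity:.3f}")        # 0.958
print(f"specificity       = {specificity:.3f}")        # 0.982
print(f"balanced accuracy = {balanced_accuracy:.3f}")  # 0.970
print(f"overall accuracy  = {accuracy:.3f}")           # 0.981
print(f"defect fraction   = {defect_fraction:.3f}")    # 0.029
```

The defect fraction of roughly 3% matches the text, which is why a class-balanced metric is the more informative one here: with so few defects, overall accuracy is dominated by the normal class.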

Discussion

The findings show that crystallographic defect detection in STEM images can be framed as a one-class outlier problem where most subimages correspond to a single normal class (unit cell). Using the Patterson function to build translation-invariant descriptors minimizes segmentation errors and concentrates variance into a small number of principal components, enabling effective OCSVM separation with a simple Gaussian kernel. In 2D materials with variable lattice distortions, the boundary between normal and defect classes is less distinct, reflecting inherent ambiguity even for human observers; in rigid 3D materials, separation is clearer. The ν knee-point heuristic and default γ (with normalization) provide a practical, label-free parameter selection strategy that worked across cases. The approach scales to multi-domain images (via Scheme two) and can associate Bragg reflections with specific domains, detecting features like twin boundaries and dislocations. The method’s ability to detect defects without labeled training data and return coordinates makes it valuable for rapid screening, in situ studies, autonomous scan control, and preprocessing for structure reconstruction by filtering damaged unit cells. While OCSVM here is a binary normal/defect discriminator, the same preprocessed feature space allows further unsupervised clustering to subtype defects or integration with other ML models (e.g., VAEs) for denoising or strain analysis.
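The claim that Patterson mapping concentrates variance into few principal components can be illustrated on synthetic data. The sketch below is a toy experiment under stated assumptions, not a reproduction of the 9-versus-36 result: patches are 8x8 with two Gaussian columns, one column's intensity varies (a stand-in for real structural variation), and segmentation error is modeled as a random circular shift. The helper names are hypothetical, and PCA is done with a plain SVD.

```python
import numpy as np

def patterson_map(patch):
    """Circular autocorrelation via FFT (periodic boundary conditions)."""
    return np.fft.ifft2(np.abs(np.fft.fft2(patch)) ** 2).real

def n_components_for(data, target=0.98):
    """Smallest number of principal components whose cumulative variance reaches target."""
    centered = data - data.mean(axis=0)
    variances = np.linalg.svd(centered, compute_uv=False) ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    return int(np.searchsorted(cumulative, target) + 1)

rng = np.random.default_rng(0)
y, x = np.mgrid[0:8, 0:8]
atom_a = np.exp(-((x - 2) ** 2 + (y - 2) ** 2) / 2.0)
atom_b = np.exp(-((x - 5) ** 2 + (y - 4) ** 2) / 2.0)

patches = []
for _ in range(400):
    occupancy = rng.uniform(0.5, 1.0)        # genuine structural variation
    dy, dx = rng.integers(0, 8, size=2)      # random segmentation offset
    cell = atom_a + occupancy * atom_b
    patches.append(np.roll(cell, (int(dy), int(dx)), axis=(0, 1)))
patches = np.array(patches)

raw = patches.reshape(len(patches), -1)
patterson = np.array([patterson_map(p) for p in patches]).reshape(len(patches), -1)

k_raw = n_components_for(raw)
k_patterson = n_components_for(patterson)
print(f"PCs for 98% variance, raw patches:    {k_raw}")
print(f"PCs for 98% variance, Patterson maps: {k_patterson}")
```

In the raw patches the random offsets spread variance over many components, whereas after Patterson mapping only the structural variation remains, so far fewer components are needed before the OCSVM step.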

Conclusion

This work introduces an unsupervised, label-free framework for detecting crystallographic defects in atomic-resolution STEM images by combining translation-invariant Patterson-map descriptors with OCSVM. Two complementary segmentation schemes address single-domain unit-cell-centered data (Scheme one) and general, multi-domain Bragg-filtered data (Scheme two). The method reliably detects point and line defects in 2D Mo0.91W0.09Te2 and delineates a twin boundary in 3D ZrO2, achieving high accuracy (balanced accuracy 98.1% in the 2D demonstration) with practical, label-free hyperparameter selection. Contributions include: (1) robust translation-invariant preprocessing that reduces dimensionality and segmentation sensitivity; (2) generalizable segmentation workflows adaptable to diverse symmetries and domains; (3) open data and code enabling adoption. Future directions include: extending to automatic categorization of defect types via clustering, integrating with autonomous scan control for real-time feedback, adapting the Bragg-filtered Patterson inputs for denoising and strain mapping with models like VAEs, and exploring customized kernels or hybrid models to handle more complex nonlinearities.

Limitations

The current OCSVM framework provides binary discrimination (defect vs. non-defect) and does not classify defect subtypes; additional clustering or supervised steps are needed for categorization.
Performance depends on appropriate hyperparameter selection (γ, ν), though practical heuristics are provided; overfitting can occur if parameters are reused across datasets without re-optimization.
Scheme one requires a unique intensity extremum per unit cell, which limits its applicability to certain symmetries; Scheme two increases runtime in proportion to the number of Bragg-filtered images used.
Dopants and strong Z-contrast can confound one-class assumptions; preprocessing to exclude dopants (e.g., via GMM) is needed.
Ambiguity increases in soft materials with widespread lattice distortion, making separation between normal and defect classes less distinct.
Patterson mapping assumes periodic boundary conditions; inaccuracies in segmentation or noise can still affect results, although mitigated by the descriptor and PCA.
