
A deep learned nanowire segmentation model using synthetic data augmentation

B. Lin, N. Emami, et al.

Binbin Lin, Nima Emami, David A. Santos, Yuting Luo, Sarbajit Banerjee, and Bai-Xiang Xu apply deep learning to particle segmentation of V₂O₅ nanowires, training the model on synthetic images. Their approach addresses practical challenges in spectromicroscopy and supports more reliable morphological analysis in materials science.

Introduction

The study addresses the challenge of extracting morphological features from complex, information-rich microscopy datasets to understand structure–function relationships in energy storage materials. Conventional deep learning segmentation requires large, annotated datasets, which are difficult to obtain for materials microscopy. The authors propose training an instance segmentation model (Mask R-CNN) entirely on synthetic images that emulate the geometrical appearance and contrast (optical density) of vanadium pentoxide (V2O5) nanowire networks. The goal is to enable accurate, automated segmentation and statistical analysis across different imaging modalities (X-ray ptychography, STXM, SEM), thereby supporting multiscale studies of battery materials where particle size, shape, and morphology critically influence electrochemical behavior.

Literature Review

The paper situates its contribution within advances in deep learning-based segmentation (e.g., Mask R-CNN and variants like Mask Scoring R-CNN, TensorMask) and object detection (e.g., YOLO and YOLACT). Public datasets such as COCO and PASCAL VOC are standard for benchmarking but are mismatched to materials images. Prior materials applications include segmentation of graphene flakes, carbon nanofibers, and diverse electron microscopy images. Data augmentation and synthetic rendering have been explored for grains and nanoparticles to mitigate annotation bottlenecks. The authors note that instance segmentation of complex, overlapping, non-spherical particles remains difficult and that synthetic data can help bridge limited experimental data for training.

Methodology

Datasets and annotation: Three imaging modalities were considered: X-ray ptychography (highest spatial resolution ~6 nm), scanning transmission X-ray microscopy (STXM, ~25–35 nm step size), and scanning electron microscopy (SEM). Human annotations for validation were created with Makesense.ai using polygon masks and exported as JSON. Annotation uncertainty stems from image resolution limits, overlap, and human error.
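As a rough illustration of how polygon annotations translate into per-particle measurements, the sketch below computes polygon areas with the shoelace formula. The JSON layout (`particles`, `polygon`) is a hypothetical, simplified stand-in and not the actual Makesense.ai export schema.

```python
import json

def polygon_area(points):
    """Shoelace formula: area of a simple polygon given [(x, y), ...] vertices."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0

# Hypothetical annotation layout (NOT the exact Makesense.ai schema).
example_json = (
    '{"particles": ['
    '{"polygon": [[0, 0], [40, 0], [40, 5], [0, 5]]},'
    '{"polygon": [[10, 10], [20, 10], [15, 20]]}'
    ']}'
)
annotations = json.loads(example_json)
areas = [polygon_area(p["polygon"]) for p in annotations["particles"]]
print(areas)  # areas in square pixels, one per annotated particle
```

Areas in pixels would still need scaling by the pixel size of each modality before comparing across ptychography, STXM, and SEM.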

Synthetic dataset generation: A random nanowire generator (GeoDict, GrainGeo module) was used to create 3D voxel-based nanowire structures (domain size 512×512×200 voxels). For each training sample, the number of particles and their length and shape distributions were specified to emulate nanorod-like morphologies and overlapping networks. Synthetic images were rendered to resemble the optical density contrast present in X-ray modalities.
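The idea of rendering overlapping rods with optical-density-like contrast can be sketched in a few lines. This is a toy 2D stand-in and not the GeoDict workflow: the function name `render_rods`, the size ranges, and the rule that each rod adds its thickness to the pixels it covers are all assumptions made for illustration.

```python
import math
import random

def render_rods(width=64, height=64, n_rods=5, seed=0):
    """Return (od_image, instance_masks) for randomly placed 2D rods.

    Each rod adds a constant 'thickness' to the pixels it covers, so
    overlapping rods appear denser, loosely mimicking optical density.
    """
    rng = random.Random(seed)
    od = [[0.0] * width for _ in range(height)]
    masks = []
    for _ in range(n_rods):
        cx, cy = rng.uniform(0, width), rng.uniform(0, height)
        length = rng.uniform(20, 50)      # assumed range, in pixels
        thickness = rng.uniform(2, 6)     # assumed range, in pixels
        theta = rng.uniform(0, math.pi)   # random in-plane orientation
        ux, uy = math.cos(theta), math.sin(theta)
        mask = [[False] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                dx, dy = x - cx, y - cy
                along = dx * ux + dy * uy       # coordinate along the rod axis
                across = -dx * uy + dy * ux     # coordinate across the rod
                if abs(along) <= length / 2 and abs(across) <= thickness / 2:
                    mask[y][x] = True
                    od[y][x] += thickness
        masks.append(mask)
    return od, masks

od_image, instance_masks = render_rods()
print(len(instance_masks), max(max(row) for row in od_image))
```

Because the instance masks are produced together with the image, every synthetic sample comes with exact labels, which is what removes the manual annotation step for training.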

Model and training: Mask R-CNN (Detectron2) with a ResNet-50 Feature Pyramid Network (FPN) backbone pre-trained on COCO was used. Training was performed solely on synthetic datasets; no real images were used for training. Optimization employed stochastic gradient descent with Detectron2 defaults. A hyperparameter study varied epochs (250, 500, 750), synthetic dataset size (250–1000 images), learning rate (0.01–0.03), ROI head batch size per image (128, 256, 512), RPN IoU thresholds (0.6–0.8), and NMS thresholds (0.6–0.8).
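A hyperparameter study of this kind can be organized as a list of configuration overrides. The sketch below is only an assumption about how such runs might be set up: the config keys are standard Detectron2 names, but the baseline values, the intermediate grid points (0.02 learning rate, 500 images), the batch size of 2 images, and the epoch-to-iteration conversion are illustrative choices, not values confirmed by the paper.

```python
import math

def detectron2_overrides(epochs, n_images, base_lr, roi_batch, ims_per_batch=2):
    """Build Detectron2-style config overrides for one training run.

    Detectron2 schedules training in iterations, so epochs are converted
    assuming one epoch is one pass over n_images (an assumption here).
    """
    max_iter = math.ceil(epochs * n_images / ims_per_batch)
    return {
        "SOLVER.MAX_ITER": max_iter,
        "SOLVER.BASE_LR": base_lr,
        "SOLVER.IMS_PER_BATCH": ims_per_batch,
        "MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE": roi_batch,
    }

# Baseline values are illustrative picks inside the reported ranges.
baseline = {"epochs": 500, "n_images": 500, "base_lr": 0.02, "roi_batch": 256}
sweeps = {
    "epochs": [250, 500, 750],
    "roi_batch": [128, 256, 512],
    "base_lr": [0.01, 0.02, 0.03],   # 0.02 is an assumed midpoint
    "n_images": [250, 500, 1000],    # 500 is an assumed intermediate size
}

# Vary one parameter at a time around the baseline.
runs = []
for name, values in sweeps.items():
    for value in values:
        params = dict(baseline, **{name: value})
        runs.append(detectron2_overrides(**params))
print(len(runs), runs[0])
```

Each dictionary could be flattened into a key/value list and passed to a Detectron2 config via `cfg.merge_from_list`; the RPN IoU and NMS thresholds are omitted here because the summary does not say which config entries were varied.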

Preprocessing: STXM transmission images were converted to absorbance (optical density). Otherwise, images were used as acquired without additional filtering.
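The transmission-to-absorbance conversion follows the Beer-Lambert relation OD = -ln(I/I0), where I is the transmitted intensity and I0 the incident intensity. A minimal sketch is shown below; the function name, the nested-list image format, and the `eps` clamp for zero-valued pixels are assumptions for illustration.

```python
import math

def to_optical_density(transmission, i0, eps=1e-12):
    """Convert transmitted intensities to optical density, OD = -ln(I / I0).

    transmission: 2D list of transmitted intensities I
    i0: incident intensity, e.g. measured in a particle-free region
    eps: lower clamp to avoid log(0) on fully absorbing pixels
    """
    return [[-math.log(max(v, eps) / i0) for v in row] for row in transmission]

image = [[1.0, 0.5], [0.25, 1.0]]
od = to_optical_density(image, i0=1.0)
print(od)  # empty regions map to 0, absorbing regions to positive values
```

After this step thicker or overlapping wires appear as higher values, which is the contrast the synthetic training images were designed to emulate.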

Evaluation metrics: Performance was assessed via (i) foreground pixel-wise accuracy (TP/(TP+FP+FN)); (ii) COCO-style AP across IoU thresholds 0.5–0.95 (AP, AP50, AP75, AP for small/medium/large objects); and (iii) agreement of particle statistics (area, aspect ratio, orientation) between predictions and manual annotations. Instance masks also enabled semantic maps for qualitative inspection (TP/FP/FN visualizations). A web application was provided for inference and statistics extraction.
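The foreground accuracy defined above, TP/(TP+FP+FN), can be computed directly from a predicted and a ground-truth binary mask, as in this minimal sketch. The function name and the nested-list mask format are illustrative assumptions; the threshold list reproduces the IoU range 0.5 to 0.95 in steps of 0.05 used for COCO-style AP.

```python
def foreground_accuracy(pred, truth):
    """Pixel-wise TP / (TP + FP + FN) over two same-sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # foreground in both
            elif p and not t:
                fp += 1      # predicted foreground only
            elif t and not p:
                fn += 1      # annotated foreground only
    denom = tp + fp + fn
    # Two empty masks agree perfectly by convention.
    return tp / denom if denom else 1.0

# IoU thresholds averaged over for COCO-style AP: 0.50, 0.55, ..., 0.95
coco_iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(foreground_accuracy(pred, truth), coco_iou_thresholds)
```

True-negative background pixels do not enter this ratio, so large empty regions of an image cannot inflate the score.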

Key Findings

Synthetic images: Across 20 trained models, AP on synthetic test images was high, reaching up to ≈ 94.002 (bbox) and 90.487 (segm) with 500–750 epochs and dataset sizes up to 1000 images (Table 1). The model accurately segmented overlapping synthetic nanowires and outperformed Watershed-based methods in overlapped regions.
X-ray ptychography: The best-performing model achieved foreground accuracy ≈ 86.6%; AP (bbox/segm) ≈ 39.145/42.327; AP50 ≈ 64.638/62.519; AP75 ≈ 43.965/39.964. Larger objects were segmented better (AP_l up to ~85.05 for masks) than small/medium ones. The model detected some particles missed by manual annotation (apparent FPs) and missed a few because of local noise or low contrast gradients. Over-segmentation occurred within thick particles (extra small instances inside larger ones).
STXM: Despite lower resolution and dense overlaps, accuracies around 73–76% were obtained; AP (bbox/segm) ≈ 27–31/21–22; AP50 ≈ 52–54/49–52; AP75 ≈ 23–29/16–18 (Table 3). Many model ‘FPs’ corresponded to unlabeled small wires overlooked in manual annotation. Predicted statistics (area, aspect ratio, orientation) showed good qualitative and quantitative agreement, with modest KDE shifts due to additional small detections.
SEM (unseen modality): Performance decreased due to fundamentally different contrast: best accuracy ≈ 59.6%; AP (bbox/segm) ≈ 27.61/12.93; AP50 ≈ 51.8/23.13; AP75 ≈ 22.88/5.21 (Table 4). The model still segmented overlaps reasonably by leveraging boundary gradients but struggled with agglomerates and small isolated wires, leading to underestimation in statistics for small sizes/aspect ratios and certain orientations.
Cross-modality robustness: A model trained solely on synthetic OD-like data generalized to real X-ray modalities and showed partial transfer to SEM. Larger instances are consistently segmented more accurately (AP_l > AP_m > AP_s). Predicted particle statistics broadly matched ground truth distributions across modalities.
Practicality: An interactive web app enables inference and automated particle statistics extraction for user-supplied images.
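The particle statistics compared above (area, aspect ratio, orientation) can be derived from each predicted instance mask. The summary does not state how the authors computed them, so the sketch below uses one common approach, second-order image moments, purely as an assumption; the function name and mask format are also illustrative.

```python
import math

def particle_stats(mask):
    """Area, aspect ratio and orientation of one binary instance mask.

    Uses the covariance of foreground pixel coordinates: the eigenvalues
    give the spread along the major and minor axes, and the major-axis
    direction gives the orientation.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    half_gap = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = (sxx + syy) / 2 + half_gap
    lam_minor = (sxx + syy) / 2 - half_gap
    aspect = math.sqrt(lam_major / lam_minor) if lam_minor > 0 else float("inf")
    # Angle of the major axis relative to the x-axis, in image coordinates
    # (y points down, so the sign is flipped relative to a standard plot).
    angle = 0.5 * math.degrees(math.atan2(2 * sxy, sxx - syy))
    return {"area": n, "aspect_ratio": aspect, "orientation_deg": angle}

# A horizontal bar, 12 pixels long and 3 pixels thick.
bar = [[1] * 12 for _ in range(3)]
print(particle_stats(bar))
```

Applying such a function to every predicted and every annotated instance yields the two distributions whose agreement, for example via kernel density estimates, is reported per modality.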

Discussion

Training Mask R-CNN exclusively on synthetic nanowire microstructures that emulate optical density enabled effective instance segmentation on real V₂O₅ nanowire images across X-ray ptychography and STXM, addressing the scarcity of annotated experimental data. The model successfully handled overlapping networks, a common challenge in particle dispersions, and provided particle-level statistics that broadly agree with manual annotation. The detection of instances overlooked in manual annotations suggests that, in complex, low-contrast scenes, the model can find particles that human annotators miss.

Limitations emerged when experimental images deviated from training assumptions: low optical density gradients, non-prismatic morphology, and agglomeration led to over- and under-segmentation, including spurious small instances within thick wires. Transfer to SEM, with fundamentally different contrast mechanisms, showed reduced AP but retained useful segmentation in overlaps, indicating learned boundary cues contribute to modality-agnostic performance.

Overall, the findings validate synthetic data augmentation as a viable route to robust, scalable instance segmentation in materials microscopy, with potential to fuse morphological statistics with chemical maps for cheminformatics and to support real-time monitoring and control in operando studies.

Conclusion

The work demonstrates that a Mask R-CNN model trained solely on carefully generated synthetic nanowire images can accurately perform instance segmentation on complex, overlapping V₂O₅ nanowire networks imaged by X-ray ptychography and STXM, and transfers partially to SEM. The approach yields reliable particle statistics and reduces dependence on extensive human-labeled datasets. A deployable web application makes the method accessible for broader use and data mining.

Future directions include enriching the synthetic generator with non-prismatic and hierarchical morphologies, explicit agglomerate classes, and realistic noise/illumination/background variations to improve generalization, especially to SEM-like modalities. Extending to multiple object classes and integrating chemical contrast will further enhance utility. Real-time implementations with optimized models/GPU hardware could support in situ process control during nanowire growth and battery operation.

Limitations

Manual ground truth is imperfect: dense overlaps and low contrast cause missed labels, affecting metric interpretation (apparent FPs often reflect true but unlabeled instances).
Synthetic training morphologies assumed prismatic wires with limited cross-sectional variation, leading to over-segmentation within thick/heterogeneous experimental wires and reduced robustness to non-prismatic shapes.
Sensitivity to low optical density gradients and noise in overlap regions can cause missed detections (FNs) or spurious small instances (FPs).
Domain shift to SEM (surface-sensitive contrast without optical density) decreases AP, particularly for agglomerates and small isolated particles.
The limited number of particles in some images makes distributional statistics sensitive to small detection errors.
