AutoPhaseNN: unsupervised physics-aware deep learning of 3D nanoscale Bragg coherent diffraction imaging

Y. Yao, H. Chan, et al.

Discover AutoPhaseNN, a groundbreaking deep learning framework developed by Yudong Yao, Henry Chan, Subramanian Sankaranarayanan, Prasanna Balaprakash, Ross J. Harder, and Mathew J. Cherukara. This innovative approach solves the challenging phase retrieval problem in 3D X-ray Bragg coherent diffraction imaging, achieving a remarkable 100x speedup over traditional methods while preserving high image quality—all without the need for labeled data.

Introduction
Phase retrieval (recovering phase information from measured diffraction intensities) is central to multiple imaging modalities, including X-ray Bragg coherent diffraction imaging (BCDI), ptychography, electron ptychography, Lorentz transmission electron microscopy (LTEM), super-resolution optical imaging, and astronomy. In BCDI, reconstructing the complex real-space image (amplitude, plus phase encoding strain) from far-field Bragg diffraction intensities requires iterative algorithms, which are slow and hinder real-time feedback, especially for in situ/operando experiments. Prior deep learning approaches have accelerated phase retrieval but rely on supervised training with large labeled datasets, either simulated (often not fully representative and artifact-free) or experimentally reconstructed via computationally heavy iterative methods. Recent unpaired or physics-informed methods still require ground-truth images or underperform without strong priors. This work proposes AutoPhaseNN, an unsupervised, physics-aware CNN that incorporates X-ray scattering physics during training, so it learns to directly invert 3D BCDI diffraction to real-space amplitude and phase without ever seeing ground-truth images. The goal is to match the reconstruction quality of iterative phase retrieval with orders-of-magnitude speedup, and to provide a learned prior that further accelerates and improves conventional refinement.
Literature Review
The paper surveys deep learning solutions to inverse problems in MRI, denoising, and super-resolution, and specifically to phase retrieval across holography, lensless imaging, X-ray and electron ptychography, and BCDI. Most prior models are supervised, needing paired diffraction and ground-truth complex images, limiting practicality due to simulation-to-experiment gaps and the cost of generating labels via iterative reconstructions. Unpaired GAN-based approaches (e.g., PhaseGAN) still require ground-truth image distributions, and recent physics-informed neural networks for XFEL pulse recovery showed lower accuracy than supervised counterparts due to weaker priors. This motivates an unsupervised, physics-aware approach that embeds the accurate forward model directly in training to remove the dependence on labeled real-space images.
Methodology
AutoPhaseNN couples a 3D convolutional encoder–decoder network with an embedded X-ray scattering forward model to enable unsupervised training from diffraction intensities alone. Inputs are 3D diffraction magnitudes of size 64×64×64. The encoder processes the input and branches into two deconvolutional decoders that output 64×64×64 real-space amplitude and phase volumes. Architectural elements include convolution blocks (two 3×3×3 convolutions with leaky ReLU activation and batch normalization), max pooling, upsampling, and zero padding; final activations (sigmoid for amplitude, tanh for phase) constrain the outputs to physically meaningful ranges.

The decoder outputs are combined into a complex object and passed through the differentiable X-ray scattering model (Fourier transform plus support constraints) to generate estimated reciprocal-space intensities. The training loss is the mean absolute error between measured and estimated intensities, Loss = Σ|Ie − Im| / N³, where N is the output dimension and Ie/Im are the estimated/measured intensities. No ground-truth real-space images are used at any stage. Training data comprise unlabeled simulated diffraction patterns (from a physics-informed pipeline based on atomistic structures) and a small set of experimental BCDI datasets; details are in Methods. After training, the forward-physics block is discarded and the CNN alone performs single-shot inversion from diffraction to real-space amplitude and phase.

Baselines and refinement: conventional iterative phase retrieval uses cycles of error reduction (ER) and hybrid input–output (HIO) with shrink-wrap support updates: 20 ER + 160 HIO + 20 ER per cycle, for three cycles (600 iterations total). A refinement procedure initializes 50 ER iterations from AutoPhaseNN's predicted complex image, with a support derived by thresholding the predicted amplitude, yielding a fast, high-quality reconstruction.
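The embedded forward model and training loss described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation (which runs as differentiable ops inside the training graph); the function names and the cube-shaped "crystal" are assumptions for demonstration.

```python
import numpy as np

def scattering_forward(amplitude, phase):
    """Kinematic scattering sketch: form the complex real-space object and
    propagate it to the far field with a 3D FFT."""
    obj = amplitude * np.exp(1j * phase)          # complex object: amplitude * e^{i*phase}
    farfield = np.fft.fftshift(np.fft.fftn(obj))  # reciprocal-space complex amplitude
    return np.abs(farfield)                       # estimated diffraction magnitude

def mae_loss(estimated, measured):
    """Loss = sum(|Ie - Im|) / N^3 for an N x N x N output volume."""
    return np.sum(np.abs(estimated - measured)) / estimated.size

# Toy 64^3 volumes standing in for the network's amplitude/phase outputs.
N = 64
amp = np.zeros((N, N, N))
amp[24:40, 24:40, 24:40] = 1.0        # a cubic "crystal" inside the support
phs = np.zeros((N, N, N))             # strain-free object: zero phase
measured = scattering_forward(amp, phs)  # pretend this is the measured pattern

# The loss vanishes when the network output reproduces the measurement.
print(mae_loss(scattering_forward(amp, phs), measured))
```

Because the loss compares only diffraction magnitudes, gradients flow back through the FFT to the amplitude and phase outputs, which is what lets training proceed without any real-space labels.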
Key Findings
- AutoPhaseNN achieves single-shot 3D BCDI inversion without labeled real-space images by training with the embedded X-ray scattering forward model.
- Speed: on experimental data, AutoPhaseNN runs ~100× faster than iterative phase retrieval (about 200 ms per prediction on a CPU versus ~28 s for 600 iterations).
- Quality: on simulated tests, network predictions reach high image quality (e.g., SSIM values near 0.99 and low chi-squared errors), comparable to conventional phase retrieval; refinement starting from AutoPhaseNN further improves detail and error metrics.
- Refinement: initializing ER from the network prediction yields reconstructions after only 50 ER iterations that match or surpass 600-iteration phase retrieval, delivering ~10× overall speedup versus iterative methods alone.
- Convergence behavior: error-versus-iteration curves show the refinement starts at lower reconstruction error and converges faster than conventional phase retrieval, indicating the learned prior guides ER toward better minima.
- Robustness checks: free R-factor metrics (free Poisson log-likelihood and X_free) indicate the refinement with the learned prior gives the best unbiased performance among the methods evaluated.
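The ER refinement highlighted above can be sketched as a generic error-reduction loop: alternate between enforcing the measured Fourier magnitude and zeroing the object outside its support. This is a textbook Fienup-style ER sketch with hypothetical toy data, not the authors' exact code.

```python
import numpy as np

def error_reduction(measured_mag, init_obj, support, n_iter=50):
    """Refine a complex object estimate (e.g. a network prediction) by
    alternating Fourier-magnitude and real-space support projections."""
    g = init_obj.astype(complex)
    for _ in range(n_iter):
        G = np.fft.fftn(g)
        # Fourier constraint: keep the current phase, impose the measured modulus.
        G = measured_mag * np.exp(1j * np.angle(G))
        g = np.fft.ifftn(G)
        g = g * support            # support constraint: zero outside the support
    return g

# Toy example: build a known object, simulate its measurement, then refine
# from a slightly perturbed starting guess (standing in for a CNN prediction).
N = 32
amp = np.zeros((N, N, N))
amp[12:20, 12:20, 12:20] = 1.0
true_obj = amp * np.exp(1j * 0.0)
measured = np.abs(np.fft.fftn(true_obj))
support = amp > 0.1 * amp.max()    # support by thresholding the amplitude
rec = error_reduction(measured, true_obj + 0.05, support, n_iter=20)
```

Starting the loop from a good initial guess and a tight support is precisely why only 50 ER iterations suffice in the reported refinement, versus 600 mixed ER/HIO iterations from a random start.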
Discussion
Embedding the accurate X-ray scattering forward model into the training loop allows AutoPhaseNN to learn a reliable inverse mapping from diffraction intensities to real-space amplitude and phase without ground-truth labels. The approach addresses the key bottlenecks of conventional phase retrieval—slow convergence and dependency on iterative algorithms—by providing real-time-capable reconstructions with comparable quality. Moreover, using the network output as a learned prior materially improves and accelerates iterative refinement, reducing the need for algorithm switching (HIO/ER) and helping avoid poor local minima. These advances have broad implications for real-time coherent diffraction imaging workflows, especially in high-throughput or in situ experiments at upgraded synchrotrons and XFELs, and suggest generalizability to other inverse problems where a reliable forward model is available.
Conclusion
The study introduces AutoPhaseNN, an unsupervised, physics-aware 3D CNN for phase retrieval in BCDI that learns directly from diffraction data by incorporating the X-ray scattering forward model during training. Once trained, the CNN alone produces rapid, high-quality real-space amplitude and phase reconstructions, delivering ~100× speedup over 600-iteration phase retrieval. When used as an initialization for a brief ER refinement (50 iterations), the pipeline attains quality comparable to or better than conventional reconstructions at ~10× faster runtime. Future work includes expanding and diversifying training data (e.g., varying oversampling ratios, crystal structures and space groups, and defect types), improving generalization across sample types, and exploring online training/fine-tuning during data acquisition to adapt to experimental variations.
Limitations
- Training data diversity: simulated training data currently derive mainly from fcc gold atomistic structures; broader datasets are needed to improve generalization across materials, crystal symmetries, oversampling ratios, and defect structures.
- Generalization scope: a single network may not optimally handle all sample types; fine-tuning on small experimental sets is required in practice.
- Dependence on forward-model fidelity: performance relies on an accurate, differentiable X-ray scattering model; model mismatch or experimental artifacts not captured in the forward physics may degrade results.
- Operational setup: the present workflow is trained offline; fully online adaptive training during experiments, though feasible, remains to be demonstrated.
- Overfitting concerns: addressed via free R-factor metrics, but continued validation on diverse experimental datasets is necessary.