Physics
An unsupervised deep learning algorithm for single-site reconstruction in quantum gas microscopes
A. Impertro, J. F. Wienand, et al.
Alexander Impertro, Julian F. Wienand, Sophie Häfele, Hendrik von Raven, Scott Hubele, Till Klostermann, Cesar R. Cabrera, Immanuel Bloch, and Monika Aidelsburger have developed a deep convolutional neural network algorithm that accurately reconstructs site-resolved lattice occupation in quantum gas microscopy. With a reconstruction fidelity of 96%, the algorithm could relax the requirements on imaging systems in future experiments.
Introduction
The study addresses the challenge of accurately reconstructing site-resolved lattice occupations from fluorescence images in quantum gas microscopes, particularly when lattice spacing is significantly smaller than the imaging resolution and when signal-to-noise ratio (SNR) is limited. Traditional deconvolution techniques become ill-conditioned under noise and degrade rapidly when the resolution-to-spacing ratio β exceeds about 2. The research proposes an unsupervised deep learning approach using a convolutional autoencoder to perform deconvolution and capture nonlinear, density-dependent effects (e.g., superradiance) directly from experimental data. The purpose is to achieve high-fidelity, fast reconstructions without relying on simulated training data, thereby improving the accuracy of extracted physical observables and enabling experiments with shorter lattice spacings or specialized lattice geometries.
Literature Review
Prior approaches for site-resolved reconstruction in quantum gas microscopy include iterative least-squares methods (computationally expensive, limited fidelity at low SNR), linear kernel deconvolution designed to minimize spillover (fast but assumes a single kernel independent of neighbor occupancy), and image restoration techniques like Wiener filtering and Richardson–Lucy deconvolution, whose fidelity declines for β ≥ 2. Enhanced deconvolution using lattice constraints and noise models has been demonstrated in 1D. Inverse problems are challenging due to noise; neural networks can approximate nonlinear mappings and offer fast evaluation. A supervised neural-network approach was previously proposed and benchmarked on simulations, but performance on experimental data is limited by simulation accuracy. This work advances the field by providing an unsupervised method trained directly on experimental images, avoiding simulation mismatch.
Methodology
- Problem formulation: Map high-dimensional noisy fluorescence images to low-dimensional binary lattice occupation (dimensionality reduction) using a convolutional autoencoder trained unsupervised by reconstructing inputs while enforcing binarization in a bottleneck.
- Network architecture: A regularized convolutional autoencoder. Input is a 256×256 pixel crop corresponding to 16×16 lattice sites. Encoder: four convolutional layers with stride 2 and ReLU activations (256 → 128 → 64 → 32 → 16 pixels per side), followed by a final convolution with stride 1 and tanh activation to produce a 16×16 bottleneck (site-wise outputs). Decoder: four transposed convolution layers (stride 2, ReLU) and a final transposed convolution (stride 1, tanh) to upsample back to the input size. The entire network uses only convolutions, which preserves spatial order; stacking multiple small-kernel layers improves performance and provides receptive fields across scales.
- Bottleneck and activation: The tanh activation in the bottleneck creates a bimodal distribution in pre-activation site counts corresponding to empty/occupied sites; separation quality indicates reconstruction fidelity. Binarization is naturally thresholded at zero due to tanh symmetry.
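- Loss function: Composite loss L_tot(x, x′, y) = L1(x, x′) + λ Σ_i |1 − |y_i||, where L1(x, x′) = Σ_pixel |x − x′| is the L1 reconstruction loss between the input x and the decoder output x′, and the second term is a bottleneck regularization that penalizes non-binary bottleneck values y_i, with i running over the N_sites = 16^2 bottleneck sites. The regularization forces the network to learn discrete site occupations.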
- Experimental system and imaging: Cesium atoms in a 2D optical lattice with spacing a = 383.5 nm; microscope Rayleigh resolution ≈ 850 nm, β = 2.2. Fluorescence imaging via deep pinning lattice and optical molasses on D2 line; SNR defined as (C_atom − C_bg)/(σ_atom + σ_bg) measured as 5.2 for 300 ms exposure.
- Lattice grid extraction: Lattice basis vectors obtained from Fourier transform of fitted atom positions in dilute images; per-image lattice phase determined by fitting isolated atoms; ensures consistent site positions in crops.
- Data preparation: Images rotated to align axes; cut into 16×16-site crops; per-pixel values scaled to (−1,1) using global min–max from training set.
- Training data: ~100,000 experimental crops from clouds with fillings from ~0 to n ≈ 0.98. Trained for ~100 epochs with ADAM (initial lr 4×10^−4). Hyperparameters tuned: regularization λ_opt = 0.4; encoder kernel size ~10×10, decoder ~22×22 (little sensitivity beyond ~8×8). After training, only encoder used for reconstruction.
- Reconstruction pipeline: Full images decomposed into overlapping 16×16-site crops advanced by one lattice site. Encoder outputs pre-binarization 16×16 deconvolved site counts; counts reassembled by averaging overlaps, using only central 12×12 sites per crop to mitigate edge effects. Threshold at zero yields binary occupation.
- Decoder interpretability tests: With a single occupied site input, decoder output matches measured PSF size/shape; with blocks of occupied sites, decoder outputs brighter images than PSF convolution, capturing density-dependent superradiant enhancement. Quantitatively, decoder reproduces a ~22% higher mean signal at unity filling compared to low filling.
- Fidelity estimation methods:
1) Bimodal overlap: Fit two Gaussians to deconvolved count distributions (e.g., at half-filling) and estimate fidelity from overlap area.
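2) Double-exposure: Acquire two consecutive images; compute the probability β that a site is classified differently in the two images (this β is unrelated to the resolution-to-spacing ratio β used elsewhere); correct for the hopping/loss probability p_s(n) to estimate F = 1 − 2β/(1 − 2 p_s(n)). Independent calibration gives p_s(n) = n · 5.9×10^−3.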
- Methods details: Experimental setup uses NA 0.8 objective; fluorescence photons imaged onto sCMOS; preparation of different fillings by partial atom removal via microwave transfer and optical blowout.
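The composite loss from the Methodology list can be written out as a minimal NumPy sketch. The function name `total_loss` and the toy inputs are illustrative assumptions, and the summary does not say whether either term is normalized (for example by the number of pixels or sites), so plain sums are used here.

```python
import numpy as np

def total_loss(x, x_rec, y, lam=0.4):
    """Composite loss: L1 reconstruction term plus bottleneck regularizer.

    x, x_rec : input crop and decoder output (same shape)
    y        : bottleneck activations after tanh, one value per lattice site
    lam      : regularization strength (0.4 is the tuned value quoted above)

    Assumption: both terms are plain sums; any normalization used in the
    original work is not given in this summary.
    """
    x, x_rec, y = (np.asarray(a, dtype=float) for a in (x, x_rec, y))
    reconstruction = np.abs(x - x_rec).sum()      # sum over pixels of |x - x'|
    binarization = np.abs(1.0 - np.abs(y)).sum()  # sum over sites of |1 - |y_i||
    return reconstruction + lam * binarization

# Toy check with a perfect reconstruction (x_rec == x), so only the
# regularizer contributes.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(256, 256))
y_binary = np.where(rng.random((16, 16)) < 0.5, -1.0, 1.0)  # fully binarized
y_soft = np.zeros((16, 16))                                 # maximally undecided

loss_binary = total_loss(x, x, y_binary)  # regularizer vanishes
loss_soft = total_loss(x, x, y_soft)      # 0.4 * 256 sites
```

A fully binarized bottleneck incurs no penalty, while an undecided one (all zeros) pays λ per site, which is the pressure toward discrete occupations described above.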
Key Findings
- High-fidelity reconstruction achieved at challenging resolution-to-spacing ratio β = 2.2 (a = 383.5 nm, Rayleigh resolution ≈ 850 nm) with experimental SNR ≈ 5.2 at 300 ms exposure.
- Unsupervised training directly on ~100,000 experimental crops; no simulated training data required.
- Reconstruction speed: full images with several thousand sites reconstructed in under one second on standard hardware.
- Hyperparameters: optimal bottleneck regularization λ ≈ 0.4; encoder kernel ≈ 10×10; decoder kernel ≈ 22×22.
- Bimodal deconvolved counts: clear separation across fillings; Gaussian fit at half-filling yields estimated fidelity F ≈ 99%.
- Double-exposure benchmark: after correction for hopping/loss, p_s(n) = n · 5.9×10^−3, fidelity exceeds 99% for n ≲ 0.2 and near unity filling; the minimum fidelity, around n ≈ 0.7, is 96.3(3)% (uncorrected minimum), consistent with overall ≥96% across fillings.
- Decoder learns experimental PSF and captures density-dependent superradiant effects, showing ~22% higher brightness at unity filling compared to low fillings.
- Practical pipeline choices (overlapping crops, central 12×12 site use) improve robustness and edge fidelity.
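The overlapping-crop reconstruction described under Methodology might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `encoder` is a hypothetical callable mapping a 256×256 pixel crop to 16×16 pre-threshold site counts, the image is assumed to be already rotated and aligned to the lattice at 16 pixels per site, and the treatment of the outermost two sites of the full image (never inside any central 12×12 region) is not specified in the summary, so they are simply flagged as uncovered.

```python
import numpy as np

SITES = 16   # lattice sites per crop edge
MARGIN = 2   # sites dropped per edge, leaving the central 12x12
PX = 16      # pixels per lattice site (256 px / 16 sites)

def reconstruct(image, encoder):
    """Slide a 16x16-site window in one-site steps and average the
    central 12x12 outputs of all overlapping crops."""
    n_sites = image.shape[0] // PX
    acc = np.zeros((n_sites, n_sites))   # summed deconvolved counts
    hits = np.zeros((n_sites, n_sites))  # number of crops covering each site
    for i in range(n_sites - SITES + 1):
        for j in range(n_sites - SITES + 1):
            crop = image[i * PX:(i + SITES) * PX, j * PX:(j + SITES) * PX]
            counts = encoder(crop)  # hypothetical: returns a (16, 16) array
            core = counts[MARGIN:-MARGIN, MARGIN:-MARGIN]
            rows = slice(i + MARGIN, i + SITES - MARGIN)
            cols = slice(j + MARGIN, j + SITES - MARGIN)
            acc[rows, cols] += core
            hits[rows, cols] += 1
    covered = hits > 0
    mean_counts = acc / np.maximum(hits, 1)    # zero where uncovered
    occupation = (mean_counts > 0) & covered   # threshold at zero
    return mean_counts, occupation, covered

# Toy stand-in for the trained encoder: average the pixels of each site.
def toy_encoder(crop):
    return crop.reshape(SITES, PX, SITES, PX).mean(axis=(1, 3))

rng = np.random.default_rng(1)
truth = rng.random((20, 20)) < 0.5                     # 20x20-site toy lattice
image = np.kron(2.0 * truth - 1.0, np.ones((PX, PX)))  # +1 occupied, -1 empty
mean_counts, occupation, covered = reconstruct(image, toy_encoder)
```

In this noiseless toy case every covered site is recovered exactly; with a real encoder, averaging the overlapping estimates is what reduces sensitivity to crop-edge effects.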
Discussion
The work addresses the ill-conditioned deconvolution problem in quantum gas microscopy by leveraging a convolutional autoencoder that learns a nonlinear mapping from images to site occupations, incorporating lattice discretization and experimentally observed density-dependent effects. Training directly on experimental data avoids simulation-to-reality gaps and detailed calibration of PSF and noise models. The strong separation of deconvolved count distributions enables threshold-based classification and provides a built-in diagnostic for fidelity estimation and monitoring. Double-exposure measurements quantify detection fidelity while accounting for hopping/loss, indicating ≥96% fidelity across fillings and ≥99% at low and near-unity fillings despite β = 2.2. The ability to reconstruct large images rapidly facilitates high-throughput experiments and more accurate extraction of many-body observables, enabling studies with shorter lattice spacings or complex geometries and potentially benefiting related platforms (e.g., Rydberg arrays, trapped ions).
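To make the overlap-based diagnostic concrete, the sketch below estimates the classification fidelity from two Gaussian peaks and a threshold at zero. The peak positions and widths are made-up illustrative numbers, not values from the study, and the exact overlap definition used by the authors is not given here; this version simply weights the two misclassified tails by the filling.

```python
import math

def gaussian_cdf(t, mu, sigma):
    """Cumulative distribution of a normal distribution at t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def threshold_fidelity(mu_empty, sigma_empty, mu_occ, sigma_occ,
                       filling=0.5, threshold=0.0):
    """Fidelity = 1 - (probability that a site lands on the wrong side
    of the threshold), assuming Gaussian count distributions."""
    false_occupied = 1.0 - gaussian_cdf(threshold, mu_empty, sigma_empty)
    false_empty = gaussian_cdf(threshold, mu_occ, sigma_occ)
    error = (1.0 - filling) * false_occupied + filling * false_empty
    return 1.0 - error

# Illustrative peaks at -1 and +1 with width 0.4 (assumed, not measured).
fidelity = threshold_fidelity(-1.0, 0.4, 1.0, 0.4)
```

With these assumed parameters each peak sits 2.5 standard deviations from the threshold, giving a fidelity of roughly 0.994. As noted under Limitations, the true distributions need not be Gaussian after the nonlinear encoder, so such a number is an estimate rather than a calibrated fidelity.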
Conclusion
An unsupervised deep-learning reconstruction algorithm based on a regularized convolutional autoencoder enables high-fidelity, fast site-resolved occupation readout from fluorescence images in quantum gas microscopes. Trained on experimental data, the method achieves ≥96% fidelity across fillings in a regime where the lattice spacing is less than half the imaging resolution, captures nonlinear density-dependent effects without explicit modeling, and reconstructs large images in under a second. The approach removes the need for simulated training datasets and detailed calibration, and provides practical, experiment-based fidelity estimation tools via bimodal deconvolved counts and double-exposure imaging. This capability paves the way for experiments with shorter lattice spacings and exotic geometries, and is transferable to other quantum platforms such as Rydberg atom arrays and trapped ions. Future work could extend to even shorter spacings given higher SNR, refine systematic error quantification with experimental data, and explore adaptive or physics-informed regularization to further improve robustness.
Limitations
- Detection fidelity is limited by finite SNR and experimental systematics (e.g., spatially inhomogeneous fluorescence, hopping, and atom loss during imaging).
- The nonlinear encoder transformation makes the exact functional form of the deconvolved bimodal distributions a priori unknown, complicating precise fidelity quantification from overlaps.
- Double-exposure analysis is primarily sensitive to statistical (SNR-driven) errors and requires independent calibration of hopping/loss; there is no straightforward experimental method presented to fully quantify systematic errors.
- Edge effects reduce fidelity at crop borders, necessitating the use of only central 12×12 sites from 16×16 crops.
- Applicability to even shorter lattice spacings may require proportionally higher SNR; performance at substantially different imaging conditions was not experimentally demonstrated here.