Quantum Deep Generative Prior with Programmable Quantum Circuits
T. Xiao, X. Zhai, et al.
This summary covers the Quantum Deep Generative Prior (QDGP) algorithm developed by Tailong Xiao, Xinliang Zhai, Jingzheng Huang, Jianping Fan, and Guihua Zeng. The approach uses programmable quantum circuits as learnable latent priors to improve image reconstruction and generalization in computer vision tasks, outperforming classical deep generative priors, particularly at low sampling rates.
Introduction
The study addresses whether near-term, programmable quantum circuits can confer practical advantages in high-dimensional generative modeling and inverse problems by serving as learnable priors. Classical deep generative methods (e.g., DIP and DGP) leverage neural network structure and Gaussian latent spaces but face limitations in generalization and diversity, especially under low data regimes and out-of-distribution (OOD) scenarios. Parameterized quantum circuits (PQCs) offer access to exponentially large feature spaces and distributions that are challenging for classical simulation, suggesting potential benefits as a quantum latent prior. The authors propose Quantum Deep Generative Prior (QDGP), which integrates a PQC-induced latent space with a pretrained BigGAN generator and task-specific loss functions (including physics-informed objectives) to enhance reconstruction quality, diversity, and OOD generalization across optical ghost imaging and standard computer vision tasks. The work aims to demonstrate practical hybrid quantum-classical utility on near-term devices.
Literature Review
Prior quantum algorithms (e.g., Shor and Grover) demonstrate theoretical speedups but require specific structures and fault-tolerant hardware, limiting near-term applicability. Within quantum machine learning, two main paradigms have emerged: quantum kernel methods and PQC-based neural networks. Quantum GANs have shown strengths in learning discrete distributions and quantum data, yet current hardware constrains PQC depth/width, restricting prior demonstrations to low-dimensional outputs. A key step towards high-dimensional data generation showed a quantum circuit could learn the prior distribution of a classical GAN to improve image quality (Rudolph et al.). On the complexity front, learning distributions from quantum circuits can be hard for classical computers, indicating possible quantum advantages for distribution learning. Quantum annealers and Boson sampling have been explored for generative modeling but lack programmability comparable to PQCs. Classical priors such as DIP exploit CNN structure as an implicit prior but generalize poorly beyond the observed image statistics; DGP extends DIP by learning Gaussian latent spaces jointly with the generator. However, Gaussian priors may be too restrictive for OOD generalization. Recent theoretical results suggest PQCs can model continuous multivariate distributions with strong expressivity, motivating their use as programmable quantum priors to extend GAN manifolds toward natural image manifolds.
Methodology
Overview: QDGP uses a parameterized quantum circuit to generate a learnable latent distribution, which is fed into a pretrained BigGAN generator. The PQC and generator parameters are jointly optimized using task-specific losses, including physics-informed terms for computational imaging and feature/MSE losses for standard vision tasks.
Quantum latent space: A PQC U(θ) acts on an initial state ρ0 to produce ρ(θ)=U(θ)ρ0U(θ)†. Observables M are measured to obtain expectations ⟨M⟩=Tr(ρM), which serve as latent codes z. PQCs can learn Gaussian distributions and more general ones, potentially approximating any continuous multivariate distribution (see Supplementary Note 5). Encodings such as Heisenberg-evolution can yield distributions hard to simulate classically.
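The mapping from circuit parameters to a latent code can be sketched with a toy state-vector simulation. The circuit below (a minimal illustration, not the paper's exact ansatz) applies layers of Ry rotations and CZ entanglers to |0…0⟩ and returns the Pauli-Z expectation of each qubit as the latent vector z; all function names are illustrative.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, wire, n):
    """Apply a single-qubit gate to `wire` of an n-qubit state vector."""
    state = state.reshape([2] * n)
    state = np.tensordot(gate, state, axes=([1], [wire]))
    state = np.moveaxis(state, 0, wire)
    return state.reshape(-1)

def apply_cz(state, a, b, n):
    """Apply a CZ gate between wires a and b (phase flip on |11>)."""
    state = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[a], idx[b] = 1, 1
    state[tuple(idx)] *= -1
    return state.reshape(-1)

def pqc_latent(thetas, n=3):
    """Run a layered RY+CZ circuit and return <Z_i> per qubit as the latent code."""
    state = np.zeros(2 ** n); state[0] = 1.0            # start in |0...0>
    for layer in thetas:                                # one angle per qubit per layer
        for w, th in enumerate(layer):
            state = apply_1q(state, ry(th), w, n)
        for w in range(n - 1):                          # chain of CZ entanglers
            state = apply_cz(state, w, w + 1, n)
    probs = np.abs(state) ** 2                          # measurement probabilities
    z = np.empty(n)
    for w in range(n):                                  # <Z_w> = sum of probs * (+-1)
        bits = (np.arange(2 ** n) >> (n - 1 - w)) & 1
        z[w] = np.sum(probs * (1 - 2 * bits))
    return z                                            # latent code in [-1, 1]^n

rng = np.random.default_rng(0)
z = pqc_latent(rng.uniform(0, 2 * np.pi, size=(2, 3)))
print(z)  # three expectation values, each in [-1, 1]
```

These bounded expectations play the role the Gaussian samples play in DGP; training the angles θ reshapes the latent distribution.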
Optimization objective: For an original image x and degraded observation x′=I(x) (e.g., grayscale transform or a physical imaging process), the reconstruction problem is solved via GAN inversion with a programmable quantum prior. With pretrained generator G(ξ) and latent code z derived from PQC expectations, the objective minimizes a distance C between the degraded forward model applied to G(z) and the measured data:
- General: minimize L(x′, I(G(⟨M(θ)⟩)); ξ, θ)
- Ghost imaging (GI): degradation is linear H mapping images to bucket signals B, yielding the L2 loss plus total variation (TV) regularization:
minimize ||B − HG(⟨M(θ)⟩; ξ)||² + λ||x||_{TV}, with x=G(⟨M(θ)⟩).
This integrates inductive biases from physics (H) and data priors (G and PQC latent).
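The GI objective above can be written out directly. The following NumPy sketch (dimensions and patterns are toy values; the generator output is stood in by a candidate image x̂) evaluates ‖B − H x̂‖² + λ‖x̂‖_TV for a random speckle forward model H:

```python
import numpy as np

def tv_norm(img):
    """Anisotropic total variation: sum of absolute forward differences."""
    return np.abs(np.diff(img, axis=0)).sum() + np.abs(np.diff(img, axis=1)).sum()

def gi_loss(B, H, x_hat, lam=1e-3):
    """Physics-informed GI objective: data fidelity under the linear
    forward model H (speckle patterns -> bucket signals) plus a TV prior."""
    residual = B - H @ x_hat.ravel()
    return np.sum(residual ** 2) + lam * tv_norm(x_hat)

rng = np.random.default_rng(1)
side, M = 8, 16                         # tiny 8x8 image, 16 bucket measurements
x = rng.random((side, side))            # ground-truth object (illustrative)
H = rng.random((M, side * side))        # random speckle illumination patterns
B = H @ x.ravel()                       # simulated single-pixel bucket signals

print(gi_loss(B, H, x))                 # data term vanishes; only lam * TV remains
print(gi_loss(B, H, np.zeros_like(x)))  # a wrong reconstruction scores far worse
```

In QDGP, x̂ = G(⟨M(θ)⟩; ξ), so this scalar is backpropagated to both the generator weights ξ and, via parameter-shift gradients, the circuit angles θ.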
Computer vision tasks: For category transfer, super-resolution, colorization, and inpainting, the framework minimizes a weighted sum of feature loss (using a fixed pretrained discriminator’s intermediate features) and MSE loss, while fine-tuning θ (quantum) and ξ (generator). For inpainting, the degradation is a Hadamard mask; for colorization, grayscale brightness is preserved and the model learns chroma; for super-resolution, the low-resolution input is obtained by downsampling, and the model generates the high-resolution output via the generative prior without a physical optics model.
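For the vision tasks, the composite objective is a weighted sum of a feature-space distance and pixel MSE, with the degradation applied to both sides. A minimal sketch for inpainting follows, where the binary mask is the degradation and `toy_features` is a hypothetical stand-in for the pretrained discriminator's intermediate activations:

```python
import numpy as np

def feature_loss(feat_fn, a, b):
    """L2 distance in a fixed feature space (stand-in for discriminator features)."""
    return np.sum((feat_fn(a) - feat_fn(b)) ** 2)

def inpaint_loss(x_obs, x_gen, mask, feat_fn, alpha=1.0, beta=1.0):
    """Weighted feature + MSE loss; the degradation is element-wise masking,
    so images are compared only on observed (mask == 1) pixels."""
    a, b = mask * x_obs, mask * x_gen
    mse = np.mean((a - b) ** 2)
    return alpha * feature_loss(feat_fn, a, b) + beta * mse

def toy_features(img):
    """Toy feature map: 2x2 block averages of the image."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rng = np.random.default_rng(2)
x_obs = rng.random((8, 8))
mask = (rng.random((8, 8)) > 0.3).astype(float)   # 1 = observed pixel

print(inpaint_loss(x_obs, x_obs, mask, toy_features))   # perfect match -> 0.0
print(inpaint_loss(x_obs, rng.random((8, 8)), mask, toy_features))
```

Colorization and super-resolution reuse the same weighted-sum structure, swapping the masking for a grayscale transform or a downsampling operator respectively.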
PQC architecture: The PQC comprises a data encoding layer E and trainable learning layers V. Angle encoding with data re-uploading is used: for a sampled vector s∈Rd, the state is prepared via Ry encodings and evolved by layers Vi(θi) consisting of single-qubit rotations and entangling gates (e.g., CZ). The final state |ψ(s,θ)⟩ is measured; typically Pauli-Z on each qubit is used to form the latent code. Gradients of PQC parameters are computed via parameter-shift; classical network gradients via automatic differentiation. Training employs stochastic optimization (e.g., Adam). To manage BigGAN-256’s 120-dimensional latent space on near-term hardware/simulation, the latent is split into 7 chunks, implemented via 7 subcircuits (≈18 qubits each), inspired by BigGAN’s hierarchical latent splitting. Matrix Product State (MPS) simulation with limited bond dimension is used for scalability analyses; state-vector simulations are used for smaller depths.
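The parameter-shift rule mentioned above can be illustrated on a single qubit, where the expectation ⟨Z⟩ after Ry(θ)|0⟩ equals cos(θ). Shifting the angle by ±π/2 and halving the difference recovers the exact gradient from two circuit evaluations (a toy sketch, not the full multi-parameter pipeline):

```python
import numpy as np

def expval_z(theta):
    """<Z> of RY(theta)|0>: state is [cos(t/2), sin(t/2)], so <Z> = cos(theta)."""
    return np.cos(theta / 2) ** 2 - np.sin(theta / 2) ** 2

def parameter_shift_grad(f, theta, shift=np.pi / 2):
    """Exact gradient of a rotation-gate expectation via the parameter-shift
    rule: two extra circuit evaluations, no finite differencing."""
    return (f(theta + shift) - f(theta - shift)) / 2

theta = 0.7
g_shift = parameter_shift_grad(expval_z, theta)
g_exact = -np.sin(theta)          # analytic derivative of cos(theta)
print(g_shift, g_exact)           # the two agree to machine precision
```

In the full method, this rule supplies dz/dθ for each circuit angle, which the classical autodiff chain then combines with dL/dz from the generator.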
Initialization and training details: BigGAN is used as a pretrained PyTorch model (trained on ImageNet). For tasks without labels (e.g., GI), a random class embedding is used; for labeled tasks, the correct class embedding is provided. To improve optimization, multiple random latent initializations (or PQC initializations) are evaluated and the one with lowest initial feature-loss is chosen.
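The best-of-N initialization step described above amounts to a simple argmin over candidate starting points; here it is sketched with a hypothetical placeholder objective standing in for the initial feature loss:

```python
import numpy as np

def pick_best_init(loss_fn, n_trials=8, dim=4, seed=0):
    """Draw several random initializations and keep the one whose
    initial loss is lowest, as in the selection step described above."""
    rng = np.random.default_rng(seed)
    candidates = rng.normal(size=(n_trials, dim))
    losses = np.array([loss_fn(c) for c in candidates])
    return candidates[np.argmin(losses)], losses.min()

# Placeholder objective standing in for the initial feature loss.
target = np.array([0.5, -0.2, 0.1, 0.9])
loss = lambda z: float(np.sum((z - target) ** 2))

best_z, best_loss = pick_best_init(loss)
print(best_loss)  # no larger than any single random draw's loss
```

Only the selected candidate is then optimized, so the extra cost is N cheap forward passes rather than N full training runs.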
Physical ghost imaging experiments: Optical GI experiments collect bucket signals under varying sampling rates and resolutions (e.g., 64×64 and 128×128), with biological samples chosen to be OOD relative to ImageNet. The physical forward model H comprises random speckle illumination patterns and single-pixel detection. Measurements M span from 256 to 2048 in the 128×128 case.
Diagnostics: Latent density evolution is visualized during optimization. Additional studies assess robustness to quantum noise (depolarizing channels), circuit depth effects, and barren plateau behavior, comparing MPS-based PQCs (low bond dimension) and state-vector PQCs across increasing layers.
Key Findings
- Ghost imaging (GI) reconstruction:
- For a 64×64 “butterfly” sample, QDGP consistently outperforms DGP and classical differential GI (DGI) in both PSNR and SSIM, especially as the number of measurements increases (e.g., at sampling rates above roughly 12.5%). Pretrained models require more measurements to fine-tune effectively; nevertheless, QDGP exceeds DGP under comparable settings.
- For a 128×128 “wasp wing” sample (OOD from ImageNet) over measurements M∈{256,512,1024,2048}, QDGP maintains SSIM advantages over DGP and preserves the ML advantage over DGI at all M. In PSNR, pretrained QDGP and DGP both exceed randomly initialized models and DGI, with QDGP showing higher PSNR than DGP. Visual reconstructions show QDGP produces sharper biological structures while DGI remains noisy.
- Latent space behavior:
- Gaussian (DGP) latent densities remain concentrated near [−0.5,0.5] and close to Gaussian before/after optimization, while QDGP latent densities are more decentralized and show larger evolution, indicating higher flexibility that aids OOD optimization.
- Robustness and trainability:
- No barren plateau observed within tested depths (loss decreases smoothly from 6 to 10 layers). Excessive depth in state-vector PQC reduces PSNR, while MPS-PQC with low bond dimension yields stable PSNR across layers, suggesting constrained expressivity can avoid barren plateaus and serve as a diagnostic for selecting suitable depths.
- Practical quantum noise (depolarizing) degrades performance modestly; noisy QDGP still shows strong performance since the quantum prior increases latent diversity but does not directly generate pixels.
- Image restoration/manipulation:
- Category transfer: QDGP and DGP both achieve convincing transfers (e.g., dog↔cat to other classes), preserving pose/size/layout; differences appear in fine details (hair/faces), with no clear overall superiority.
- Inpainting: QDGP often outperforms DGP, including without class conditioning [e.g., PSNR gains of several dB: 12.73→16.74 (row 2, QDGP(−1) vs DGP(−1)); 17.16→21.25 (row 3, QDGP(−1) vs DGP(−1))], with corresponding SSIM boosts, especially on non-ImageNet images and non-periodic patterns. The quantum latent increases diversity, improving the chance to plausibly fill missing regions.
- Colorization: QDGP generally surpasses DGP in PSNR/SSIM and exhibits better color/style consistency (e.g., examples: PSNR/SSIM improvements such as 17.24/0.8471→19.06/0.8918; 19.35/0.8951→20.51/0.8917; 19.22/0.8261→21.13/0.8625), with more natural local details.
- Super-resolution (×4 from 64×64 to 256×256): QDGP slightly outperforms DGP in PSNR/SSIM with visually similar quality; gains are modest (e.g., 24.72/0.8545→24.84/0.8635; 22.85/0.7835→23.51/0.7887).
Discussion
The results indicate that employing a PQC as a programmable quantum prior enriches the latent distribution beyond Gaussian, enabling the generator to better navigate and extend the GAN manifold toward target images, particularly under low-dimensional constraints and OOD scenarios. In ghost imaging, combining a physics-informed loss (via H) with a pretrained generator and quantum prior reduces required sampling and improves reconstruction quality over classical DGP and DGI, demonstrating practical hybrid quantum-classical benefits. The PQC’s flexibility appears to decouple inherent generator patterns from the latent structure, enhancing diversity, which is particularly advantageous for inpainting and colorization where multiple plausible solutions exist. Robustness analyses suggest moderate quantum noise does not negate benefits, as the quantum latent primarily shapes the search space rather than directly synthesizing pixels. Depth studies reveal that properly chosen circuit expressivity avoids barren plateaus and can be tuned (e.g., via MPS diagnostics) to maintain trainability, underscoring a path to scalable near-term implementations. Overall, the findings support the hypothesis that quantum-enhanced latent spaces can measurably improve generalization and reconstruction across tasks, leveraging both data priors and physics-based inductive biases.
Conclusion
The paper introduces QDGP, a hybrid quantum-classical deep generative method that integrates a PQC-induced latent prior with a pretrained BigGAN generator and physics/feature-based losses. Across optical ghost imaging and several computer vision tasks, QDGP improves reconstruction quality, diversity, and OOD generalization relative to classical DGP and traditional methods (e.g., DGI), with especially strong gains in inpainting and colorization. Analyses of latent distributions and training dynamics indicate that the programmable quantum prior enhances expressivity without prohibitive trainability issues within tested depths, and remains robust to practical noise rates. The approach is well-suited to near-term quantum devices and hybrid cloud settings and can be extended to inverse problems where forward models are known but inverses are challenging. Future work will investigate quantum-enhanced generative AI at larger scales, deeper analyses of generalization, and leveraging more advanced hardware to directly generate higher-dimensional quantum latent codes, as well as domain adaptation and transfer learning applications.
Limitations
- Hardware and scalability: Current PQC depths/widths are limited; simulating full 120-qubit latents is infeasible with state vectors, requiring circuit cutting and MPS approximations that constrain expressivity.
- Sensitivity to circuit depth: Excessive depth can degrade image PSNR; selecting an appropriate depth is crucial to avoid barren plateaus or over-expressivity.
- Quantum noise: While moderate noise has limited impact, higher noise rates reduce expressivity and may approach uniform distributions, diminishing gains.
- Dependence on pretrained models and labels: Performance with pretrained BigGAN can suffer at very low sampling or when class labels are unavailable/mismatched (e.g., GI), necessitating more measurements or careful initialization.
- Generalization scope: Although OOD improvements are demonstrated (e.g., biological samples), comprehensive benchmarks across diverse domains and tasks are needed to fully characterize limits and robustness.
- Optimization cost: Parameter-shift gradient estimation incurs O(K·S) overhead (K parameters, S shots), potentially expensive on hardware; multiple initialization trials add compute overhead.