Deep quantum neural networks on a superconducting processor

Physics

X. Pan, Z. Lu, et al.

This research by Xiaoxuan Pan and colleagues showcases the training of deep quantum neural networks on a six-qubit superconducting processor, reaching mean fidelities of up to 96.0% in quantum channel learning and up to 93.3% accuracy in estimating the ground-state energy of molecular hydrogen. Their findings are pivotal for advancing quantum machine learning on near-term devices.

~3 min • Beginner • English
Introduction
The study investigates whether deep quantum neural networks (DQNNs) can be efficiently trained on near-term quantum hardware using a quantum analogue of the backpropagation (BP) algorithm. Motivated by the success of deep learning and by potential quantum advantages in machine learning, the authors explore a layer-by-layer DQNN architecture in which gradients can be evaluated using only adjacent layers. The purpose is to experimentally validate the training of multi-layer quantum models on a superconducting processor, assess performance on learning quantum channels and a molecular ground-state energy, and examine hardware constraints. The work is significant because it demonstrates practical training of deep quantum models without requiring the number of coherent qubits to scale with depth, suggesting a scalable path for quantum machine learning on noisy intermediate-scale quantum devices.
Literature Review
The paper situates the work within advances in deep learning and quantum machine learning. Prior results established the effectiveness of deep neural networks trained via BP, and theoretical and experimental progress in quantum ML has included quantum speedups in classification and generative models, evidence for the expressive power of quantum neural networks, and implementations such as quantum convolutional neural networks and quantum adversarial learning. The DQNN architecture with layer-wise perceptrons trained via a quantum BP analogue had been proposed previously, and various forms of quantum perceptrons have been demonstrated experimentally across platforms. This work builds on those proposals by experimentally training DQNNs on superconducting hardware, using restricted perceptrons amenable to noisy devices while preserving the key BP property that gradients require only adjacent-layer information.
Methodology
Architecture and training framework: The DQNN is a layer-by-layer model mapping an input quantum state through L hidden layers to an output layer. Each perceptron between adjacent layers acts on a qubit pair and comprises two parameterized single-qubit Rx rotations followed by a fixed controlled-phase (CZ-like) two-qubit gate; the layer operation is the ordered product of its perceptrons. The model induces a sequence of completely positive trace-preserving maps (forward channels) from input to output. Gradients of the objectives (mean fidelity for channel learning, energy expectation for the VQE-like task) can be computed using only the forward state on layer l−1 and a backward term on layer l obtained by applying adjoint channels, enabling a BP-like training rule localized to adjacent layers (sketched in code further below).

Hardware and setup: Experiments use a programmable six-qubit superconducting processor with frequency-tunable two-junction transmon qubits, each with individual flux and XY control lines and a readout resonator. Readout resonators share a common line with a Josephson parametric amplifier for high-fidelity single-shot measurement, and two half-wavelength bus resonators mediate the inter-layer couplings used for two-qubit gates. Characterized gate fidelities are >99.5% for single-qubit Rx gates and 98.4% on average for two-qubit gates; the chip layout is optimized for layer-by-layer operations. The average characteristic qubit coherence time is ~7.5 µs, against a typical DQNN execution time of ~1.2 µs.

Training protocol (hybrid): The forward process runs on the quantum processor; the backward process is simulated classically. For each training sample:
(1) prepare the input state;
(2) apply the layerwise forward channels, obtaining intermediate and output states via quantum state tomography;
(3) classically initialize the output-layer backward term from the measured output and the target (for channel learning) or from the output state (for energy minimization), then propagate it backward through adjoint channels to obtain backward terms on preceding layers;
(4) compute gradients for the parameters in adjacent layers and update them by stochastic gradient descent over the dataset;
(5) repeat for a fixed number of iterations.

Tasks and ansätze: Two DQNN configurations are realized. DQNN1 is a three-layer network with two qubits per layer (six qubits in three layers), used to learn (i) a two-qubit quantum channel and (ii) the ground-state energy of H2 at a bond length of 0.075 nm via the two-qubit Bravyi–Kitaev-reduced Hamiltonian H_BK = g0 I + g1 Z0 + g2 Z1 + g3 Z0Z1 + g4 Y0Y1 + g5 X0X1, with coefficients taken from prior literature (a construction sketch appears below). DQNN2 is a six-layer network using one qubit per layer (physical qubits are reused layer by layer), used to learn a one-qubit quantum channel. To ensure the target channels lie within the representational capacity of the model, targets are generated by the same ansatz with randomly drawn parameters.

Datasets and measurements: For two-qubit channel learning with DQNN1, the four input states {|00>, |01>, |++>, |+i,+i>} form the training set, and fidelities between the trained DQNN's output and the target are averaged over these four inputs. Generalization is tested on 100 input states generated by single-qubit rotations with random axes in the x–y plane and random angles. For one-qubit channel learning with DQNN2, the training inputs are {|0>, |1>, |−>} and testing uses 100 random single-qubit states generated similarly. For H2 energy learning, the input is |00> and the energy tr(ρ_out H) is evaluated from the tomographically reconstructed ρ_out.
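To make the forward channel and the layer-local training idea concrete, here is a minimal, self-contained numpy sketch in the spirit of the one-qubit-per-layer configuration (DQNN2). Everything in it is an illustrative assumption rather than the authors' code: the five perceptron steps linking six layers, all parameter values, and the use of finite-difference gradients in place of the paper's adjoint-channel backpropagation rule.

```python
import numpy as np

KET0 = np.array([1, 0], dtype=complex)
KET1 = np.array([0, 1], dtype=complex)
CZ = np.diag([1, 1, 1, -1]).astype(complex)

def rx(theta):
    """Single-qubit rotation about x: exp(-i theta X / 2)."""
    c, s = np.cos(theta / 2), -1j * np.sin(theta / 2)
    return np.array([[c, s], [s, c]], dtype=complex)

def layer_channel(rho, a, b):
    """One perceptron step: append a fresh |0> qubit, apply Rx(a) (x) Rx(b)
    then the fixed CZ, and trace out the previous-layer qubit (its reset
    and reuse on hardware is modeled by this partial trace)."""
    U = CZ @ np.kron(rx(a), rx(b))
    joint = U @ np.kron(rho, np.outer(KET0, KET0)) @ U.conj().T
    return np.einsum('abac->bc', joint.reshape(2, 2, 2, 2))

def forward(rho, params):
    """Forward channel: iterate the perceptron steps; params has shape (n_steps, 2)."""
    for a, b in params:
        rho = layer_channel(rho, a, b)
    return rho

def fidelity(rho, sigma):
    """Squared Uhlmann fidelity; closed form valid for 2x2 density matrices."""
    val = np.trace(rho @ sigma) + 2 * np.sqrt(np.linalg.det(rho) * np.linalg.det(sigma) + 0j)
    return float(np.real(val))

rng = np.random.default_rng(7)
n_steps = 5                                        # six layers -> five transitions (assumed)
target = rng.uniform(0, 2 * np.pi, (n_steps, 2))   # target channel from the same ansatz
minus = (KET0 - KET1) / np.sqrt(2)
train_set = [np.outer(k, k.conj()) for k in (KET0, KET1, minus)]  # {|0>, |1>, |->}

def mean_fidelity(params):
    return np.mean([fidelity(forward(r, params), forward(r, target)) for r in train_set])

params, eps, lr = rng.uniform(0, 2 * np.pi, (n_steps, 2)), 1e-5, 0.5
for _ in range(150):                               # gradient ascent on mean fidelity
    grad = np.zeros_like(params)
    for idx in np.ndindex(*params.shape):
        shifted = params.copy()
        shifted[idx] += eps
        grad[idx] = (mean_fidelity(shifted) - mean_fidelity(params)) / eps
    params += lr * grad
print(f"trained mean fidelity: {mean_fidelity(params):.4f}")
```

As in the experiment, some random initializations may stall in local minima; the paper likewise reports such runs and quotes the best of multiple initializations.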
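The Bravyi–Kitaev-reduced Hamiltonian for the H2 task can be assembled directly from Pauli products and diagonalized for a reference energy. The coefficients in the sketch below are zero placeholders, since this summary does not reproduce the literature values for the 0.075 nm bond length; the qubit ordering in the kron products is likewise our assumption.

```python
import numpy as np

# H_BK = g0 I + g1 Z0 + g2 Z1 + g3 Z0Z1 + g4 Y0Y1 + g5 X0X1.
# The g_i below are PLACEHOLDERS; the experiment uses published coefficients
# for a 0.075 nm bond length from the literature it cites.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)

g = dict(g0=0.0, g1=0.0, g2=0.0, g3=0.0, g4=0.0, g5=0.0)  # fill from literature

def h_bk(g0, g1, g2, g3, g4, g5):
    # Convention (assumed): qubit 0 is the left factor of each kron product.
    return (g0 * np.kron(I, I) + g1 * np.kron(Z, I) + g2 * np.kron(I, Z)
            + g3 * np.kron(Z, Z) + g4 * np.kron(Y, Y) + g5 * np.kron(X, X))

H = h_bk(**g)
E0 = np.linalg.eigvalsh(H).min()   # exact ground-state energy, for reference
# During training, the network instead minimizes tr(rho_out @ H), with
# rho_out reconstructed by two-qubit state tomography on the processor.
print("exact ground-state energy:", E0)
```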
Qubit reuse and depth-independence: Because gradient evaluation uses only two adjacent layers, the experiment needs to maintain coherence across at most two layers at a time. After the operations between layers are applied, qubits in the previous layer are reset to |0> and reused as qubits of a subsequent layer, keeping the number of coherent qubits required independent of depth (at the cost of added reset time and potential error accumulation).

Numerical modeling: Simulations assess the effects of decoherence and residual ZZ interactions. Single-qubit Rx(θ) gates are modeled as Gaussian drive pulses of 40 ns duration (σ = 10 ns). Residual ZZ interactions during rotations are included via a nearest-neighbor coupling of strength μ, and decoherence is modeled with relaxation time T1 and pure dephasing time Tφ via collapse operators (a QuTiP-style sketch follows). Simulated training of H2 uses 30 random initializations; abnormal runs that converge to local minima are excluded from averages.
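A noise model of this kind can be sketched with QuTiP's Lindblad solver: a Gaussian Rx pulse (40 ns, σ = 10 ns) on one qubit of a pair, an always-on ZZ term with μ/2π ≈ 1 MHz, and T1/Tφ collapse operators. The operator conventions and the coherence values below (T1 = Tφ = 7.5 µs, matching the quoted average coherence) are illustrative assumptions, not the paper's calibration.

```python
import numpy as np
import qutip as qt

theta = np.pi / 2            # target Rx rotation angle (assumed)
t_gate, sig = 40.0, 10.0     # pulse length and Gaussian width (ns)
mu = 2 * np.pi * 1e-3        # residual ZZ strength, mu/2pi ~ 1 MHz (rad/ns)
T1 = Tphi = 7500.0           # relaxation / pure-dephasing times (ns), assumed

tlist = np.linspace(0.0, t_gate, 401)
env = np.exp(-((tlist - t_gate / 2) ** 2) / (2 * sig ** 2))
amp = theta / (env.sum() * (tlist[1] - tlist[0]))   # calibrate pulse area to theta

def pulse(t, args):
    return amp * np.exp(-((t - t_gate / 2) ** 2) / (2 * sig ** 2))

sx0 = qt.tensor(qt.sigmax(), qt.qeye(2))
zz = qt.tensor(qt.sigmaz(), qt.sigmaz())
H = [0.25 * mu * zz, [0.5 * sx0, pulse]]            # always-on ZZ + resonant drive

c_ops = [np.sqrt(1 / T1) * qt.tensor(qt.sigmam(), qt.qeye(2)),
         np.sqrt(1 / T1) * qt.tensor(qt.qeye(2), qt.sigmam()),         # relaxation
         np.sqrt(1 / (2 * Tphi)) * qt.tensor(qt.sigmaz(), qt.qeye(2)),
         np.sqrt(1 / (2 * Tphi)) * qt.tensor(qt.qeye(2), qt.sigmaz())]  # pure dephasing

psi0 = qt.tensor(qt.basis(2, 0), qt.basis(2, 0))
rho_out = qt.mesolve(H, psi0, tlist, c_ops).states[-1]

ideal = qt.tensor((-1j * theta / 2 * qt.sigmax()).expm(), qt.qeye(2)) * psi0
print("fidelity vs. ideal Rx:", qt.fidelity(rho_out, ideal))
```

Sweeping μ and the coherence times in such a model is how one can separate the contributions of residual ZZ coupling and decoherence, as the paper's numerical study does.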
Key Findings
- Three-layer DQNN (DQNN1) learned a two-qubit target quantum channel with mean fidelity up to 96.0% across training runs (best of 30 initializations). Classical noiseless simulations yield an average converged mean fidelity >98%, indicating experimental imperfections as the main gap.
- Generalization for DQNN1 (channel learning): on 100 random inputs, the trained model achieved fidelities >0.95 for 43% of inputs and >0.9 for 95% of inputs versus the target channel; the untrained model showed significantly lower fidelities.
- Molecular hydrogen (H2) ground-state energy with DQNN1: the minimum experimental energy estimate reached −1.727 hartree, corresponding to up to 93.3% accuracy relative to the theoretical ground-state energy of −1.851 hartree (see the quick check after this list). Over 30 initializations, six runs exceeded 90% accuracy, with convergence within ~20 iterations.
- Six-layer DQNN (DQNN2) for one-qubit channel learning achieved mean fidelity up to 94.8% across initializations; after training, the fidelity distribution over 100 random test inputs concentrated around 0.92 and was strongly separated from the untrained distribution.
- Hardware performance: single-qubit Rx gate fidelities >99.5%; two-qubit gate fidelity 98.4% on average; DQNN runtime ~1.2 µs versus an average qubit coherence time of ~7.5 µs.
- Numerical sensitivity: with no residual ZZ, decoherence at T/T0 = 1 degraded the average energy estimate by ~6% to −1.74 hartree. With residual ZZ of μ/2π ≈ 1 MHz and T/T0 = 1 (close to experiment), the average estimate was −1.53 hartree (~17% above the exact value), matching experimental trends and indicating residual ZZ, rather than decoherence, as the dominant error source at the given circuit durations.
- Architectural merit: gradients need only adjacent-layer information, enabling qubit reuse so that the number of coherent qubits required does not scale with network depth.
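The 93.3% figure is consistent with reading accuracy as the ratio of the estimated energy to the exact one; a quick check under that assumed definition:

```python
E_est, E_exact = -1.727, -1.851              # hartree, values reported above
print(f"accuracy = {E_est / E_exact:.1%}")   # -> 93.3%, matching the reported figure
```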
Discussion
The experiments demonstrate that DQNNs can be trained effectively on current superconducting hardware using a BP-like algorithm that localizes gradient computations to adjacent layers. This addresses the core question of whether deep, layer-wise quantum models can be optimized despite hardware noise and limited coherence. High fidelities in quantum channel learning and accurate H2 energy estimates validate both the ansatz and training protocol. The strong separation between trained and untrained performance on random test states indicates genuine learning and generalization within the ansatz class. Numerical analyses corroborate experimental observations, identifying residual ZZ interactions as a primary limitation, while decoherence is less impactful given short circuit durations. The layer-local gradient rule enables qubit reuse and avoids scaling coherent qubit counts with depth, a practical advantage for deeper architectures on NISQ devices. Although the backward pass was performed classically here, the method is compatible with a fully quantum implementation, suggesting a path to end-to-end quantum training as hardware matures.
Conclusion
This work experimentally trains deep quantum neural networks on a six-qubit superconducting processor using a quantum backpropagation framework with quantum forward passes and classical backward simulations. The approach achieves high-fidelity learning of quantum channels and accurate estimation of the H2 ground-state energy, while demonstrating a key scalability property: only two adjacent layers are needed coherently at any time, enabling depth-independent coherent qubit requirements via qubit reuse. Numerical studies identify residual ZZ interactions as the dominant error source under current conditions. Future directions include implementing the backward pass on quantum hardware, scaling to wider and deeper DQNNs, improving hardware with tunable couplers and higher-coherence qubits, and applying DQNNs to broader quantum tasks such as channel learning, dynamics, and quantum chemistry problems.
Limitations
- Backward process executed classically; a fully quantum backward implementation was not realized experimentally.
- Restricted perceptron form (Rx rotations + fixed controlled-phase) may limit expressivity compared to fully general perceptrons.
- Performance limited by experimental imperfections, notably residual ZZ interactions between qubits; decoherence also contributes but is less dominant for the given durations.
- Qubit reset and reuse introduce additional time and potential error accumulation, imposing practical depth limits on noisy devices.
- Some training runs converge to local minima (observed numerically for a subset of initializations), affecting consistency.
- Reported results focus on channels constructed from the same ansatz; out-of-ansatz targets were not explored here.