Experimental quantum end-to-end learning on a superconducting processor

Computer Science


X. Pan, X. Cao, et al.

This research by Xiaoxuan Pan, Xi Cao, and colleagues presents an experimental implementation of quantum end-to-end machine learning on a superconducting processor, reaching 98% accuracy for two-digit and 89% for four-digit handwritten-digit recognition. Discover the potential of this hardware-friendly approach for more complex tasks in quantum computing.

Introduction
Quantum computing promises significant advances for machine learning by exploiting quantum parallelism and high-dimensional Hilbert spaces. While exponential speed-ups have been proposed for certain tasks on fault-tolerant devices, near-term advantages on noisy intermediate-scale quantum (NISQ) hardware may still be achievable thanks to the enhanced model expressibility provided by multi-qubit quantum states. A key step in deploying quantum ML on NISQ processors is constructing a parameterized quantum ansatz that can be trained by classical optimizers. Most existing approaches use gate-based quantum neural networks (QNNs) composed of parameterized layers, which align with circuit models but often depend critically on architecture design and on the mapping to native gates. Non-optimized architectures waste the limited coherence resources, making high accuracy hard to reach unless the dataset is downsized. Hardware-efficient strategies, such as architecture optimization and qubit mapping, can help, but a more hardware-friendly end-to-end scheme was recently proposed that replaces gate-based QNNs with quantum dynamics driven directly by coherent control pulses. This approach requires minimal architecture design and calibration, and no qubit mapping. Furthermore, a jointly trained data encoder transforms classical inputs into quantum control pulses, simplifying state preparation and introducing beneficial nonlinearity through the control-to-state mapping. Pulse-based ansatzes have gained attention in NISQ applications, including state preparation, landscape studies, and cloud-based training. In this work, the authors experimentally demonstrate quantum end-to-end learning on a superconducting processor for MNIST digit recognition, using the full 784-pixel images without downsizing. They achieve 98% accuracy for 2-digit classification with two qubits and 89% for 4-digit classification with three qubits, highlighting the scalability and efficiency of the end-to-end, hardware-friendly model.
Literature Review
The paper situates its contribution within quantum machine learning by referencing proposals of quantum speed-ups for ML tasks on fault-tolerant hardware and the potential for near-term benefits from the expressibility of quantum states on NISQ devices. It reviews gate-based QNNs used for classification, clustering, and generative modeling, noting their dependence on circuit architecture and native-gate mapping, which can limit performance on NISQ hardware. The authors highlight hardware-efficient alternatives, including circuit architecture optimization and qubit mapping, and emphasize a recently proposed end-to-end learning framework that parameterizes the quantum ansatz via control pulses rather than discrete gates. This pulse-based approach reduces the need for architecture design and calibration and removes qubit mapping. They also note joint training of a classical encoder to map data into control pulses, leveraging the nonlinearity of control-to-state mappings to improve expressibility. Prior related works include pulse-based state preparation, analysis of variational landscapes from a control perspective, and pulse-ansatz frameworks on NISQ machines. The study builds directly on these ideas to provide an experimental demonstration on a superconducting platform.
Methodology
End-to-end model and training: The model encodes classical data x into quantum states via a classical linear encoder W that maps x to control variables θ for an encoding block, followed by an inference block parameterized by η. The quantum state evolution is driven directly by control pulses under a time-dependent Hamiltonian H(t) = H0 + Σm θm(t) Hm, avoiding gate compilation. Training proceeds iteratively with mini-batches (batch size b = 2): in each iteration the loss function L (based on conditional classification probabilities) and its gradients are estimated on hardware and passed to a classical optimizer (Adam), which updates W and the control parameters.

Quantum dynamics and controls: In the interaction picture, the system Hamiltonian for the active qubits is H0/ħ = Σp<q Jpq (ap† aq + aq† ap) − Σq (Ec,q/2) aq† aq† aq aq, where Jpq are bus-mediated couplings, Ec,q are anharmonicities, and aq are annihilation operators. The controls for qubit q in each layer correspond to rotations about the x and y axes, H2q−1 ∝ (aq† + aq)/2 and H2q ∝ i(aq − aq†)/2, and are driven by resonant microwave sub-pulses with Gaussian envelopes of fixed 40 ns width. All controls within a time slice are applied simultaneously.

QNN architecture: The network comprises an encoding block with E = 2 layers and an inference block with l = 2 layers. For an N-class task, M = floor(log2 N) + 1 qubits are used: log2 N label qubits, read out by majority vote over computational-basis measurements, plus one auxiliary qubit to enhance expressibility. The total number of control parameters is 8M (four pulse layers in total across the two blocks, each driving two control axes per qubit).

Tasks and hardware: Experiments run on a six-qubit flux-tunable Xmon processor with inductively coupled flux-bias lines and capacitively coupled RF lines. Q1, Q2, Q4, and Q5 couple to bus cavity B1; Q2, Q3, Q5, and Q6 couple to bus cavity B2. Each qubit has a dedicated readout resonator coupled to a common line for multiplexed readout, and irrelevant qubits are detuned. Two tasks are implemented: (1) 2-digit classification (0 vs 2) with qubits Q3 (6.08 GHz) and Q5 (6.45 GHz) biased at their flux sweet spots and coupled with J35/2π = 4.11 MHz, where Q5 serves as the label qubit; (2) 4-digit classification (0, 2, 7, 9) with Q3 (6.08 GHz), Q5 (6.45 GHz), and Q6 (6.19 GHz), where Q3 and Q5 are the label qubits.

Training procedure: Initialization sets all elements of W0 to 1e−5 and chooses θ0 to induce π/4 rotations. In each iteration, L is measured on the label qubits. Gradients with respect to the inference controls are obtained by finite differences, i.e., small perturbations of η; because the encoder θ = Wx is linear, the encoder gradient gW follows from the chain rule by combining the measured gradients with respect to the encoding controls with the input vector. Gradients are averaged over the b = 2 samples to reduce variance, and parameters are updated with Adam using standard hyperparameters. RF control waveforms (I and Q quadratures) are generated by a Tektronix AWG70002A with a 25 GS/s sampling rate and delivered via the RF lines. Classification outputs are obtained by repeating the measurement 5000 times per sample and applying a majority vote over the label-qubit outcomes.

Simulation: Numerical simulations use the calibrated Hamiltonian, identical training batches, and the same update rules to allow direct comparison with experiment. Additional simulations study the effect of pulse length t and qubit coherence times T1 and T2 on the confidence (1 − L), revealing a trade-off between entanglement growth and decoherence.
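To make the workflow concrete, here is a minimal numerical sketch of the pulse-parameterized ansatz and a single finite-difference update of the inference controls. It assumes a simplified two-level, two-qubit model with square (rather than Gaussian) 40 ns layers, treats control parameters as rotation angles, uses a cross-entropy-style stand-in for the loss, and replaces Adam with a plain gradient step; all variable names, hyperparameters, and the loss form are illustrative assumptions, not the authors' calibration or code.

```python
import numpy as np
from scipy.linalg import expm

# --- Simplified two-level, two-qubit model (illustrative parameters) ---
J = 2 * np.pi * 4.11e6            # bus-mediated coupling, J/2pi = 4.11 MHz
dt = 40e-9                        # one 40 ns time slice per layer (square envelope here)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)    # lowering operator a (qubit subspace)
I2 = np.eye(2, dtype=complex)

def embed(op, site):
    """Place a single-qubit operator on the given site of the 2-qubit register."""
    ops = [I2, I2]
    ops[site] = op
    return np.kron(ops[0], ops[1])

# Drift: exchange coupling J (a1+ a2 + a2+ a1), truncated to the qubit subspace
H0 = J * (embed(sm.conj().T, 0) @ embed(sm, 1) + embed(sm.conj().T, 1) @ embed(sm, 0))

# Controls: x- and y-axis drives on each qubit (two control operators per qubit)
Hc = [embed(sx, 0) / 2, embed(sy, 0) / 2, embed(sx, 1) / 2, embed(sy, 1) / 2]

def layer_unitary(theta):
    """One layer: drift over the 40 ns slot plus simultaneous drives (angles in rad)."""
    return expm(-1j * (H0 * dt + sum(t * Hc[m] for m, t in enumerate(theta))))

def evolve(theta_layers, psi):
    """Apply the layered pulse ansatz to a state vector."""
    for theta in theta_layers:
        psi = layer_unitary(theta) @ psi
    return psi

def loss(x, W, eta, psi0, label):
    """Cross-entropy-style stand-in loss from the label-qubit population (2-class case)."""
    theta_enc = (W @ x).reshape(2, 4)     # encoder: 784 pixels -> E = 2 encoding layers
    psi = evolve(theta_enc, psi0)         # encoding block
    psi = evolve(eta.reshape(2, 4), psi)  # inference block (l = 2 layers, trainable eta)
    p1 = np.sum(np.abs(psi[2:]) ** 2)     # P(first qubit = |1>), used as the label qubit
    p = p1 if label == 1 else 1.0 - p1
    return -np.log(p + 1e-12)

# --- One finite-difference update of the inference controls ---
rng = np.random.default_rng(0)
x, label = rng.random(784), 1             # stand-in for one MNIST image and its label
W = np.full((8, 784), 1e-5)               # W0: all elements 1e-5, as in the paper
eta = np.full(8, np.pi / 4)               # theta0: ~pi/4 rotations per sub-pulse
psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0

eps, lr = 1e-3, 1e-2                      # perturbation size and step size (illustrative)
grad = np.zeros_like(eta)
for k in range(eta.size):
    e = np.zeros_like(eta)
    e[k] = eps
    grad[k] = (loss(x, W, eta + e, psi0, label)
               - loss(x, W, eta - e, psi0, label)) / (2 * eps)
eta -= lr * grad                          # plain gradient step standing in for Adam
```

Extending the sketch to the 4-digit task would follow the same structure: a third qubit in the tensor products, two additional control operators, and readout of two label qubits.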
Data flow analysis: Linear Discriminant Analysis (LDA) is used to project the data distributions at successive points in the pipeline (raw images, encoded control pulses, post-encoding quantum states, and final states), allowing cluster separability to be visualized and quantified via standard deviations after normalizing the cluster-center distances.
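As a rough illustration of this LDA analysis, the snippet below (a sketch, not the authors' procedure) projects a two-class feature matrix onto the leading LDA axis and reports the within-cluster standard deviation after normalizing the cluster-center distance to one; the random stand-in data and the averaging of the two per-class spreads are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def normalized_spread(features, labels):
    """Project onto the leading LDA axis and return the within-cluster standard
    deviation after rescaling so the two cluster centers are a unit distance apart."""
    z = LinearDiscriminantAnalysis(n_components=1).fit_transform(features, labels).ravel()
    c0, c1 = z[labels == 0].mean(), z[labels == 1].mean()
    z = (z - c0) / (c1 - c0)                      # map the cluster centers to 0 and 1
    return float(np.mean([z[labels == 0].std(), z[labels == 1].std()]))

# Stand-in data shaped like the raw-image stage: n samples x 784 pixels, 2 classes
rng = np.random.default_rng(1)
X = rng.random((200, 784))
y = rng.integers(0, 2, 200)
print(normalized_spread(X, y))
```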
Key Findings
- High-accuracy classification on hardware: Achieved 98.6 ± 0.1% accuracy experimentally for 2-digit (0 vs 2) classification using two qubits; simulations yielded 98.2%. For 4-digit (0, 2, 7, 9) classification using three qubits, achieved 89.4 ± 1.5% experimentally; simulations yielded 88.9%.
- Training convergence: With E = 2 encoding and l = 2 inference layers, the loss converged to 0.14 (2-digit) after ~300 iterations and 0.22 (4-digit) after ~500 iterations. Increasing the encoding depth E can potentially further reduce the loss.
- Hardware-efficient, pulse-based control: Implemented a gate-free, pulse-parameterized QNN with Gaussian 40 ns sub-pulses, directly driving x- and y-axis rotations per qubit per layer. Control parameters were generated by a classical encoder W and optimized jointly with inference controls using Adam.
- Entanglement–decoherence trade-off: Simulations varying pulse length t and coherence times showed that average confidence (1 − L) initially increases with t for sufficiently large T1 or T2 (e.g., T1 = 20 μs), then decreases as decoherence dominates, indicating an optimal control duration and layer count.
- Role of encoder vs QNN (LDA analysis): After normalizing cluster-center distances, the standard deviations were: raw data 0.1658; control pulses 0.2903 (compression but increased spread); encoded quantum states 0.0919 (sharp reduction via the nonlinear quantum mapping); final states showed no further improvement. This indicates the classical encoder compresses inputs, while the QNN accomplishes the bulk of class separation.
- Agreement with simulations and robustness: Experimental training curves closely matched simulations; minor deviations were attributed to simplified modeling of higher-order couplings and parameter drift.
- System specifics: For the 2-digit task, Q3 and Q5 operated at 6.08 and 6.45 GHz (flux sweet spots) with effective coupling J35/2π = 4.11 MHz; labels were read out via majority vote with 5000 shots per sample (see the readout sketch after this list). For the 4-digit task, Q3, Q5, and Q6 operated at 6.08, 6.45, and 6.19 GHz, with Q3 and Q5 as label qubits.
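For reference, the majority-vote readout mentioned above (5000 shots per sample) amounts to picking the most frequent label-qubit bitstring, as in the short sketch below; the bitstring format and the simulated shot distribution are illustrative assumptions rather than the authors' readout code.

```python
from collections import Counter
import random

def majority_vote(shots):
    """Return the label-qubit bitstring that occurs most often among the shots."""
    return Counter(shots).most_common(1)[0][0]

# Example: 5000 simulated shots for one sample of the 4-digit task (two label qubits)
random.seed(0)
shots = random.choices(['00', '01', '10', '11'], weights=[0.1, 0.7, 0.1, 0.1], k=5000)
print(majority_vote(shots))   # expected to print '01'
```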
Discussion
The study addresses the challenge of building effective, hardware-friendly quantum ML models for NISQ devices by demonstrating an end-to-end framework that jointly trains a classical data encoder and a pulse-parameterized QNN. By eliminating circuit compilation and qubit mapping, the approach directly leverages device-native controls, improving utilization of limited coherence resources. Empirical results show high accuracy on MNIST subsets without image downsizing, validating that the nonlinear control-to-state mapping enhances expressibility and that the trained dynamics can implement effective classification boundaries. The LDA analysis clarifies the division of labor: the classical encoder compresses the high-dimensional inputs, while the quantum evolution concentrates class clusters for separability. Simulations and experiments concur, and the observed dependence of confidence on pulse duration and coherence times underscores a key practical trade-off: longer pulses can increase entanglement and expressibility but are limited by decoherence, motivating optimal control durations and architecture depths. While no quantum advantage over classical algorithms is claimed, the demonstrated hardware efficiency and scalability suggest that, with more qubits and lower noise, end-to-end pulse-based learning could approach regimes where quantum expressibility yields performance benefits in more complex tasks.
Conclusion
The work experimentally demonstrates quantum end-to-end machine learning on a superconducting processor using a pulse-parameterized QNN jointly trained with a classical encoder. Without downsizing 784-pixel MNIST images, the system attains 98% accuracy for 2-digit classification using two qubits and 89% for 4-digit classification using three qubits, with results consistent with calibrated simulations. The approach is hardware-friendly, avoids circuit compilation and qubit mapping, and effectively exploits limited NISQ resources. Analyses reveal that the classical encoder performs input compression while the quantum dynamics establish class separability, and simulations highlight an optimization trade-off between entanglement generation and decoherence. Future research directions include scaling to larger qubit counts, optimizing pulse sequences and layer depths, improving coherence and control calibration, and extending the framework to unsupervised and generative learning tasks where quantum advantages may emerge.
Limitations
- No claim of quantum advantage over classical ML; results are proof-of-principle on limited-qubit NISQ hardware.
- Small number of classes and qubits (2 and 4 digits; 2–3 qubits) restricts task complexity and generalizability to larger problems.
- Performance depends on coherence times; longer pulses or deeper networks face decoherence-induced degradation, requiring careful optimization of control duration and depth.
- Minor discrepancies between experiment and simulation are attributed to simplified modeling of higher-order couplings and system parameter drift.
- Gradients estimated via finite differences increase experimental overhead; while mitigated by batch averaging and the encoder-gradient computation, scalability of gradient estimation remains a practical concern.