
Computer Science
Flexible learning of quantum states with generative query neural networks
Y. Zhu, Y. Wu, et al.
Discover GQNQ, the generative query neural network developed by Yan Zhu and colleagues, which learns representations of multiple quantum states from classical measurement data! This tool not only predicts outcomes of unperformed measurements but also identifies phases of matter and clusters quantum states, paving the way for flexible quantum state characterization.
~3 min • Beginner • English
Introduction
Accurate characterization of quantum hardware is essential for developing, certifying, and benchmarking quantum technologies. Traditional approaches include quantum state tomography, classical shadow estimation, partial state characterization, and quantum state learning. Recent neural-network-based methods have shown promise but typically require training on experimental data from the very state to be characterized, making the learned information state-specific and non-transferable. This limits efficiency in scenarios involving multiple states, such as clustering, classification, and cross-platform verification. The research question addressed here is whether a neural network can be trained offline on simulated data from a family of fiducial states and measurements, and then used to characterize new, structurally similar states by predicting unperformed measurement outcomes. The paper proposes GQNQ, a learner that constructs a data-driven state representation from partial measurement data and uses it for flexible prediction, online updates, and downstream tasks like clustering and phase identification.
Literature Review
The paper reviews several streams: (i) state characterization methods such as quantum tomography (including compressed sensing and efficient tomography), classical shadows, and partial characterization; (ii) neural-network approaches to state tomography and generative modeling (e.g., neural-network quantum state tomography, variational autoencoders, cGANs, and basis-dependent networks) that are typically trained per state; (iii) tasks involving multiple states (clustering, classification, cross-platform verification); and (iv) conceptual links to representation learning and the generative query network (GQN) for classical scenes. It contrasts GQNQ’s multi-state, offline training with prior state-specific training and notes conceptual similarity with Aaronson’s pretty good tomography, though GQNQ’s state representation is learned and can be lower dimensional for structured state families.
Methodology
Framework: Given an unknown quantum state ρ and a set of measurements M (each element a POVM with n_o outcomes), the experimenter performs a random subset S = {M_i} ⊂ M of size s, obtaining outcome frequencies p_i for each M_i. Each measurement M_i is mapped to a parametrization enc(M_i) = m_i. The goal is to predict outcome probabilities for an unperformed measurement M′ ∈ M \ S from its parametrization m′.
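The paper leaves the concrete form of enc(M) open (it only needs to identify the measurement, and can even be encrypted). Purely for illustration, here is one plausible encoding for the two-qubit nearest-neighbor Pauli settings used later, built from concatenated one-hot vectors; the function name enc_pauli and the layout are hypothetical, not the paper's encoding.

```python
import numpy as np

def enc_pauli(site, p1, p2, n_sites, paulis="XYZ"):
    """Hypothetical enc(M) for a two-qubit Pauli measurement (p1, p2) on
    neighboring qubits (site, site+1) of an n_sites-qubit chain: one-hot
    site index concatenated with one-hot labels for the two Pauli bases."""
    v = np.zeros(n_sites + 2 * len(paulis))
    v[site] = 1.0
    v[n_sites + paulis.index(p1)] = 1.0
    v[n_sites + len(paulis) + paulis.index(p2)] = 1.0
    return v

m = enc_pauli(site=3, p1="X", p2="Z", n_sites=10)  # parametrization vector m
```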
Model (GQNQ): A generative query neural network adapted to quantum states with two components:
- Representation network f_θ: takes the pairs (m_i, p_i) for i = 1..s and outputs vectors r_i = f_θ(m_i, p_i). An aggregation function then computes the state representation r = (1/s) Σ_i r_i. No explicit state parametrization is needed; r depends only on statistics from the target state and on the parameters θ, which are fixed after training.
- Generation network g_η: inputs the state representation r and a requested measurement parametrization m' and outputs predicted outcome probabilities p' = g_η(r, m'). The model does not require explicit POVM operators; an encrypted parametrization suffices.
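A minimal PyTorch sketch of this two-network design, assuming simple MLPs; the layer sizes, activations, and softmax output head are illustrative choices, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class GQNQ(nn.Module):
    """Sketch of the two-network architecture (illustrative layer sizes)."""
    def __init__(self, m_dim, p_dim, r_dim=24, hidden=128):
        super().__init__()
        # Representation network f_theta: (m_i, p_i) -> r_i
        self.f = nn.Sequential(
            nn.Linear(m_dim + p_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, r_dim),
        )
        # Generation network g_eta: (r, m') -> predicted outcome distribution p'
        self.g = nn.Sequential(
            nn.Linear(r_dim + m_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, p_dim),
        )

    def represent(self, ms, ps):
        # ms: (s, m_dim), ps: (s, p_dim); average the per-measurement vectors.
        return self.f(torch.cat([ms, ps], dim=-1)).mean(dim=0)

    def forward(self, ms, ps, m_query):
        r = self.represent(ms, ps)
        logits = self.g(torch.cat([r, m_query], dim=-1))
        return torch.softmax(logits, dim=-1)  # valid probability vector p'
```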
Training (offline option): Train θ and η using simulated data from a fiducial set of states Q and measurements M. For each batch of B states, select a random subset M_b⊂M to feed as context (m,p) pairs; predict outcomes for M\M_b and optimize a loss comparing predictions with Born-rule probabilities via gradient descent (Adam). Training repeats for E epochs, typically B=30, E=200. After training, θ and η are fixed and reused across multiple target states and arbitrary subsets S chosen at test time.
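A toy offline training loop in the same spirit, reusing the GQNQ sketch above with synthetic placeholder data standing in for the simulated Born-rule statistics; the cross-entropy loss is one plausible choice, as the paper's exact loss is not reproduced here.

```python
import torch

torch.manual_seed(0)
model = GQNQ(m_dim=8, p_dim=4)  # toy sizes; reuses the GQNQ class sketched above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder dataset: for each fiducial state, the parametrizations and
# Born-rule probabilities of every measurement in M (random stand-ins here).
N_meas, n_states = 60, 30
states = [(torch.randn(N_meas, 8),
           torch.softmax(torch.randn(N_meas, 4), dim=-1)) for _ in range(n_states)]

for epoch in range(200):                      # E = 200 epochs, as in the paper
    for ms, ps in states:                     # one (toy) batch of B = 30 states
        ctx = torch.randperm(N_meas)[:30]     # random context subset M_b
        qry = torch.tensor([i for i in range(N_meas)
                            if i not in set(ctx.tolist())])  # predict M \ M_b
        r = model.represent(ms[ctx], ps[ctx])
        logits = model.g(torch.cat([r.expand(len(qry), -1), ms[qry]], dim=-1))
        pred = torch.softmax(logits, dim=-1)
        # Cross-entropy against the Born-rule target distributions.
        loss = -(ps[qry] * torch.log(pred + 1e-9)).sum(dim=-1).mean()
        opt.zero_grad(); loss.backward(); opt.step()
```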
Online prediction: Initialize r^(0) = 0. At time step t, with a new pair (m, p), compute r_e = f_θ(m, p) and update r^(t) = ((t−1) r^(t−1) + r_e)/t, i.e., the running average of all representation vectors received so far. Feed r^(t) and any query m′ to g_η to update predictions without storing past data or performing costly tomography.
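A small sketch of this online update, wrapping the running average around the networks above; the class and method names are hypothetical.

```python
import torch

class OnlineRepresentation:
    """Running average r^(t) = ((t-1) r^(t-1) + f_theta(m, p)) / t, so past
    measurement data never needs to be stored. Reuses the GQNQ sketch above."""
    def __init__(self, model, r_dim=24):
        self.model, self.r, self.t = model, torch.zeros(r_dim), 0

    def update(self, m, p):
        with torch.no_grad():               # inference only; networks are fixed
            self.t += 1
            r_e = self.model.f(torch.cat([m, p], dim=-1))
            self.r = ((self.t - 1) * self.r + r_e) / self.t
        return self.r

    def predict(self, m_query):
        with torch.no_grad():
            logits = self.model.g(torch.cat([self.r, m_query], dim=-1))
            return torch.softmax(logits, dim=-1)
```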
Experimental setups:
- Spin systems (Ising and XXZ chains): Use ground states for various couplings; evaluate with six-qubit full Pauli measurements (informationally complete) and with larger systems (10, 20, 50 qubits) using all two-qubit nearest-neighbor Pauli measurements; randomly choose s=30 measurements for representation; representation vector dimension set to 24.
- Continuous-variable (CV) systems: Use homodyne measurements of quadratures with phase θ∈[0,π). For evaluation, pick s=10 random homodyne settings. Outcomes are truncated to [−6,6] and discretized into 100 bins; the representation vector dimension is set to 16. Both noiseless data and noisy conditions are considered (Gaussian noise on the outcome probabilities with variance 0.05, and Gaussian perturbations of θ with variance 0.05), alongside finite-statistics sampling in the qubit experiments (e.g., 50 or 10 shots); a sketch of these noise models follows this list.
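A minimal sketch of the noise models and finite-shot sampling, assuming the stated variance 0.05 corresponds to a per-outcome Gaussian standard deviation of sqrt(0.05); the paper's exact noise-injection procedure may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_probs(p, var=0.05):
    """Gaussian noise on outcome probabilities, then clip and renormalize.
    (Assumes 'variance 0.05' means per-outcome Gaussian variance.)"""
    q = np.clip(p + rng.normal(0.0, np.sqrt(var), size=p.shape), 0.0, None)
    return q / q.sum()

def noisy_phase(theta, var=0.05):
    """Gaussian perturbation of the homodyne phase setting."""
    return theta + rng.normal(0.0, np.sqrt(var))

def finite_shot_probs(p, shots=50):
    """Empirical frequencies from a finite number of measurement shots."""
    return rng.multinomial(shots, p) / shots
```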
Data generation: For spin systems, ground states are computed exactly (6 qubits) or via DMRG (10, 20, 50 qubits). CV state data generated with Strawberry Fields. Training/test splits are specified per family (e.g., 40 training and 10 test states for each coupling in spin chains; for CV, 10,000 states per family split 4:1). Implementation uses PyTorch and Adam optimizer.
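For intuition, exact homodyne statistics for a pure CV state can also be computed directly in NumPy from its Fock-basis amplitudes, shown here for a cat state with the [−6, 6] / 100-bin discretization described above; this standalone sketch is not the paper's Strawberry Fields pipeline.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite import hermval

def quadrature_distribution(c, theta, xs):
    """p_theta(x) for a pure state with Fock amplitudes c_n, using
    psi_n(x) = pi^{-1/4} (2^n n!)^{-1/2} H_n(x) exp(-x^2/2), hbar = 1.
    Homodyne at phase theta multiplies c_n by exp(-i n theta)."""
    rotated = c * np.exp(-1j * theta * np.arange(len(c)))
    psi = np.zeros((len(c), len(xs)))
    for k in range(len(c)):
        coeff = np.zeros(k + 1); coeff[k] = 1.0
        psi[k] = (np.pi ** -0.25 / np.sqrt(2.0 ** k * factorial(k))
                  * hermval(xs, coeff) * np.exp(-xs ** 2 / 2))
    return np.abs(rotated @ psi) ** 2

# Even cat state (|alpha> + |-alpha>)/N in a truncated Fock basis.
alpha, cutoff = 2.0, 30
n = np.arange(cutoff)
fact = np.array([factorial(int(k)) for k in n], dtype=float)
c = np.exp(-alpha ** 2 / 2) * (alpha ** n + (-alpha) ** n) / np.sqrt(fact)
c = c / np.linalg.norm(c)

# Discretize outcomes into 100 bins on [-6, 6], as in the CV setup above.
edges = np.linspace(-6.0, 6.0, 101)
centers = 0.5 * (edges[:-1] + edges[1:])
density = quadrature_distribution(c.astype(complex), theta=0.3, xs=centers)
probs = density * np.diff(edges)   # density -> per-bin probability (midpoint rule)
probs /= probs.sum()               # renormalize after Fock/grid truncation
```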
Key Findings
- Six-qubit experiments (Table 1; average classical fidelities of predicted vs. ground-truth distributions, reported as noiseless / 50 shots / 10 shots):
  • Ising, ferromagnetic bias: 0.9870 / 0.9869 / 0.9862
  • Ising, antiferromagnetic bias: 0.9869 / 0.9867 / 0.9849
  • Ising, no bias: 0.9895 / 0.9894 / 0.9894
  • XXZ, ferromagnetic bias: 0.9809 / 0.9802 / 0.9787
  • XXZ, XY-phase bias: 0.9601 / 0.9548 / 0.9516
  • Mixed families (i–v): 0.9567 / 0.9547 / 0.9429
  • GHZ with local rotations: 0.9744 / 0.9744 / 0.9742
  • W with local rotations: 0.9828 / 0.9826 / 0.9821
  • Mixed (i–v, vii, viii): 0.9561 / 0.9543 / 0.9402
  Finite statistics degrade performance only slightly. Accurate predictions remain feasible even when the training measurements are not informationally complete (e.g., 72 random Pauli measurements).
- Generalization limits: Performance drops drastically when trained and tested on arbitrary six-qubit states without structure; overrepresentation of some state types can cause overfitting (Supplementary Notes).
- Larger qubit systems (10/20/50 qubits; Fig. 2): High average classical fidelities across J (Ising) with slight drops near the phase transition J=0. For XXZ, fidelities are lower in the gapless XY phase and drop near Δ=±1 (critical points), consistent with higher quantum fluctuations and sparser training coverage.
- Continuous-variable states (Table 2; average and worst-case classical fidelities against ideal noiseless distributions):
  • Squeezed thermal: noiseless avg≈0.9973 (worst≈0.9890); with prob. noise avg≈0.9762 (worst≈0.9405); with θ-noise avg≈0.9758 (worst≈0.9405)
  • Cat states: noiseless avg≈0.9964 (worst≈0.9870); with prob. noise avg≈0.9746 (worst≈0.9359); with θ-noise avg≈0.9658 (worst≈0.9077)
  • GKP states: noiseless avg≈0.9972 (worst≈0.9889); with prob. noise avg≈0.9264 (worst≈0.8387); with θ-noise avg≈0.9643 (worst≈0.9030)
  • Mixed families (i–iii): noiseless avg≈0.9827 (worst≈0.9512); with prob. noise avg≈0.9822 (worst≈0.9461)
- Online learning (Fig. 3): For cat states, average classical fidelity increases over 15 time steps as new homodyne data arrives, confirming effective online updates.
- Clustering and classification (Fig. 4): t-SNE embeddings of the representation vectors reveal clear clustering by state type and phase; Gaussian mixture clustering matches state types with 94.67% accuracy for six-qubit states and 100% for CV states (see the sketch after this list). A supervised classifier on the representations distinguishes pure vs. mixed ferromagnetic Ising regimes with success rates of 100% (10 qubits), 100% (20 qubits), and 99% (50 qubits).
- Unsupervised per-state training (Table 3; cat states): After training on data from a single cat state, average/worst-case fidelities for query predictions remain high (e.g., |2,0⟩ cat: avg 0.9918 (s=50), worst 0.9614; avg 0.9912 (s=10), worst 0.9610).
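A sketch of the clustering pipeline referenced above (Fig. 4), assuming scikit-learn's TSNE and GaussianMixture; whether the mixture model is fit on the raw representations or on the 2-D embedding is an implementation choice made here, and the placeholder data stand in for representations produced by the trained f_θ.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.mixture import GaussianMixture

# reps: (n_states, r_dim) array of GQNQ representation vectors, one per state
# (random placeholders here; in practice these come from the trained network).
rng = np.random.default_rng(0)
reps = rng.normal(size=(100, 24))

emb = TSNE(n_components=2, perplexity=30).fit_transform(reps)  # 2-D visualization
labels = GaussianMixture(n_components=5).fit_predict(reps)     # unsupervised clusters
```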
Discussion
The study addresses whether a neural network can be trained offline on simulated data and later used to characterize new, structurally similar quantum states. Results show that GQNQ constructs effective, compact state representations from partial measurement data and accurately predicts outcomes for unperformed measurements across multiple structured families (spin chains, GHZ/W, CV states). High fidelities persist under finite statistics and moderate noise, and the method scales to larger qubit counts with predictable drops near critical points. The learned representations are versatile for downstream tasks, enabling unsupervised clustering of states (including phase separation) and supervised classification of physical regimes. Compared to conventional tomography, GQNQ avoids explicit density-matrix reconstruction and functions with encrypted measurement parametrizations, offering computational and privacy advantages. The method’s performance depends on structural regularity of the state family and similarity to the fiducial training set, aligning with the intended use-case of many-body ground states and other regular families.
Conclusion
GQNQ introduces a flexible neural architecture for quantum state learning that supports offline, multi-purpose training on simulated fiducial data and subsequent prediction of unperformed measurement statistics for new but structurally similar states. It provides compact, data-driven state representations that generalize across families, enable online updates, and support clustering and classification tasks. Extensive experiments on spin-chain ground states, GHZ/W, and continuous-variable states demonstrate high predictive fidelities under realistic noise and finite statistics. Future work includes developing criteria to predict which state families are effectively learnable by GQNQ, improving robustness across phase transitions and heterogeneous interaction patterns, and further integrating online unsupervised training protocols.
Limitations
- Generalization requires structural similarity between target states and the fiducial training family; performance drops on arbitrary, unstructured states.
- Sensitivity near quantum critical points and in phases with strong quantum fluctuations (e.g., XY phase) reduces fidelity.
- Potential overfitting when some state types are overrepresented in training.
- Performance degrades when interaction patterns vary widely across the family (e.g., mixed ferromagnetic/antiferromagnetic couplings).
- Although GQNQ can work with non-informationally complete training sets, reduced measurement diversity may limit accuracy.
- Requires a reasonable guess of the state family for effective offline training; otherwise, online training is needed.