Measuring magic on a quantum processor
S. F. E. Oliveira, L. Leone, et al.
The paper addresses how to accurately and robustly measure “magic,” here identified with the stabilizer Rényi entropy, on noisy intermediate-scale quantum (NISQ) hardware. While state preparation, Clifford gates, and computational-basis measurements can be made fault tolerant, circuits restricted to Clifford resources are efficiently classically simulable; quantum advantage requires non-Clifford resources (magic states). These resources are powerful but fragile, and their amount must be measurable and calibrated for practical computation. Decoherence is not magic-preserving and can either increase or decrease magic; inaccurate Clifford gates can inadvertently create magic, signaling noise. Prior magic measures often involved extremization and lacked practical experimental schemes. The authors propose a practical protocol using randomized measurements to estimate a magic measure (the stabilizer Rényi entropy), with resources scaling favorably compared to full tomography (O(d^2) vs. O(d^4)) and relying only on local Clifford operations.
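To make the quantity concrete, here is a brute-force sketch of the pure-state stabilizer 2-Rényi entropy, M_2(|ψ⟩) = −log2(d^−1 Σ_P ⟨ψ|P|ψ⟩^4), with the sum over all 4^n n-qubit Pauli strings. The function name and the brute-force approach are illustrative (not the paper's estimator) and feasible only for small n:

```python
import numpy as np
from itertools import product

# Single-qubit Pauli matrices I, X, Y, Z
PAULIS = [np.eye(2),
          np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]),
          np.diag([1.0, -1.0])]

def stabilizer_renyi_2(psi):
    """Brute-force M_2(|psi>) = -log2( d^{-1} * sum_P <psi|P|psi>^4 ),
    summing over all 4^n n-qubit Pauli strings (small n only)."""
    n = int(np.log2(psi.size))
    d = 2 ** n
    total = 0.0
    for factors in product(PAULIS, repeat=n):
        P = factors[0]
        for f in factors[1:]:
            P = np.kron(P, f)
        total += np.real(np.vdot(psi, P @ psi)) ** 4
    return -np.log2(total / d)

# |T> = (|0> + e^{i*pi/4}|1>)/sqrt(2) has M_2 = log2(4/3) ~ 0.415,
# while stabilizer states such as |0> give M_2 = 0.
t_state = np.array([1.0, np.exp(1j * np.pi / 4)]) / np.sqrt(2)
```

Stabilizer states yield exactly zero, which is the resource-theoretic sense in which M_2 counts non-Clifford content.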
Background establishes that Clifford-only computation is classically simulable and that non-Clifford resources (e.g., T states and non-Gaussian resources) enable quantum advantage. A recent theoretical framework defines stabilizer Rényi entropy as a magic measure, and prior work considered magic state distillation, constraints and quantification of magic, and simulation bounds. However, experimental methods to measure magic directly were lacking or relied on global randomized measurements or extremization. The present work builds on randomized measurement techniques and the Clifford group forming a unitary 2-design to enable local, scalable estimation of stabilizer-based magic.
Magic measure definition: the authors consider a stabilizer-Rényi-entropy-based magic measure M_e that depends on two ingredients: a stabilizer purity-like quantity W(ρ) and the 2-Rényi entropy S(ρ). They propose estimating both via randomized measurements using only local single-qubit Clifford operations, exploiting the fact that the Clifford group is a unitary 2-design.
Protocol (idealized steps): (i) sample N_U random local Clifford operators C = ⊗_j C_j from the single-qubit Clifford group; (ii) prepare the target state ρ via a circuit U (which can include non-Clifford gates such as T gates); (iii) apply C to the state; (iv) measure in the computational basis; (v) repeat the measurement N_M times per C to estimate the occupation probabilities P(s|C); (vi) repeat over the ensemble of sampled C to estimate expectation values. From the estimated probabilities, compute the purity P(ρ) = Tr(ρ^2) and the stabilizer purity W(ρ) via linear estimators that combine the measured probabilities with predetermined weighting coefficients, derived from diagonal operators O2 and O4 whose local Clifford averages reproduce the desired invariants. Because the Clifford group forms a 2-design, the same measurement data supports estimating both quantities.
Experimental setup: the protocol is run on IBM Quantum Falcon processors (e.g., ibmq_lima, ibmq_quito; additional runs referenced on ibmq_casablanca) for single- and multi-qubit test states. State families are prepared in layers: initialize |0⟩^⊗n, apply H gates to obtain |+⟩^⊗n, inject magic via T gates in one or two layers, and entangle via CX layers between the T layers, producing states parameterized by the total number of T gates t (1 to 2n−1).
Measurement resources: statistical analysis shows the estimators are unbiased; total resources scale as N_U×N_M = O(ε^−2) to estimate the stabilizer purity within error ε, with variance bounds derived under finite sampling of local Cliffords and finite shots per unitary.
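The purity half of the protocol can be sketched in simulation. The snippet below builds the 24 single-qubit Cliffords from H and S, then computes the exact local-Clifford average of the standard randomized-measurement purity estimator with Hamming-distance weights (−2)^{−D(s,s′)} — i.e., the infinite-sample limit of steps (i)–(vi), with exact probabilities P(s|C) in place of finite-shot estimates. The analogous weighted estimator for the stabilizer purity W involves fourth moments and is not reproduced here; function names are illustrative:

```python
import numpy as np
from itertools import product

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S = np.diag([1.0, 1j])

def single_qubit_cliffords():
    """Generate all 24 single-qubit Cliffords (up to global phase) by closing
    the generator set {H, S} under multiplication."""
    def key(U):
        # Strip the global phase so equal-up-to-phase matrices compare equal.
        idx = np.argmax(np.abs(U) > 1e-9)
        V = U * np.exp(-1j * np.angle(U.flat[idx]))
        return tuple(np.round(V, 8).flatten())

    group = {key(np.eye(2)): np.eye(2)}
    frontier = [np.eye(2)]
    while frontier:
        nxt = []
        for U in frontier:
            for G in (H, S):
                V = G @ U
                k = key(V)
                if k not in group:
                    group[k] = V
                    nxt.append(V)
        frontier = nxt
    return list(group.values())

def purity_from_local_cliffords(rho, cliffords):
    """Exact average, over all tuples of local Cliffords C = C_1 x ... x C_n,
    of the estimator  X_C = 2^n * sum_{s,s'} (-2)^{-D(s,s')} P(s|C) P(s'|C),
    which equals Tr(rho^2) because the Clifford group is a 2-design."""
    d = rho.shape[0]
    n = int(np.log2(d))
    # (-2)^{-Hamming distance} weights over pairs of output bitstrings
    weights = np.array([[(-0.5) ** bin(s ^ t).count("1") for t in range(d)]
                        for s in range(d)])
    total, count = 0.0, 0
    for combo in product(cliffords, repeat=n):
        C = combo[0]
        for c in combo[1:]:
            C = np.kron(C, c)
        p = np.real(np.diag(C @ rho @ C.conj().T))   # exact P(s|C)
        total += d * p @ weights @ p
        count += 1
    return total / count
```

On hardware one would instead sample N_U Clifford tuples and estimate each P(s|C) from N_M shots; the exact group average above is the limit of that procedure as the statistical error ε → 0.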
Compared to full tomography, which requires O(d^4) expectation values, this approach uses O(d^2) resources for a given precision.
Noise modeling: deviations between the experimentally measured and theoretically predicted magic are attributed to (i) decoherence, modeled by a non-unitary channel mixing the state with |0⟩⟨0| with probability p, and (ii) imperfections in nominally Clifford gates, modeled as small unitary phase/displacement errors parameterized by ε. Purity measurements can isolate certain error contributions. By tuning the noise parameters, the model fits the measured magic across circuits with varying t.
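A minimal single-qubit sketch of the two noise mechanisms, assuming a purity-normalized mixed-state form M̃_2(ρ) = −log2(Σ_P Tr(Pρ)^4 / (d·Tr(ρ²)²)); the exact normalization used in the paper may differ, and all names here are illustrative:

```python
import numpy as np

PAULIS = [np.eye(2),
          np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]),
          np.diag([1.0, -1.0])]

def m2_mixed(rho):
    """Assumed purity-normalised single-qubit stabilizer 2-Renyi entropy:
    M~_2 = -log2( sum_P Tr(P rho)^4 / (d * Tr(rho^2)^2) ), d = 2."""
    w = sum(np.real(np.trace(P @ rho)) ** 4 for P in PAULIS)
    purity = np.real(np.trace(rho @ rho))
    return -np.log2(w / (2 * purity ** 2))

def mix_with_zero(rho, p):
    """Decoherence model from the text: mix with |0><0| with probability p."""
    zero = np.zeros_like(rho)
    zero[0, 0] = 1.0
    return (1 - p) * rho + p * zero

def noisy_s(eps):
    """Nominally Clifford S gate with a coherent phase error eps;
    non-Clifford (magic-injecting) whenever eps != 0."""
    return np.diag([1.0, np.exp(1j * (np.pi / 2 + eps))])

plus = np.full(2, 1 / np.sqrt(2))            # |+>: stabilizer state, zero magic
psi_err = noisy_s(0.1) @ plus                # imperfect Clifford applied to |+>
rho_err = np.outer(psi_err, psi_err.conj())  # m2_mixed(rho_err) > 0: spurious magic

t = np.array([1.0, np.exp(1j * np.pi / 4)]) / np.sqrt(2)
rho_t = np.outer(t, t.conj())                # mixing toward |0><0| lowers its magic here
```

The two effects match the fitted model in the text: a coherent error on a Clifford gate raises the measured magic above the ideal target, while mixing with |0⟩⟨0| (for this state) lowers it.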
- Experimental validation: on single-qubit magic states, the experimental stabilizer 2-Rényi entropy and purity closely match theoretical predictions, indicating low decoherence and accurate control for simple circuits.
- Multi-qubit behavior (n = 3, 4, 5): as n and circuit depth increase, purity degrades due to decoherence, and the measured magic often exceeds the theoretical values for the ideal target states. This counterintuitive increase indicates that imperfectly implemented Clifford gates inject additional (unwanted) magic, with stronger effects in larger and deeper circuits, particularly noticeable for low-magic target states.
- Noise characterization: the measured stabilizer 2-Rényi entropy across t provides data to fit a noise model with decoherence (state mixing) and gate imperfection (phase-like unitary error), yielding gray-square model points that match the experimental data increasingly well as circuit depth grows (Figs. 4–6).
- Robustness at high magic: analytical considerations (Supplementary Note 2) indicate that high-magic states can be more robust to low-entropy noise distributions than low-magic states, consistent with the experimental trends.
- Resource scaling: the local-Clifford randomized protocol enables simultaneous estimation of purity and stabilizer purity with total measurements N_U×N_M = O(ε^−2) and overall resource usage O(d^2) for a fixed-precision estimate, providing a practical alternative to tomography (O(d^4)).
Measuring magic is essential because no quantum advantage is possible without non-Clifford resources. The proposed local-Clifford randomized measurement protocol provides a practical path to quantify magic on real devices, serving as a hardware benchmark for the capacity to generate non-stabilizer states. The observation that imperfect Cliffords can inject magic reframes discrepancies as diagnostic signals of noise and control errors, enabling construction and calibration of realistic noise models. The results suggest that high-magic regimes can exhibit relative robustness to certain low-entropy noise channels, while low-magic states are more susceptible to spurious magic injection from control imperfections. Overall, the method links resource-theoretic quantities (stabilizer Rényi entropy) to experimentally accessible randomized measurements, facilitating evaluation and improvement of NISQ hardware for tasks requiring magic.
The paper introduces and experimentally demonstrates a randomized-measurement protocol using only local single-qubit Clifford operations to estimate stabilizer Rényi entropy (magic) and purity, enabling simultaneous extraction from the same data due to the Clifford 2-design property. Experiments on IBM Quantum devices validate the approach on single- and multi-qubit states, reveal that measured magic can exceed theoretical targets due to imperfect Clifford implementations, and show how these measurements can parametrize realistic noise models. This provides a practical tool to assess and improve hardware’s ability to generate non-stabilizer resources and to diagnose noise sources. Future directions include scaling to larger systems, refining estimators and variance bounds, integrating error mitigation tailored to magic estimation, and applying the protocol to benchmark and optimize magic-centric subroutines such as magic state distillation.
- Experimental sections show device- and depth-dependent decoherence that reduces purity and affects magic estimates.
- Imperfect implementation of Clifford gates can inject unwanted magic, complicating interpretation without a noise model.
- Statistical accuracy: the stabilizer purity lies in a bounded range that shrinks with system size; achieving an ε small enough for faithful estimates requires resources that can grow rapidly with qubit number, implying practical scalability limits.
- The reported noise models are phenomenological and require device-specific calibration; model assumptions (e.g., low-entropy noise, a simple mixing channel, small unitary errors) may not capture all error mechanisms.
- Results reference multiple IBM processors with varying performance; cross-device variability impacts generalizability.