Introduction
Characterizing the quantum states prepared by modern quantum simulators is increasingly challenging because traditional quantum state tomography (QST) methods require resources that scale exponentially with system size. Standard techniques such as maximum likelihood estimation (MLE) suffer from the curse of dimensionality: they need an exponentially large amount of experimental data, and their classical post-processing becomes intractable. Desirable properties for a QST scheme include sub-exponential scaling in both experimental data and classical post-processing, observable universality (faithful reconstruction of any observable), and state universality (indifference to the target state). Existing methods often compromise on one or more of these properties. Variational approaches, such as matrix-product state tomography and compressed sensing, restrict the state space to improve efficiency but give up universality. Neural network quantum state tomography (NN-QST) offers a promising alternative by leveraging the universal approximation capabilities of neural networks. However, choosing the variational ansatz and understanding its limitations remain significant challenges. This research explores convolutional neural networks (CNNs) as a variational ansatz for NN-QST, motivated by their efficient encoding of volume-law entanglement.
Literature Review
Recent work has explored neural networks for QST, including recurrent neural networks (RNNs) and attention-based models. These approaches have shown promise, but applications to experimental systems have been limited to small numbers of qubits, in part because their performance advantage over standard methods has remained unclear. This paper addresses that gap by quantitatively comparing a CNN-based NN-QST scheme with standard methods such as MLE across a range of scenarios.
Methodology
The proposed method encodes the quantum state as a probability distribution over the measurement outcomes of an informationally complete positive operator-valued measure (POVM), and approximates this distribution with a CNN. The Pauli-4 POVM is employed, which comprises four operators per qubit derived from single-qubit Pauli measurements. The variational parameters of the CNN are optimized with the Adam optimizer to maximize the likelihood of the experimental data. Two CNN architectures are considered: a standard CNN and an autoregressive CNN (ARCNN). The standard CNN uses circular or open boundary conditions, depending on the symmetries of the target state, and employs a product or dense output layer for normalization. The ARCNN exploits the autoregressive property to allow exact normalization and sampling without Markov chains. The expressivity of both architectures is analyzed, showing that there is a maximum distance beyond which correlations are not captured and that this distance is set by the network depth and kernel size. The method is benchmarked against MLE and against direct estimation of observables from the experimental data, using two metrics: the classical infidelity and the root mean square (RMS) error of observables. Synthetic data is used to enable a fair comparison.
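As a minimal sketch of how a state maps to POVM outcome probabilities, the block below builds a single-qubit Pauli-4 POVM and evaluates p(a) = Tr(M_a ρ). It assumes the commonly used definition (three Pauli eigenstate projectors weighted by 1/3, plus the remainder that completes the identity), which the summary above does not spell out. The helper names are hypothetical and chosen only for illustration.

```python
import math

# 2x2 complex matrices are stored as nested lists; all helper names are hypothetical.

def projector(v):
    """Return |v><v| for a normalized two-component state vector v."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def scale(m, c):
    """Multiply every entry of matrix m by the scalar c."""
    return [[c * x for x in row] for row in m]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

s = 1 / math.sqrt(2)
ket_0 = [1 + 0j, 0j]          # +1 eigenstate of Pauli Z
ket_plus = [s + 0j, s + 0j]   # +1 eigenstate of Pauli X
ket_r = [s + 0j, s * 1j]      # +1 eigenstate of Pauli Y

identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Assumed Pauli-4 definition: three weighted projectors plus the remainder.
m0 = scale(projector(ket_0), 1 / 3)
m1 = scale(projector(ket_plus), 1 / 3)
m2 = scale(projector(ket_r), 1 / 3)
m3 = add(identity, scale(add(add(m0, m1), m2), -1))
PAULI4 = [m0, m1, m2, m3]

def outcome_probabilities(rho):
    """Probability of each of the four outcomes for a density matrix rho."""
    return [trace_product(m, rho).real for m in PAULI4]

# Example: the state |0><0|.
probs = outcome_probabilities(projector(ket_0))
```

For N qubits the POVM elements would be tensor products of these single-qubit operators, so the outcome distribution lives on 4^N strings. That is the object the network is trained to approximate.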
Key Findings
The study benchmarks the CNN-based NN-QST against MLE and direct sampling for various quantum systems. For small systems where MLE is feasible, the NN-QST achieves a two- to five-fold reduction in infidelity compared to MLE, especially for smaller datasets. For larger systems where MLE is intractable, the CNN-based method significantly reduces the RMS error of local observables, especially for small datasets and for states closer to product states. In the analysis of a 16-site long-range interacting ion chain with added dephasing noise, the ARCNN makes it possible to estimate higher-order correlation functions than direct sampling from the dataset allows. Local MLE, applied to both the raw data and the network-generated data, shows that the ARCNN performs on par with MLE at greatly reduced computational complexity. The study also examines steady states of a driven-dissipative 2D system. The CNN successfully captures the dissipative phase transition and yields lower RMS errors than direct sampling, although the advantage diminishes for observables involving large sums of similar correlators because of the bias inherent in the variational approach. The network is evaluated on how accurately it reconstructs the target POVM distribution, and it shows a strong advantage over traditional methods even in the presence of noise.
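The two metrics behind these comparisons can be sketched as follows. This is an illustration under assumptions, not the paper's exact definitions: the classical fidelity is taken in its Bhattacharyya form, the sum over outcomes of sqrt(p(a) q(a)), and the RMS error is taken over a list of observable estimates against their exact values. The function names are hypothetical.

```python
import math

def classical_infidelity(p, q):
    """Return 1 - sum_a sqrt(p(a) * q(a)) for two outcome distributions.

    p and q are dicts mapping outcome labels to probabilities; outcomes
    missing from a dict are treated as having probability zero.
    """
    outcomes = set(p) | set(q)
    fidelity = sum(math.sqrt(p.get(a, 0.0) * q.get(a, 0.0)) for a in outcomes)
    return 1.0 - fidelity

def rms_error(estimates, exact):
    """Root mean square deviation between estimated and exact observable values."""
    if len(estimates) != len(exact):
        raise ValueError("estimates and exact must have the same length")
    squared = [(e - x) ** 2 for e, x in zip(estimates, exact)]
    return math.sqrt(sum(squared) / len(squared))

# Example: an empirical distribution from a few samples versus the target.
target = {"00": 0.5, "11": 0.5}
empirical = {"00": 0.6, "11": 0.3, "01": 0.1}
infidelity = classical_infidelity(target, empirical)
```

Note that the classical infidelity compares full outcome distributions, so evaluating it exactly is only feasible for small systems. The RMS error of observables remains usable when the 4^N outcome space is too large to enumerate.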
Discussion
The results demonstrate the efficiency and accuracy of the proposed CNN-based NN-QST method across a range of systems and scenarios. Its ability to outperform MLE for smaller datasets and to reduce the error of observable estimates for larger systems highlights its practical significance. The ARCNN offers advantages in exact normalization, sampling, and ease of training. The bias associated with the variational approach means that results should be validated and interpreted with the target observable in mind. The method's success in capturing complex features such as phase transitions in driven-dissipative systems further supports its potential for experimental settings. The observed scaling of computational cost also suggests that the approach may extend to larger systems.
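The link between the autoregressive property and exact normalization and sampling can be illustrated with a toy model. In the sketch below, `toy_conditional` is a hypothetical stand-in for the network: it returns a normalized conditional distribution p(a_i | a_1, ..., a_{i-1}) over four outcomes. Because every conditional is normalized, the product over sites is normalized by construction, and samples can be drawn site by site without a Markov chain.

```python
import itertools
import random

NUM_OUTCOMES = 4  # four outcomes per site, as for a four-element single-qubit POVM

def toy_conditional(prefix):
    """Hypothetical stand-in for the network output p(a_i | earlier outcomes).

    This toy rule favors repeating the previous outcome; a real ARCNN would
    compute the conditional from the prefix with masked convolutions.
    """
    weights = [1.0] * NUM_OUTCOMES
    if prefix:
        weights[prefix[-1]] += 2.0
    total = sum(weights)
    return [w / total for w in weights]

def probability(outcomes):
    """Joint probability as a product of conditionals (chain rule)."""
    p = 1.0
    for i, a in enumerate(outcomes):
        p *= toy_conditional(outcomes[:i])[a]
    return p

def sample(num_sites, rng):
    """Draw one outcome string by sampling each site given the previous ones."""
    outcomes = []
    for _ in range(num_sites):
        cond = toy_conditional(outcomes)
        outcomes.append(rng.choices(range(NUM_OUTCOMES), weights=cond)[0])
    return outcomes

# Normalization check by brute force on a small chain.
num_sites = 3
total_probability = sum(
    probability(s) for s in itertools.product(range(NUM_OUTCOMES), repeat=num_sites)
)
draw = sample(num_sites, random.Random(0))
```

The brute-force sum is only there to make the normalization visible on three sites. The point of the autoregressive construction is that this sum never needs to be computed for large systems.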
Conclusion
This research presents a novel and efficient quantum state tomography scheme based on convolutional neural networks. Quantitative benchmarks against standard techniques demonstrate significant advantages in fidelity, error reduction, and computational cost across diverse quantum systems. The autoregressive CNN architecture offers particular benefits. Future research should apply the method to more complex systems and to experimental data, and compare its performance against alternative techniques such as shadow tomography. Investigating more advanced network architectures and training methods could further improve the accuracy and efficiency of NN-QST.
Limitations
The study primarily employs synthetic data, which limits its direct applicability to real-world experimental scenarios. The variational nature of the approach introduces a bias whose impact varies with the observable and the system. Extending the autoregressive approach to higher-dimensional systems remains a subject for future research.