
Computer Science
Quantum-inspired machine learning on high-energy physics data
T. Felser, M. Trenti, et al.
Timo Felser and colleagues apply quantum-inspired machine learning with tree tensor networks to the classification of b-jets at the LHCb experiment. They report performance comparable to a deep neural network, with added advantages in interpretability and adaptability for real-time applications.
~3 min • Beginner • English
Introduction
The study investigates whether quantum-inspired tensor network classifiers can effectively and explainably classify the charge of b-quark initiated jets (b vs. b̄) in LHCb data, comparing performance to a deep neural network and the classical muon tagging method. Motivated by emerging links between machine learning and quantum physics, and by the need for fast, accurate, and interpretable classifiers in high-energy physics triggers, the work uses a Tree Tensor Network (TTN) to access quantum-information metrics (entanglement entropy, correlations) for feature importance and model compression, aiming for real-time applicability at LHC scales.
Literature Review
Prior work has established deep neural networks as powerful tools across science and industry, and highlighted deep connections to quantum physics. Tensor Networks (TNs), developed for efficiently representing quantum many-body states, have shown promise in ML tasks with competitive performance on standard datasets. TNs provide access to information-theoretic quantities (e.g., entanglement entropy, correlations) that can inform model explainability and design. In jet physics at the LHC, a range of ML techniques have been applied to jet substructure, including identification of heavy-flavor jets and jet charge, with ATLAS and CMS deploying advanced ML for b-tagging. At LHCb, ML has been used for b- vs c-jet discrimination; however, ML for b-jet charge identification (b vs b̄) remains under-explored, with muon tagging previously providing the best published performance but limited efficiency.
Methodology
- Classifiers: A Tree Tensor Network (TTN) is used as the classifier weight tensor. Each event's features are mapped via a nonlinear feature map to a product state, and class scores are computed by overlaps with class-specific TTN states. Probabilities are normalized across classes. A Deep Neural Network (DNN) with three hidden layers of 96 nodes serves as a baseline.
- Feature map: Each input feature x_i is rescaled to [0, 1] and encoded as the local two-component vector [cos(π x_i), sin(π x_i)]^T; the full input state is the Kronecker product of these local vectors over all features (a minimal encoding sketch follows this list).
- TTN details: The TTN comprises hierarchically connected local tensors with an auxiliary bond dimension χ controlling expressive power and parameter count. Quantum-inspired diagnostics (entanglement entropy via Schmidt decomposition, and pairwise correlation functions) are computed efficiently on the trained TTN.
- Feature selection (QuIPS): Quantum-Information Post-learning feature Selection ranks features by (i) the bipartition entanglement entropy (information shared across TTN cuts) and (ii) pairwise correlations between feature sites. Low-entropy features and fully (anti-)correlated redundancies can be pruned to reduce model complexity without significant performance loss (see the entropy sketch after this list).
- Post-training compression (QIANO): Quantum-Information Adaptive Network Optimization truncates TTN bond dimensions after training, using an SVD that minimizes the infidelity, trading representation power for speed without retraining; a target prediction latency can thus be reached by adjusting χ (see the truncation sketch after this list).
- Dataset: Approximately 700k simulated LHCb events (open data) of b and b̄ jets at √s = 13 TeV. Jet selection: p_T > 20 GeV and η in [2.2, 4.2]. Simulation chain: PYTHIA 8 for the pp collisions and fragmentation, EvtGen for b-hadron decays, GEANT4 for the detector response, reconstructed with the LHCb framework.
- Input features (16 total): For each of five particle types (the highest-p_T muon, kaon, pion, electron, and proton in the jet), three observables are used: transverse momentum relative to the jet axis (p_T), electric charge q, and angular distance to the jet axis (ΔR), giving 5×3 = 15 features. If a particle type is absent, its features are set to zero. The 16th feature is the total jet charge Q = Σ_i q_i p_{T,i}² / Σ_i p_{T,i}².
- Training/testing: 60% training, 40% test split. Hyperparameters for both the TTN and the DNN are optimized for best performance on the training set. After training, each classifier outputs P_b (the probability of a b jet). A symmetric uncertainty threshold Δ around 0.5 defines an "unknown" region, chosen to optimize the tagging power ε_tag. Optimized thresholds: Δ_TTN = 0.40, Δ_DNN = 0.20.
- Metrics: Efficiency ε_eff (fraction of jets on which a decision is made), accuracy α (correct fraction among decided jets), and tagging power ε_tag = ε_eff (2α − 1)². ROC curves and AUC are also reported, and bias checks on physical quantities validate physics consistency (a short metrics sketch follows this list).
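To make the encoding and the overlap-based scoring above concrete, the following minimal Python sketch builds the product state for one event and normalizes per-class overlap scores. It is illustrative only: the dense weight vectors stand in for the class-specific TTN states, which in the actual model are contracted as tensor networks rather than expanded into full 2^N-dimensional vectors.

```python
import numpy as np

def feature_map(x):
    """Encode each rescaled feature x_i in [0, 1] as [cos(pi*x_i), sin(pi*x_i)]
    and build the full product state as a Kronecker product over features."""
    state = np.array([1.0])
    for xi in x:
        local = np.array([np.cos(np.pi * xi), np.sin(np.pi * xi)])
        state = np.kron(state, local)
    return state  # shape (2**len(x),)

def classify(x, class_weights):
    """Toy overlap-based classifier: score each class by |<W_c, phi(x)>|^2
    and normalize across classes (dense stand-in for the TTN contraction)."""
    phi = feature_map(x)
    scores = np.array([np.abs(w @ phi) ** 2 for w in class_weights])
    return scores / scores.sum()

# Hypothetical usage with 8 features and two classes (b vs. b-bar)
rng = np.random.default_rng(0)
x = rng.uniform(size=8)                                # rescaled input features
weights = [rng.normal(size=2 ** 8) for _ in range(2)]  # stand-ins for trained class states
print(classify(x, weights))                            # [P_b, P_bbar]
```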
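The QuIPS ranking relies on the entanglement entropy across bipartitions of the trained model. The sketch below computes that entropy for a toy dense state via a Schmidt (SVD) decomposition; on a trained TTN the same quantity can be obtained efficiently at the corresponding bond, without ever forming a dense vector.

```python
import numpy as np

def bipartition_entropy(psi, n_sites, cut):
    """Von Neumann entanglement entropy (in bits) across a cut of a toy state.

    psi : dense state vector over n_sites two-dimensional feature sites
    cut : number of sites in the left block of the bipartition
    The Schmidt spectrum is obtained from an SVD of the reshaped state.
    """
    m = psi.reshape(2 ** cut, 2 ** (n_sites - cut))
    s = np.linalg.svd(m, compute_uv=False)
    p = (s ** 2) / np.sum(s ** 2)             # Schmidt coefficients -> probabilities
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical usage: how much information the first site shares with the rest
rng = np.random.default_rng(1)
psi = rng.normal(size=2 ** 6)
psi /= np.linalg.norm(psi)
print(bipartition_entropy(psi, n_sites=6, cut=1))
```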
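QIANO's compression step is, at its core, a singular-value truncation of individual bonds. The snippet below shows that operation on a single bond matrix; the names and the single-bond scope are illustrative assumptions, since the actual procedure sweeps over the bonds of the trained TTN while controlling the overall infidelity.

```python
import numpy as np

def truncate_bond(theta, chi_max):
    """Truncate one bond to dimension chi_max via SVD.

    theta : matrix formed by grouping the indices on either side of the bond
    Returns the two truncated factors and the discarded weight, a proxy for
    the infidelity introduced by the truncation.
    """
    u, s, vh = np.linalg.svd(theta, full_matrices=False)
    keep = min(chi_max, len(s))
    discarded = float(np.sum(s[keep:] ** 2) / np.sum(s ** 2))
    left = u[:, :keep] * s[:keep]    # absorb the kept singular values on the left
    right = vh[:keep, :]
    return left, right, discarded

# Hypothetical usage: compress a 64x64 bond matrix down to chi = 5
rng = np.random.default_rng(2)
theta = rng.normal(size=(64, 64))
left, right, eps = truncate_bond(theta, chi_max=5)
print(left.shape, right.shape, f"discarded weight: {eps:.3f}")
```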
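Finally, a short sketch of the evaluation metrics defined above. The exact shape of the "unknown" window around 0.5 is an assumption here (jets with |P_b − 0.5| ≤ Δ are left undecided); efficiency, accuracy, and tagging power then follow the formulas in the list.

```python
import numpy as np

def tagging_metrics(p_b, is_b, delta):
    """Efficiency, accuracy, and tagging power for an uncertainty threshold delta.

    p_b  : predicted probability of the b class for each jet
    is_b : boolean ground truth (True for b, False for b-bar)
    Jets with |p_b - 0.5| <= delta fall into the "unknown" region (no decision).
    """
    decided = np.abs(p_b - 0.5) > delta
    eff = decided.mean()                               # epsilon_eff
    pred_b = p_b[decided] > 0.5
    acc = (pred_b == is_b[decided]).mean() if decided.any() else 0.0
    tag_power = eff * (2.0 * acc - 1.0) ** 2           # epsilon_tag = eff * (2*alpha - 1)^2
    return eff, acc, tag_power

# Hypothetical usage with random scores and the TTN threshold Delta = 0.40
rng = np.random.default_rng(3)
p_b = rng.uniform(size=1000)
is_b = rng.uniform(size=1000) < 0.5
print(tagging_metrics(p_b, is_b, delta=0.40))
```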
Key Findings
- Overall performance: TTN and DNN achieve similar accuracies and efficiencies on the test set: the TTN makes a decision on ε_eff = 54.5% of jets with accuracy α = 70.56%; the DNN decides on 55.3% of jets with accuracy α = 70.49%.
- ROC/AUC: ROC curves for TTN and DNN are nearly identical; AUC_TTN = 0.689, AUC_DNN = 0.690. Outputs are highly correlated (Pearson r = 0.97).
- Tagging power vs p_T: Both TTN and DNN deliver tagging power significantly above the classical muon tagging method across the full jet p_T range.
- Confidence distributions: DNN produces a more Gaussian-like confidence distribution with fewer extreme confidences; TTN shows a flatter distribution with pronounced peaks near 0 and 1, strongly exploiting muon presence and charge when available.
- Feature insights (QuIPS): Correlation analysis shows p_T and ΔR are generally correlated across particle types, except for the kaon, which carries more independent information. Entropy analysis identifies the most informative features as the total jet charge Q and the kaon-related features (notably its relative transverse momentum and ΔR). Selecting the best 8 features (muon: q, p_T, ΔR; kaon: q, p_T, ΔR; pion q; total jet charge Q) yields performance comparable to using all 16, with less than 1% loss in accuracy. Using the 8 least-informative features instead drops accuracy to about 52% (near random) and tagging power below muon tagging.
- Speed/complexity trade-offs (QIANO): Post-training TTN truncation reduces prediction time substantially with minimal accuracy loss. Example (Model M16, all 16 features): reducing χ from 200 to 5 lowers average prediction time from 345 µs to 37 µs with only ~0.03% accuracy drop. For the QuIPS-reduced model (B8), prediction time is ~19 µs with accuracy ~69% and very few parameters (e.g., χ=16 yields 264 parameters). Training time for B8 is ~4.7× faster and prediction ~5.5× faster than M16 in the reported run.
- Real-time feasibility: The compressed TTN achieves latencies compatible with LHCb high-level trigger timescales, with further prospective speedups (10–100×) via GPU parallelization of tensor contractions.
Discussion
The study demonstrates that a quantum-inspired TTN can match the predictive performance of a strong DNN baseline for b-jet charge classification while providing richer interpretability and flexible post-training optimization. TTN-accessible entanglement entropy and correlation functions elucidate which features carry discriminative information, enabling principled feature pruning (QuIPS) that maintains accuracy while reducing complexity and latency. TTN’s post-training bond-dimension truncation (QIANO) allows adaptive optimization to hardware or latency constraints without retraining, a practical advantage over standard DNN workflows. Differences in confidence calibration indicate that TTN leverages certain physically motivated signals (e.g., muon charge) more decisively, while the DNN distributes confidence more conservatively. Collectively, these properties address the need for explainable, efficient, and real-time-capable classifiers in HEP triggers, improving over classical muon tagging and offering a path to deployment under stringent latency requirements.
Conclusion
The work introduces and validates a quantum-inspired TTN approach for classifying b vs b̄ jets in LHCb data, achieving performance comparable to a tuned DNN and significantly surpassing classical muon tagging. Unique TTN capabilities—measurement of entanglement entropy and correlations—enable explainable feature importance (QuIPS), effective model simplification, and post-training compression (QIANO) that preserves accuracy while dramatically reducing inference time. These advances support feasibility for real-time HEP applications. Future directions include enhancing TTN optimization (e.g., stochastic gradient descent, Riemannian methods), accelerating inference via GPU/FPGA implementations to reach MHz rates, and extending the approach to broader HEP tasks such as discriminating b-, c-, and light-flavor jets, and searches involving heavy-flavor final states.
Limitations
- The dataset is simulation-based and tailored to LHCb conditions; external generalization to other detectors or real data requires validation. The analysis is independent and not reviewed by the LHCb collaboration.
- The TTN optimization used conjugate gradient descent; more advanced optimizers may improve results. Hyperparameter choices (e.g., bond-dimension χ) affect performance and risk overfitting.
- The TTN code is not publicly available; reproducibility for TTN is limited compared to the DNN code (available on request).
- Reported latency measurements are CPU-based; achieving trigger-level latencies in deployment will likely require specialized hardware (e.g., GPUs/FPGAs) and engineering effort.
- The muon tagging baseline has inherently low efficiency; comparisons reflect distinct operational regimes and may not capture all practical constraints in live trigger systems.