Hallucination Detection in LLMs Using Spectral Features of Attention Maps

Computer Science


J. Binkowski, D. Janiak, et al.

Discover LapEigvals — a new spectral approach that treats attention maps as graph adjacency matrices and uses top-k Laplacian eigenvalues to detect LLM hallucinations. Experiments show state-of-the-art performance among attention-based methods, with strong robustness and generalization.
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks but remain prone to hallucinations. Detecting hallucinations is essential for safety-critical applications, and recent methods leverage attention map properties to this end, though their effectiveness remains limited. In this work, we investigate the spectral features of attention maps by interpreting them as adjacency matrices of graph structures. We propose the LapEigvals method, which utilizes the top-k eigenvalues of the Laplacian matrix derived from the attention maps as input to hallucination detection probes. Empirical evaluations demonstrate that our approach achieves state-of-the-art hallucination detection performance among attention-based methods. Extensive ablation studies further highlight the robustness and generalization of LapEigvals, paving the way for future advancements in the hallucination detection domain.

Recent studies have shown that hallucinations can be detected using internal states of the model, e.g., hidden states (Chen et al., 2024) or attention maps (Chuang et al., 2024a), and that LLMs can internally "know when they do not know" (Azaria and Mitchell, 2023; Orgad et al., 2025). We show that spectral features of attention maps coincide with hallucinations and, building on this observation, propose a novel method for their detection. As highlighted by Barbero et al. (2024), attention maps can be viewed as weighted adjacency matrices of graphs. Building on this perspective, we performed a statistical analysis and demonstrated that the eigenvalues of a Laplacian matrix derived from attention maps serve as good predictors of hallucinations. We propose the LapEigvals method, which utilizes the top-k eigenvalues of the Laplacian as input features of a probing model to detect hallucinations. We share the full implementation in a public repository: https://github.com/graphml-lab-pwr/lapeigvals.

We summarize our contributions as follows:

1. We perform a statistical analysis of the Laplacian matrix derived from attention maps and show that it can serve as a better predictor of hallucinations than the previous method relying on the log-determinant of the maps.
2. Building on that analysis and on advancements in the graph-processing domain, we propose leveraging the top-k eigenvalues of the Laplacian matrix as features for hallucination detection probes and empirically show that this achieves state-of-the-art performance among attention-based approaches.
3. Through extensive ablation studies, we demonstrate the properties, robustness, and generalization of LapEigvals and suggest promising directions for further development.
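To make the core idea concrete, here is a minimal sketch of the feature-extraction step: an attention map is treated as a weighted adjacency matrix, a graph Laplacian is formed from it, and the top-k eigenvalues become the probe's input features. The symmetrization step, the combinatorial Laplacian L = D − A, and the function name `lapeigvals_features` are illustrative assumptions, not necessarily the paper's exact construction; see the linked repository for the authors' implementation.

```python
import numpy as np

def lapeigvals_features(attn: np.ndarray, k: int = 10) -> np.ndarray:
    """Top-k Laplacian eigenvalues of an attention map (illustrative sketch).

    `attn` is an (n, n) attention map interpreted as a weighted
    adjacency matrix. Attention maps are not symmetric, so we
    symmetrize before building the Laplacian (an assumption here).
    """
    a = (attn + attn.T) / 2.0          # symmetrize (assumption)
    d = np.diag(a.sum(axis=1))         # degree matrix D
    lap = d - a                        # combinatorial Laplacian L = D - A
    eig = np.linalg.eigvalsh(lap)      # eigenvalues in ascending order
    return eig[-k:][::-1]              # top-k eigenvalues, largest first

# Toy usage with a random row-stochastic "attention" matrix.
rng = np.random.default_rng(0)
raw = rng.random((8, 8))
attn = raw / raw.sum(axis=1, keepdims=True)
feats = lapeigvals_features(attn, k=4)
```

In practice, features like these would be collected per layer and head and fed to a simple probing classifier (e.g., logistic regression) trained on labeled hallucinated vs. faithful generations.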
Publisher
Published On
Oct 18, 2025
Authors
Jakub Binkowski, Denis Janiak, Albert Sawczyn, Bogdan Gabrys, Tomasz Kajdanowicz
Tags
Laplacian eigenvalues
attention maps
hallucination detection
spectral features
graph-based analysis
probing models
LLM safety