Detecting hallucinations in large language models using semantic entropy

Computer Science

S. Farquhar, J. Kossen, et al.

Discover how researchers Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn and Yarin Gal tackle the unreliability of large language models with an entropy-based method. The approach lets users detect confabulations (arbitrary, incorrect outputs) without prior knowledge of the task, paving the way for safer applications in fields such as law, journalism and medicine.

~3 min • Beginner • English
Abstract
Large language model (LLM) systems, such as ChatGPT or Gemini, can show impressive reasoning and question-answering capabilities but often ‘hallucinate’ false outputs and unsubstantiated answers. Answering unreliably or without the necessary information prevents adoption in diverse fields, with problems including fabrication of legal precedents or untrue facts in news articles and even posing a risk to human life in medical domains such as radiology. Encouraging truthfulness through supervision or reinforcement has been only partially successful. Researchers need a general method for detecting hallucinations in LLMs that works even with new and unseen questions to which humans might not know the answer. Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations—confabulations—which are arbitrary and incorrect generations. Our method addresses the fact that one idea can be expressed in many ways by computing uncertainty at the level of meaning rather than specific sequences of words. Our method works across datasets and tasks without a priori knowledge of the task, requires no task-specific data and robustly generalizes to new tasks not seen before. By detecting when a prompt is likely to produce a confabulation, our method helps users understand when they must take extra care with LLMs and opens up new possibilities for using LLMs that are otherwise prevented by their unreliability.
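The abstract's key move is to measure uncertainty over meanings rather than over word sequences: many differently worded answers that say the same thing should count as a single outcome. Below is a minimal Python sketch of that idea, not the authors' implementation; it assumes the answers have already been sampled from the model for one prompt, and the meaning-equivalence check (`means_the_same`) is a hypothetical stand-in for something like a bidirectional-entailment judgement.

```python
# Minimal sketch of entropy over meanings (not the authors' code).
# Assumptions beyond the abstract:
#   - answers have already been sampled from the LLM for one prompt;
#   - `means_the_same` is a caller-supplied equivalence check, e.g. a
#     bidirectional-entailment judgement from an NLI model.
from math import log
from typing import Callable, List


def semantic_entropy(answers: List[str],
                     means_the_same: Callable[[str, str], bool]) -> float:
    """Cluster sampled answers by meaning, then return the entropy
    (in nats) of the cluster distribution; higher values suggest the
    prompt is more likely to yield a confabulation."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if means_the_same(answer, cluster[0]):
                cluster.append(answer)   # same meaning, same cluster
                break
        else:
            clusters.append([answer])    # new meaning, new cluster

    total = len(answers)
    probs = [len(c) / total for c in clusters]
    return -sum(p * log(p) for p in probs)


# Toy usage: three paraphrases plus one contradiction give non-zero
# entropy; four answers with a single meaning would give exactly zero.
if __name__ == "__main__":
    toy = ["Paris", "It is Paris.", "The capital is Paris", "Lyon"]
    same = lambda a, b: ("Paris" in a) == ("Paris" in b)  # stand-in check
    print(round(semantic_entropy(toy, same), 3))          # ~0.562
```

The design choice that matters is where the entropy is taken: over clusters of equivalent meanings, so mere rephrasings of the same answer do not inflate the uncertainty estimate.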
Publisher
Nature
Published On
Jun 20, 2024
Authors
Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn, Yarin Gal
Tags
large language models
confabulations
entropy-based uncertainty
reliability
semantic meaning
hallucinations
user assessment