Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks

Medicine and Health

S. Sandmann, S. Riepenhausen, et al.

This study by Sarah Sandmann, Sarah Riepenhausen, Lucas Plagwitz, and Julian Varghese evaluates GPT-3.5 and GPT-4 on 110 medical cases for clinical decision support, with Llama 2 assessed in a sub-study. GPT-4 performed best overall, outperforming GPT-3.5 on diagnosis and examination suggestions, while the observed weaknesses point to the need for robust, regulated AI in health care.

Abstract
It is likely that individuals are turning to Large Language Models (LLMs) to seek health advice, much like searching for diagnoses on Google. We evaluate the clinical accuracy of GPT-3.5 and GPT-4 for suggesting initial diagnosis, examination steps and treatment of 110 medical cases across diverse clinical disciplines. Moreover, two model configurations of the Llama 2 open source LLMs are assessed in a sub-study. For benchmarking the diagnostic task, we conduct a naïve Google search for comparison. Overall, GPT-4 performed best, outperforming GPT-3.5 on diagnosis and examination. Better performance on frequent than on rare diseases is evident for all three approaches. The sub-study indicates slightly lower performance for the Llama models. In conclusion, the commercial LLMs show promising potential for medical question answering in two successive major releases. However, some weaknesses underscore the need for robust and regulated AI models in health care. Open source LLMs can be a viable option to address specific needs regarding data privacy and transparency of training.
Publisher
Nature Communications
Published On
Mar 06, 2024
Authors
Sarah Sandmann, Sarah Riepenhausen, Lucas Plagwitz, Julian Varghese
Tags
AI in healthcare
clinical decision support
GPT-3.5
GPT-4
Llama 2
diagnosis
data privacy