Reliability and accuracy of artificial intelligence ChatGPT in providing information on ophthalmic diseases and management to patients

Medicine and Health

F. Cappellani, K. R. Card, et al.

This study evaluated the accuracy of ophthalmic information provided by ChatGPT version 3.5 and found both promise and risk. Researchers Francesco Cappellani, Kevin R. Card, Carol L. Shields, Jose S. Pulido, and Julia A. Haller found that 77.5% of responses scored ≥1 when graded against American Academy of Ophthalmology patient guidelines, while the remaining 22.5% scored ≤−1, including 7.5% rated unvalidated and potentially harmful. The results underscore the need for human oversight in medical contexts.

Abstract
PURPOSE: To assess the accuracy of ophthalmic information provided by an artificial intelligence chatbot (ChatGPT). METHODS: Five diseases from each of 8 subspecialties of ophthalmology were assessed using ChatGPT version 3.5. Three questions were asked for each disease: what is x?; how is x diagnosed?; how is x treated? Responses were graded by comparing them to the American Academy of Ophthalmology (AAO) patient guidelines, using a scale from −3 (unvalidated and potentially harmful) to 2 (correct and complete). MAIN OUTCOMES: Accuracy of responses on a scale from −3 to 2. RESULTS: Of 120 questions, 93 (77.5%) scored ≥1 and 27 (22.5%) scored ≤−1; among the latter, 9 (7.5% of all questions) scored −3. The overall median score was 2 for “What is x?”, 1.5 for “How is x diagnosed?”, and 1 for “How is x treated?”, with no significant differences by Kruskal-Wallis testing. CONCLUSIONS: Despite positive scores, ChatGPT alone still provides incomplete, incorrect, and potentially harmful information about common ophthalmic conditions. ChatGPT may be valuable for patient education, but it is not sufficient without close human medical supervision.
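The counts and percentages in the abstract can be checked with a few lines of arithmetic. The sketch below is not the authors' analysis; it only recomputes the reported figures, and it assumes that "five diseases from 8 subspecialties" means five diseases in each subspecialty, which is the reading consistent with 120 questions. All variable names are illustrative.

```python
# Study design as described in the abstract.
SUBSPECIALTIES = 8
DISEASES_PER_SUBSPECIALTY = 5  # assumption: five diseases in each subspecialty
QUESTIONS_PER_DISEASE = 3      # what is x? / how is x diagnosed? / how is x treated?

total_questions = SUBSPECIALTIES * DISEASES_PER_SUBSPECIALTY * QUESTIONS_PER_DISEASE

# Counts reported in the RESULTS section of the abstract.
reported_counts = {
    "scored >= 1": 93,
    "scored <= -1": 27,
    "scored -3": 9,
}


def percentage(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {
    label: percentage(count, total_questions)
    for label, count in reported_counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {reported_counts[label]}/{total_questions} = {pct}%")
```

Because the two top-level groups (93 and 27) sum to 120, every graded response fell into one of them, and the 9 responses scoring −3 are a subset of the 27 that scored ≤−1.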
Publisher
Eye
Authors
Francesco Cappellani, Kevin R. Card, Carol L. Shields, Jose S. Pulido, Julia A. Haller
Tags
Ophthalmology
ChatGPT
Medical Accuracy
AI Evaluation
Human Supervision
Medical Information
Subspecialties