The limits of fair medical imaging AI in real-world generalization

Medicine and Health

Y. Yang, H. Zhang, et al.

This study examines the fairness of medical AI for disease classification across several imaging modalities and shows how demographic shortcuts lead to biased predictions. Conducted by Yuzhe Yang, Haoran Zhang, Judy W. Gichoya, Dina Katabi, and Marzyeh Ghassemi, the research finds that models encoding less demographic information tend to be fairer when evaluated in new clinical settings, and it proposes best practices for selecting models for equitable AI applications.

Abstract
As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate healthcare disparities. Prior work showed AI can infer demographics from chest X-rays, prompting concern that models may use demographic shortcuts and behave unfairly across subpopulations. This study investigates the extent to which medical AI encodes demographic information and how that affects fairness in both in-distribution (ID) and out-of-distribution (OOD) settings. We examine radiology, dermatology and ophthalmology, including six global chest X-ray datasets. We confirm that AI leverages demographic shortcuts in disease classification. Algorithmic mitigation can create ‘locally optimal’ models that reduce ID fairness gaps, but such optimality often fails under OOD shifts. Models with less demographic encoding tend to be more ‘globally optimal’, exhibiting better fairness when evaluated in new environments. We propose best practices for selecting models that maintain performance and fairness beyond initial training contexts, highlighting critical considerations for AI clinical deployments across populations and sites.
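To make the notion of a fairness gap concrete, here is a minimal sketch. It assumes one common definition, the difference in false positive rate between the best-off and worst-off demographic subgroup, which is not necessarily the exact metric used in the study. The function names and the toy labels are hypothetical and serve only to illustrate how a model can look fair on in-distribution (ID) data and still show a gap on out-of-distribution (OOD) data.

```python
from collections import defaultdict


def false_positive_rate_by_group(y_true, y_pred, groups):
    """Return {group: FPR}, computed over the truly negative cases in each group."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:
            negatives[group] += 1
            if pred == 1:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


def fairness_gap(y_true, y_pred, groups):
    """Largest minus smallest per-group FPR (an assumed, illustrative gap metric)."""
    rates = false_positive_rate_by_group(y_true, y_pred, groups)
    if not rates:
        raise ValueError("no negative cases, so the FPR is undefined")
    return max(rates.values()) - min(rates.values())


# Toy data (invented for illustration): same labels and groups, different predictions.
labels = [0, 0, 0, 0, 1, 1]
groups = ["A", "A", "B", "B", "A", "B"]
id_predictions = [0, 0, 0, 0, 1, 1]   # no false positives in either group
ood_predictions = [1, 0, 0, 0, 1, 0]  # one false positive, in group A only

id_gap = fairness_gap(labels, id_predictions, groups)
ood_gap = fairness_gap(labels, ood_predictions, groups)
print(f"ID gap: {id_gap:.2f}, OOD gap: {ood_gap:.2f}")
```

In this toy case the ID gap is zero while the OOD gap is not, which mirrors the abstract's point that a model selected for a small ID gap is not guaranteed to stay fair after a distribution shift.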
Publisher
Nature Medicine
Published On
Oct 14, 2024
Authors
Yuzhe Yang, Haoran Zhang, Judy W. Gichoya, Dina Katabi, Marzyeh Ghassemi
Tags
medical AI
fairness
disease classification
demographic shortcuts
imaging modalities
algorithmic correction
clinical deployments