Efficient Detection of Stigmatizing Language in Electronic Health Records via In-Context Learning: A Comparative Analysis and Validation Study

Medicine and Health

H. Chen, M. Alfred, et al.

This study by Hongbo Chen, Myrtede Alfred, and Eldan Cohen examines the effectiveness of in-context learning (ICL) for identifying stigmatizing language in Electronic Health Records. ICL outperformed widely used zero-shot and few-shot baselines while requiring far less labeled data, highlighting its potential for reducing bias in healthcare documentation.

~3 min • Beginner • English
Abstract
Background: The presence of stigmatizing language within Electronic Health Records (EHRs) poses risks to patient care by perpetuating biases, disrupting therapeutic relationships, and diminishing treatment adherence. Prior work has largely relied on supervised machine learning, which requires resource-intensive annotated datasets. In-context learning (ICL) enables large language models (LLMs) to adapt to tasks based on instructions and examples, reducing dependence on labeled data.

Objective: To investigate the efficacy of ICL for detecting stigmatizing language in EHRs under data-scarce conditions.

Methods: We analyzed 5,043 EHR sentences from MIMIC-IV emergency department discharge summaries. ICL was compared against zero-shot (textual entailment) and few-shot (SetFit) approaches, and a fully supervised fine-tuning approach. Four prompting strategies were tested for ICL: Generic, Chain of Thought (COT), Clue and Reasoning Prompting (CARP), and a novel Stigma Detection Guided Prompt. Fairness was evaluated as equality of performance across sex, age, and race via true positive rate (TPR), false positive rate (FPR), and F1 disparities.

Results: In the zero-shot setting, the best ICL model (GEMMA-2 with the Stigma Detection Guided Prompt) achieved F1=0.858 (95% CI [0.854, 0.862]), outperforming the best textual entailment model (DEBERTA-M, F1=0.723, 95% CI [0.718, 0.728]) (P<.001). In the few-shot setting, the best ICL model (LLAMA-3 with the same prompt) exceeded SetFit by 21.2%, 21.4%, and 12.3% in F1 with 4, 8, and 16 annotations per class, respectively (all P<.001). With only 32 labeled instances, the best ICL model reached F1=0.901 (95% CI [0.895, 0.907]), close to a supervised RoBERTa model (F1=0.931, 95% CI [0.924, 0.938]) trained on 3,543 labeled instances. Supervised models showed larger fairness disparities (e.g., highest TPR disparities up to 0.051 by sex, 0.108 by age, and 0.064 by race) than ICL, which remained below 0.016 across subgroups.

Conclusions: ICL effectively detects stigmatizing language, outperforming popular zero- and few-shot baselines and approaching fully supervised performance with orders of magnitude fewer labels. The new Stigma Detection Guided Prompt further enhances ICL detection. ICL provides a data-efficient and more equitable alternative for EHR stigma detection.
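To make the few-shot setup concrete, the sketch below shows the general shape of an ICL classification loop: labeled examples are placed directly in the prompt and the model's completion is parsed as the label. The prompt wording, the example sentences, and the `build_prompt`, `classify`, and `llm` helpers are illustrative assumptions, not the paper's actual Stigma Detection Guided Prompt or code.

```python
from typing import Callable, List, Tuple

# Illustrative sketch of few-shot in-context learning (ICL) for labeling EHR
# sentences. The instruction text, the two example sentences, and the `llm`
# callable are assumptions for illustration; they do not reproduce the paper's
# Stigma Detection Guided Prompt or its annotated MIMIC-IV data.

FewShotExample = Tuple[str, str]  # (sentence, label)

EXAMPLES: List[FewShotExample] = [
    ("Patient claims the medication was taken as prescribed.", "stigmatizing"),
    ("Patient reports taking the medication as prescribed.", "non-stigmatizing"),
]

def build_prompt(sentence: str, examples: List[FewShotExample] = EXAMPLES) -> str:
    """Assemble instruction + labeled examples + the target sentence into one prompt."""
    parts = [
        "Label each sentence from an emergency department discharge summary as "
        "'stigmatizing' or 'non-stigmatizing'. Stigmatizing language casts doubt "
        "on the patient, assigns blame, or uses judgmental descriptors.",
        "",
    ]
    for text, label in examples:
        parts.append(f"Sentence: {text}\nLabel: {label}\n")
    parts.append(f"Sentence: {sentence}\nLabel:")
    return "\n".join(parts)

def classify(sentence: str, llm: Callable[[str], str]) -> str:
    """Query an instruction-tuned LLM (e.g., LLAMA-3 or GEMMA-2) and parse the label.

    `llm` is any callable mapping a prompt string to the model's text output;
    plug in whichever inference client is available. No model weights are
    updated, which is why only a handful of labeled examples are needed.
    """
    answer = llm(build_prompt(sentence)).strip().lower()
    return "non-stigmatizing" if answer.startswith("non") else "stigmatizing"

# Example with a dummy model stand-in:
# classify("Patient insists that he needs more pain medication.",
#          llm=lambda prompt: "stigmatizing")
```

Because the labeled examples live only in the prompt, changing the few-shot budget (4, 8, 16, or 32 instances per the study) or the prompting strategy requires no retraining, in contrast to the supervised RoBERTa baseline trained on 3,543 labeled sentences.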
Publisher
JMIR Medical Informatics
Published On
Nov 20, 2024
Authors
Hongbo Chen, Myrtede Alfred, Eldan Cohen
Tags
in-context learning
stigmatizing language
Electronic Health Records
bias evaluation
data-scarce conditions