Abstract
This paper investigates covert racism in language models, focusing on dialect prejudice against African American English (AAE). The authors demonstrate that language models exhibit more negative associations with AAE than any experimentally recorded human stereotypes about African Americans, even as they express more positive overt stereotypes about the group. This dialect prejudice has harmful consequences: in hypothetical scenarios, the models assign less prestigious jobs to AAE speakers and are more likely to recommend the death penalty for them. The study also reveals that current bias mitigation techniques, such as alignment via human feedback, exacerbate this discrepancy by masking the underlying covert bias. The findings have significant implications for the fair and safe use of language technology.
Publisher
Nature
Published On
Sep 05, 2024
Authors
Valentin Hofmann, Pratyusha Ria Kalluri, Dan Jurafsky, Sharese King
Tags
covert racism
language models
dialect prejudice
African American English
bias mitigation