Linguistics and Languages

AI generates covertly racist decisions about people based on their dialect

V. Hofmann, P. R. Kalluri, et al.

This groundbreaking research by Valentin Hofmann, Pratyusha Ria Kalluri, Dan Jurafsky, and Sharese King examines the hidden biases present in language models, specifically dialect prejudice against speakers of African American English (AAE). The findings reveal how these models perpetuate negative associations that go beyond previously documented human stereotypes and lead to serious real-world consequences.
Introduction

The study investigates whether language models (LMs) manifest covert racism through dialect prejudice, particularly against speakers of African American English (AAE), even when race is not mentioned. Prior work largely examined overt racism—explicit mentions of racial groups and associated stereotypes—while social science characterizes post–civil rights era racism as more subtle and color-blind. The authors hypothesize that modern LMs encode raciolinguistic stereotypes tied to dialectal cues and that these covert associations influence consequential decisions (e.g., in employment and criminal justice), with effects potentially more negative than those of overt stereotypes about African Americans. The work aims to quantify these covert stereotypes, compare them to historical human stereotypes, assess downstream harms, and evaluate whether current mitigation approaches (scale, human feedback alignment) address such biases.

Literature Review

The paper situates its contribution within evidence of LM biases against racialized groups and a sociolinguistic tradition documenting discrimination against AAE speakers across housing, education, employment, and legal outcomes. Matched-guise studies show listeners infer Black identity from AAE speech and attach racial stereotypes without explicit racial information. Prior AI research emphasized overt stereotype generation (naming groups), while social science highlights a shift from overt racism (e.g., Jim Crow–era behaviors) to covert, color-blind forms. The authors also note that LM pretraining on web corpora likely exposes models to raciolinguistic stereotypes (including "mock ebonics"), and that data filtering and human-feedback (HF) alignment may remove overtly racist content but leave covert prejudices intact. Existing evaluations tend to measure overt bias, leaving covert dialect-based prejudice underexamined.

Methodology

The authors introduce matched guise probing to measure dialect prejudice in LMs without explicitly mentioning race. They construct parallel texts in Standardized American English (SAE) and African American English (AAE) in two settings: (1) meaning-matched, using AAE translations of SAE texts, and (2) non-meaning-matched, to capture broader dialect prejudice beyond strictly controlled content. Texts are embedded in prompts asking models to infer speaker properties (e.g., adjectives, occupations, criminal justice outcomes). They test 12 model variants spanning GPT-2, RoBERTa, T5, GPT-3.5, and GPT-4.

Overt stereotypes are elicited by directly querying stereotypes about African Americans; covert stereotypes are inferred from responses to AAE versus SAE inputs without mentioning race. Stereotype content draws on adjective lists from the Princeton Trilogy, enabling comparison with human stereotype studies across decades.

For employability, they compute association scores indicating whether occupations are more associated with AAE or SAE and analyze correlations with occupational prestige. For criminal justice, they present matched trial scenarios in which defendants provide statements in AAE or SAE and measure model-assigned probabilities or preferences over outcomes (e.g., conviction, death sentence).

Statistical analyses include one-sample t-tests for association scores, permutation-based agreement tests against chance, correlation/regression analyses with occupational prestige, and aggregation across models. Feature-level analyses examine links between specific AAE linguistic features (e.g., invariant be, ain't, zero copula, the -in' suffix) and stereotype strength, and assess the effect of feature density. Robustness checks compare results to other dialects (e.g., Appalachian English) and to noisy text, to rule out general dismissiveness toward nonstandard or degraded input. Additional experiments evaluate the effect of model scale via perplexity on AAE text and the impact of HF alignment on overt versus covert stereotypes.
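To make the probing procedure concrete, here is a minimal sketch of matched guise probing with GPT-2 via Hugging Face transformers. The prompt template, the meaning-matched sentence pair, and the adjective list are illustrative assumptions, not the authors' exact materials; the paper additionally aggregates such scores over many texts, prompts, and models.

```python
# Minimal matched guise probing sketch (illustrative materials, not the
# authors' exact prompts, texts, or adjective lists).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Total log-probability the model assigns to `continuation` after `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, cont_ids], dim=1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(input_ids).logits, dim=-1)
    offset = prompt_ids.shape[1]
    # Logits at position i predict the token at position i + 1.
    return sum(
        log_probs[0, offset + i - 1, cont_ids[0, i]].item()
        for i in range(cont_ids.shape[1])
    )

# Hypothetical meaning-matched pair (AAE vs. SAE) and candidate traits.
aae_text = "he be trippin about that"
sae_text = "he is always getting worked up about that"
adjectives = ["intelligent", "lazy", "calm", "aggressive"]

for adj in adjectives:
    lp_aae = continuation_logprob(f'A person who says "{aae_text}" is', f" {adj}")
    lp_sae = continuation_logprob(f'A person who says "{sae_text}" is', f" {adj}")
    # Positive score: the trait is more strongly tied to the AAE guise.
    print(f"{adj:>11}: association score = {lp_aae - lp_sae:+.3f}")
```

The same log-ratio logic extends to occupations (swap the adjective list for job titles) and to decision prompts (compare probabilities of outcome strings such as verdicts), which is how the employability and criminal justice analyses are framed.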

Key Findings

  • Covert stereotypes: LMs exhibit raciolinguistic stereotypes about AAE speakers that are more negative than the most negative human stereotypes experimentally recorded; conversely, LMs’ overt stereotypes about African Americans are more positive than the most positive human stereotypes reported.
  • Temporal alignment: Overt stereotypes best align with contemporary (e.g., 2021) human stereotypes, while covert stereotypes trend toward agreement with older (1930s) human stereotypes, indicating a historical regression in covert attitudes.
  • Feature linkage: Stereotype strength is directly tied to specific AAE linguistic features, and higher density of such features increases negative associations.
  • Alternative explanations: The effects are not explained by a general bias against dialectal or noisy text; AAE-specific effects remain stronger than comparison conditions (e.g., a reported correlation of r = 0.687, P < 0.001, with stereotype differences larger for AAE in 72.8% of comparisons).
  • Employability: Occupations are overall less associated with AAE (mean association score = -0.046, s.d. = 0.053; one-sample one-sided t-test, t(83) = -7.9, P < 0.001). Jobs least associated with AAE tend to require a university degree (e.g., psychologist, professor, economist); jobs more associated with AAE include cook, soldier, and guard, along with many in music and entertainment (singer, musician, comedian). Occupational prestige is negatively related to AAE association (reported coefficient of -7.8, best read as a regression slope rather than a correlation coefficient; R² = 0.193, F(1,63) = 15.1, P < 0.001); see the statistical sketch after this list.
  • Criminal justice: Across models, AAE statements lead to higher predicted rates of conviction and selection of death penalty compared to SAE statements, despite no explicit mention of race.
  • Scale and alignment: Larger models process AAE more effectively (lower perplexity) but often show stronger covert prejudice, while overt prejudice against African Americans decreases with scale. HF alignment reduces overt stereotypes but does not mitigate covert stereotypes and can exacerbate the gap between overt and covert biases.
  • Harm: Evidence indicates both representational harms (negative portrayals of AAE speakers) and allocational harms (less prestigious job assignments, harsher judicial outcomes).
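As noted in the employability bullet above, the aggregate numbers come from standard tests over per-occupation scores. Below is a hedged reconstruction on synthetic data: the association scores and prestige ratings are randomly generated stand-ins chosen to mirror the reported summary statistics, not the paper's actual measurements.

```python
# Synthetic reconstruction of the reported employability statistics.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-occupation AAE association scores (84 occupations;
# negative = less associated with AAE than with SAE).
scores = rng.normal(loc=-0.046, scale=0.053, size=84)

# One-sample, one-sided t-test: is the mean association score below zero?
t_stat, p_value = stats.ttest_1samp(scores, popmean=0.0, alternative="less")
print(f"t({scores.size - 1}) = {t_stat:.1f}, one-sided P = {p_value:.2g}")

# Hypothetical prestige ratings for a 65-occupation subset, generated so a
# linear regression roughly reproduces the reported slope and R^2.
subset = scores[:65]
prestige = 50.0 - 7.8 * subset + rng.normal(scale=0.85, size=subset.size)
fit = stats.linregress(subset, prestige)
print(f"slope = {fit.slope:.1f}, R^2 = {fit.rvalue ** 2:.3f}, P = {fit.pvalue:.2g}")
```

With these synthetic inputs, the printed t-statistic, slope, and R² land near the reported values, showing how the per-occupation association scores relate to the aggregate statistics.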

Discussion

The findings demonstrate that LMs encode covert racial prejudice tied to dialect, producing more negative inferences about AAE speakers without explicit racial information. This prejudice influences consequential judgments in employment and criminal justice contexts, indicating risks for real-world deployments where models process user-generated text. The divergence between overt and covert stereotypes mirrors societal shifts from overt to color-blind racism: surface-level positivity or neutrality coexists with underlying negative associations. Training practices—data filtering and HF alignment—appear to suppress overtly racist expressions while leaving deeper dialect-linked prejudices intact, thus obscuring ongoing harms. Feature-level analyses suggest that specific AAE grammatical markers trigger these biases, indicating that models internalize raciolinguistic stereotypes from pretraining corpora. Because larger models and HF training decrease overt bias but not covert bias, current evaluation and mitigation pipelines risk declaring progress while covert harms persist or worsen. The work underscores the necessity of explicitly measuring dialect-based prejudice and revising training and alignment strategies to address such covert racism to ensure fairness and safety.

Conclusion

The paper introduces matched guise probing to reveal covert raciolinguistic prejudice in LMs, showing that models associate AAE with more negative traits than any previously recorded human stereotypes and assign lower-status jobs and harsher judicial outcomes to AAE speakers. Overt stereotypes appear improved, especially with scale and HF alignment, but covert biases persist or intensify, creating a widening gap that masks ongoing harms. The authors argue for explicit testing and mitigation of dialect-based prejudice in LM development and evaluation. Future research should: (1) expand probing across languages and dialects; (2) design alignment methods that target covert biases, not only overt content; (3) curate and document training data to reduce raciolinguistic stereotyping; (4) assess downstream harms in applied settings (hiring, legal) and develop safeguards; and (5) explore causal interventions at the feature level to desensitize models to dialectal markers when they are irrelevant to the task.

Limitations

The experimental scenarios for employment and criminal justice are constructed/hypothetical, which may not capture the full complexity of real-world decision-making. Covert bias measurements rely on text-based representations of dialect rather than audio, which may limit ecological validity and the mapping between dialect cues and perceived race. Although both meaning-matched and non-meaning-matched settings are used, residual topic/content confounds cannot be fully excluded. The study focuses on a subset of AAE features and selected prompts/adjective lists, and results may vary with other prompt designs or tasks. Analyses cover specific model families and versions; findings may not generalize uniformly to all LMs or future releases. Reported statistical summaries aggregate across prompts and models, and some reported effect-size labels (e.g., correlation values) may reflect model- or analysis-specific formulations rather than standardized coefficients.
