Inside the Black Box: Detecting and Mitigating Algorithmic Bias across Racialized Groups in College Student-Success Prediction
H. Anahideh, M. P. Ison, et al.
The expansion of predictive analytics in education has raised concerns about perpetuating social disparities when models reflect historical injustices embedded in data. Predictive models mapping student attributes to outcomes can unfairly predict less favorable outcomes for racially minoritized students, reflecting racism, sexism, and classism in society. This study focuses on disparities in college-student success predictions across racial/ethnic groups, given persistent inequities in attainment. Research questions: (1) To what extent are college student-success predictions biased across racial/ethnic groups? (2) How effective are computational strategies in mitigating racial/ethnic bias? The study situates analyses within historical and social contexts, recognizing systemic disadvantages for racially minoritized students that shape data distributions and, consequently, model predictions.
Prior work in education and learning analytics highlights fairness concerns across the ML pipeline, emphasizing representation of socially relevant groups and development of fairness metrics and mitigation strategies. Studies using institutional data often detect biases advantaging White students and disadvantaging Hispanic/Latinx and male students. Gardner et al. (2019) found fairness varied by algorithm, features, course, and gender imbalance, with no clear fairness–accuracy tradeoff. Yu et al. (2020) showed institutional data are more biased against disadvantaged groups than LMS or survey data. Most prior work focuses on course-level outcomes, single institutions, or non-representative data. This study extends the literature by using a nationally representative dataset, modeling attainment (a broader student-success outcome more relevant to admissions and interventions), testing multiple fairness notions, and evaluating both preprocessing and in-processing bias-mitigation techniques, including subgroup and aggregate analyses.
Data: Education Longitudinal Study of 2002 (ELS:2002), a nationally representative cohort of students who were 10th graders in 2002. The sample is restricted to those who attended four-year postsecondary institutions (n=15,244 after listwise deletion).

Outcome: binary indicator of bachelor’s degree or higher by the third follow-up (eight years after expected high school graduation), with label=1 for bachelor’s+ and label=0 otherwise.

Predictors: 29 commonly used student-success features spanning demographics, socioeconomic status, academic performance, college preparation, and school experiences (see Appendix A in the paper). Categorical variables are one-hot encoded per NCES documentation.

Missing data: Models were run with and without multiple imputation (Rubin, 1996); main results are presented without imputation. Sensitive attributes (race/ethnicity) and the outcome were never imputed; observations missing these were dropped prior to imputation. Imputation had minimal effect except for SVM, where it reduced unfairness variance.

Evaluation design: Data were split into training/testing sets (80/20) with stratification by outcome class and racial/ethnic category to preserve distributions; results were averaged over 30 random splits. Hyperparameter tuning used five-fold cross-validation on the training set, assessing grids of feasible hyperparameters for each model and split.

Models: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM), reflecting common use in higher education analytics.

Fairness notions: Four group fairness metrics were evaluated—Statistical Parity, Equal Opportunity (equal false negative rates), Predictive Equality (equal false positive rates), and Equalized Odds (parity in both false positive and true positive rates).
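The four fairness notions can be read as gap computations between one racial/ethnic group and all other students. A minimal sketch in Python, assuming numpy arrays of binary outcomes (`y_true`), binary predictions (`y_pred`), and group codes (`group`); all function and variable names here are illustrative, not taken from the paper’s code:

```python
import numpy as np

def rates(y_true, y_pred, mask):
    """Selection rate, false positive rate, and false negative rate on a
    subset of students (assumes both outcome labels occur in the subset)."""
    sel = y_pred[mask].mean()                       # P(yhat=1 | subset)
    fpr = y_pred[mask & (y_true == 0)].mean()       # P(yhat=1 | y=0, subset)
    fnr = 1.0 - y_pred[mask & (y_true == 1)].mean() # P(yhat=0 | y=1, subset)
    return sel, fpr, fnr

def fairness_gaps(y_true, y_pred, group, g):
    """Gaps between group g and all other students under the four notions."""
    in_g = group == g
    sel_g, fpr_g, fnr_g = rates(y_true, y_pred, in_g)
    sel_o, fpr_o, fnr_o = rates(y_true, y_pred, ~in_g)
    return {
        "statistical_parity": sel_g - sel_o,   # selection-rate gap
        "predictive_equality": fpr_g - fpr_o,  # FPR gap (among y=0)
        "equal_opportunity": fnr_g - fnr_o,    # FNR gap (among y=1)
        # Equalized Odds requires parity in both error rates; summarized
        # here as the larger of the two conditional gaps.
        "equalized_odds": max(abs(fpr_g - fpr_o), abs(fnr_g - fnr_o)),
    }
```

Statistical Parity compares selection rates unconditionally, while the other notions condition on the true outcome, which is why the paper reports them separately for completers (y=1) and non-completers (y=0).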
Bias mitigation: Preprocessing—(1) Reweighting (Kamiran & Calders, 2012) to upweight underrepresented successful cases in unprivileged groups; (2) Disparate Impact Remover (Feldman et al., 2015) to reduce correlations between features and group membership. In-processing—(3) Exponentiated Gradient Reduction (Agarwal et al., 2018), enforcing fairness constraints during training; (4) Meta Fair Classifier (Celis et al., 2018), optimizing under specified fairness constraints. Post-processing was excluded due to typically inferior performance and controversy in educational settings.

Comparisons: Fairness was evaluated at (a) the subgroup level (each racial/ethnic group vs. all others) and (b) the aggregate level with a binarized sensitive attribute: privileged (Asian, White) vs. unprivileged (Black, Hispanic, Two or More Races). This design assesses both the impact of mitigation methods that require binary sensitive attributes and the risk of masking subgroup differences through aggregation.
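The Reweighting step admits a compact sketch: in the Kamiran and Calders (2012) formulation, each (group, label) cell receives weight P(G=g)·P(Y=y) / P(G=g, Y=y), so cells that are rarer than independence would predict (e.g., successful cases in unprivileged groups) are upweighted before training. Illustrative code, not the paper’s implementation:

```python
import numpy as np

def reweighting_weights(y, group):
    """Kamiran & Calders (2012) style instance weights: weight each
    (group, label) cell so that group membership and the outcome look
    statistically independent in the reweighted training data."""
    w = np.empty(len(y), dtype=float)
    for g in np.unique(group):
        for lab in np.unique(y):
            cell = (group == g) & (y == lab)
            p_expected = (group == g).mean() * (y == lab).mean()
            p_observed = cell.mean()
            # Weight > 1 upweights cells that are underrepresented relative
            # to independence; weight < 1 downweights overrepresented cells.
            w[cell] = p_expected / p_observed
    return w
```

The resulting weights would then be passed to any learner that accepts per-instance weights (e.g., a `sample_weight` argument), leaving the features themselves unchanged.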
Model performance: Average test accuracy across baseline models (no mitigation) was approximately 78%, with no significant accuracy differences among DT, RF, LR, and SVM.

Unfairness at subgroup level: Across fairness notions, Black and Hispanic students experienced greater unfairness. SVM tended to yield comparatively less unfair results than other models.

Illustrative baseline disparities: In one test split (RF), the predicted probability of attainment (Statistical Parity context) was about 91% for Asian and White students vs. 63% for Black and 68% for Hispanic students. For Predictive Equality among those who did not complete a degree (y=0), the model’s positive prediction rates were ~83% (Asian), 78% (White), 33% (Hispanic), and 0% (Black). For Equal Opportunity among those who completed a degree (y=1), the model’s false negative rates implied predicted failure probabilities of ~20% (Hispanic), 8% (Black), 5.5% (Asian), and 4.6% (White). Variation in unfairness metrics was lowest for the White and Asian groups and substantially larger for minoritized groups, indicating sensitivity to train/test splits due to underrepresentation.

Unfairness at aggregate level: Aggregating to privileged vs. unprivileged revealed higher false negative rates for unprivileged groups but masked subgroup differences visible in the disaggregated analysis.

Mitigation effectiveness (RF focus; similar patterns for other models): Accuracy changed minimally (−1% to −2%) for preprocessing and ExGR, whereas the Meta Fair Classifier (MetaC) notably increased accuracy by about 10–17 points over baseline. However, mitigation techniques were generally ineffective at reducing bias at the aggregate level. At the subgroup level, no single technique improved fairness for all racial/ethnic subgroups simultaneously; gains for one subgroup often coincided with harms to another.
- Reweighting (ReW): Did not effectively reduce bias for unprivileged groups, suggesting underrepresentation of successful cases is not the primary bias source.
- Disparate Impact Remover (DIR): Reduced unfairness for Black students and diminished the advantage for Asian students, but worsened unfairness for Hispanic students and, in some notions (Statistical Parity and Equal Opportunity), exacerbated the White group’s advantage. DIR could not achieve statistical parity across subgroups.
- Exponentiated Gradient Reduction (ExGR): Produced little change in privilege/unfairness patterns and increased variability across splits.
- Meta Fair Classifier (MetaC): Reduced all four bias types for Hispanic students but increased unfairness for Black students (e.g., in Statistical Parity and Equalized Odds), again emphasizing the need for subgroup-specific evaluation.

Overall: Common preprocessing and in-processing techniques did not substantially mitigate demographic bias; only MetaC showed partial subgroup-specific improvements, alongside tradeoffs.
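The aggregate-level masking reported in the results can be illustrated numerically: with made-up selection rates, pooling two unprivileged subgroups whose disparities point in opposite directions makes the binarized gap look small even though one subgroup faces a large gap. All numbers below are hypothetical, chosen only to show the arithmetic:

```python
# Hypothetical selection rates (probability of being predicted to attain a
# bachelor's degree) and subgroup sizes -- made up for illustration only.
subgroup_rate = {"White": 0.80, "Asian": 0.80, "Black": 0.55, "Hispanic": 0.95}
subgroup_n = {"White": 600, "Asian": 200, "Black": 100, "Hispanic": 100}

def pooled_rate(groups):
    """Selection rate after pooling several subgroups into one bucket."""
    n = sum(subgroup_n[g] for g in groups)
    return sum(subgroup_rate[g] * subgroup_n[g] for g in groups) / n

# Binarized comparison: privileged (White, Asian) vs. unprivileged (Black, Hispanic).
gap_aggregate = pooled_rate(["White", "Asian"]) - pooled_rate(["Black", "Hispanic"])
# Disaggregated comparison against the Black subgroup alone.
gap_black = pooled_rate(["White", "Asian"]) - subgroup_rate["Black"]
# gap_aggregate ≈ 0.05, while gap_black ≈ 0.25: the binarized view
# understates the disparity faced by Black students fivefold.
```

This is the structural reason the paper insists on disaggregated evaluation: the pooled gap averages away opposite-signed subgroup disparities.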
The study demonstrates that widely used student-success prediction models can reproduce and legitimize racial inequities: they tend to predict success more often for White and Asian students and are more likely to falsely predict failure for successful Black and Hispanic students. These findings directly answer the research questions by documenting substantial biases across multiple fairness notions and showing that leading mitigation techniques yield limited and uneven improvements, particularly when evaluated at the subgroup level. The work underscores practical implications for admissions, course recommendations, and intervention allocation, where biased predictions may restrict opportunities for racially minoritized students. It also highlights the importance of methodological choices—model selection, fairness metric, and mitigation approach—which carry significant consequences for equity. Aggregation into privileged vs. unprivileged can obscure critical subgroup differences; therefore, fairness evaluation and mitigation must be disaggregated to avoid masking harms. The limited effectiveness of current techniques suggests that biases in educational data are rooted in deep systemic inequities rather than mere statistical underrepresentation, calling for more sophisticated approaches and careful, context-aware deployment of predictive models in higher education.
This paper contributes a comprehensive audit of algorithmic fairness in predicting bachelor’s degree attainment using a nationally representative dataset, multiple ML models, and four fairness notions, alongside evaluations of preprocessing and in-processing mitigation methods. Key contributions include evidence of persistent unfairness against Black and Hispanic students, demonstration that subgroup analyses reveal inequities masked by aggregate comparisons, and findings that common mitigation techniques provide limited and uneven benefits. Future research should (1) develop and test more effective mitigation strategies, including combined pre- and in-processing approaches; (2) systematically examine the role of training/testing splits and sample representation; (3) study how feature selection and modeling choices affect fairness; and (4) further disaggregate racial/ethnic categories beyond binary privileged/unprivileged groupings to better reflect students’ diverse experiences and outcomes.
- Sensitive attribute binarization required by some mitigation methods may mask subgroup-specific harms; results show aggregation can conceal important differences.
- The ELS race/ethnicity categories are limited, and the “two or more races” category showed high variability, suggesting caution in interpretation.
- Models showed higher variance in fairness metrics for underrepresented groups, indicating sensitivity to train/test splits and limited robustness for minoritized subpopulations.
- Main results rely on listwise deletion; although multiple imputation was examined, imputation choices can influence both fairness and performance (with SVM showing reduced unfairness variance under imputation).
- Post-processing mitigation techniques were excluded; while this aligns with practical and ethical concerns in education, it limits the scope of evaluated methods.
- The feature set, while commonly used (p=29), may omit relevant variables; unobserved systemic and historical factors likely drive residual bias.
- The sample is restricted to students attending four-year institutions, which may limit generalizability to other postsecondary contexts.