Inside the Black Box: Detecting and Mitigating Algorithmic Bias across Racialized Groups in College Student-Success Prediction
H. Anahideh, M. P. Ison, et al.
The expansion of predictive analytics in education has raised concerns about perpetuating social disparities when models reflect historical injustices embedded in data. Predictive models mapping student attributes to outcomes can unfairly predict less favorable outcomes for racially minoritized students, reflecting racism, sexism, and classism in society. This study focuses on disparities in college-student success predictions across racial/ethnic groups, given persistent inequities in attainment. Research questions: (1) To what extent are college student-success predictions biased across racial/ethnic groups? (2) How effective are computational strategies in mitigating racial/ethnic bias? The study situates analyses within historical and social contexts, recognizing systemic disadvantages for racially minoritized students that shape data distributions and, consequently, model predictions.
Prior work in education and learning analytics highlights fairness concerns across the ML pipeline, emphasizing representation of socially relevant groups and development of fairness metrics and mitigation strategies. Studies using institutional data often detect biases advantaging White students and disadvantaging Hispanic/Latinx and male students. Gardner et al. (2019) found fairness varied by algorithm, features, course, and gender imbalance, with no clear fairness–accuracy tradeoff. Yu et al. (2020) showed institutional data are more biased against disadvantaged groups than LMS or survey data. Most prior work focuses on course-level outcomes, single institutions, or non-representative data. This study extends the literature by using a nationally representative dataset, modeling attainment (a broader student-success outcome more relevant to admissions and interventions), testing multiple fairness notions, and evaluating both preprocessing and in-processing bias-mitigation techniques, including subgroup and aggregate analyses.
Data: Education Longitudinal Study of 2002 (ELS:2002), a nationally representative cohort of students who were 10th graders in 2002. The sample is restricted to those who attended four-year postsecondary institutions (n=15,244 after listwise deletion).

Outcome: binary indicator of bachelor’s degree or higher by the third follow-up (eight years after expected high school graduation), with label=1 for bachelor’s+ and label=0 otherwise.

Predictors: 29 commonly used student-success features spanning demographics, socioeconomic status, academic performance, college preparation, and school experiences (see Appendix A in the paper). Categorical variables are one-hot encoded per NCES documentation.

Missing data: Models were run with and without multiple imputation (Rubin, 1996); main results are presented without imputation. Sensitive attributes (race/ethnicity) and the outcome were never imputed; observations missing these were dropped prior to imputation. Imputation had minimal effect except for SVM, where it reduced unfairness variance.

Evaluation design: Data were split into training/testing sets (80/20) with stratification by outcome class and racial/ethnic category to preserve distributions; results were averaged over 30 random splits. Hyperparameter tuning used five-fold cross-validation on the training set, assessing grids of feasible hyperparameters for each model and split.

Models: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM), reflecting common use in higher education analytics.

Fairness notions: Four group fairness metrics were evaluated—Statistical Parity, Equal Opportunity (equal false negative rates), Predictive Equality (equal false positive rates), and Equalized Odds (parity in both false positive and true positive rates).
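The four fairness notions can be read as gap computations between one racial/ethnic group and all other students. A minimal sketch in Python, assuming numpy arrays of binary outcomes (`y_true`), binary predictions (`y_pred`), and group codes (`group`); all function and variable names here are illustrative, not taken from the paper’s code:

```python
import numpy as np

def rates(y_true, y_pred, mask):
    """Selection rate, false positive rate, and false negative rate on a
    subset of students (assumes both outcome labels occur in the subset)."""
    sel = y_pred[mask].mean()                       # P(yhat=1 | subset)
    fpr = y_pred[mask & (y_true == 0)].mean()       # P(yhat=1 | y=0, subset)
    fnr = 1.0 - y_pred[mask & (y_true == 1)].mean() # P(yhat=0 | y=1, subset)
    return sel, fpr, fnr

def fairness_gaps(y_true, y_pred, group, g):
    """Gaps between group g and all other students under the four notions."""
    in_g = group == g
    sel_g, fpr_g, fnr_g = rates(y_true, y_pred, in_g)
    sel_o, fpr_o, fnr_o = rates(y_true, y_pred, ~in_g)
    return {
        "statistical_parity": sel_g - sel_o,   # selection-rate gap
        "predictive_equality": fpr_g - fpr_o,  # FPR gap (among y=0)
        "equal_opportunity": fnr_g - fnr_o,    # FNR gap (among y=1)
        # Equalized Odds requires parity in both error rates; summarized
        # here as the larger of the two conditional gaps.
        "equalized_odds": max(abs(fpr_g - fpr_o), abs(fnr_g - fnr_o)),
    }
```

Statistical Parity compares selection rates unconditionally, while the other notions condition on the true outcome, which is why the paper reports them separately for completers (y=1) and non-completers (y=0).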
Bias mitigation: Preprocessing—(1) Reweighting (Kamiran & Calders, 2012) to upweight underrepresented successful cases in unprivileged groups; (2) Disparate Impact Remover (Feldman et al., 2015) to reduce correlations between features and group membership. In-processing—(3) Exponentiated Gradient Reduction (Agarwal et al., 2018), enforcing fairness constraints during training; (4) Meta Fair Classifier (Celis et al., 2018), optimizing under specified fairness constraints. Post-processing was excluded due to typically inferior performance and controversy in educational settings.

Comparisons: Fairness was evaluated at (a) the subgroup level (each racial/ethnic group vs. all others) and (b) the aggregate level with a binarized sensitive attribute: privileged (Asian, White) vs. unprivileged (Black, Hispanic, Two or More Races). This design assesses both the impact of mitigation methods that require binary sensitive attributes and the risk of masking subgroup differences through aggregation.
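The Reweighting step admits a compact sketch: in the Kamiran and Calders (2012) formulation, each (group, label) cell receives weight P(G=g)·P(Y=y) / P(G=g, Y=y), so cells that are rarer than independence would predict (e.g., successful cases in unprivileged groups) are upweighted before training. Illustrative code, not the paper’s implementation:

```python
import numpy as np

def reweighting_weights(y, group):
    """Kamiran & Calders (2012) style instance weights: weight each
    (group, label) cell so that group membership and the outcome look
    statistically independent in the reweighted training data."""
    w = np.empty(len(y), dtype=float)
    for g in np.unique(group):
        for lab in np.unique(y):
            cell = (group == g) & (y == lab)
            p_expected = (group == g).mean() * (y == lab).mean()
            p_observed = cell.mean()
            # Weight > 1 upweights cells that are underrepresented relative
            # to independence; weight < 1 downweights overrepresented cells.
            w[cell] = p_expected / p_observed
    return w
```

The resulting weights would then be passed to any learner that accepts per-instance weights (e.g., a `sample_weight` argument), leaving the features themselves unchanged.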
Model performance: Average test accuracy across baseline models (no mitigation) was approximately 78%, with no significant accuracy differences among DT, RF, LR, and SVM.

Unfairness at subgroup level: Across fairness notions, Black and Hispanic students experienced greater unfairness. SVM tended to yield comparatively less unfair results than other models.

Illustrative baseline disparities: In one test split (RF), the predicted probability of attainment (Statistical Parity context) was about 91% for Asian and White students vs. 63% for Black and 68% for Hispanic students. For Predictive Equality among those who did not complete a degree (y=0), the model’s positive prediction rates were ~83% (Asian), 78% (White), 33% (Hispanic), and 0% (Black). For Equal Opportunity among those who completed a degree (y=1), the model’s false negative rates implied predicted failure probabilities of ~20% (Hispanic), 8% (Black), 5.5% (Asian), and 4.6% (White). Variation in unfairness metrics was lowest for the White and Asian groups and substantially larger for minoritized groups, indicating sensitivity to train/test splits due to underrepresentation.

Unfairness at aggregate level: Aggregating to privileged vs. unprivileged revealed higher false negative rates for unprivileged groups but masked subgroup differences visible in the disaggregated analysis.

Mitigation effectiveness (RF focus; similar patterns for other models): Accuracy changed minimally (−1% to −2%) for preprocessing and ExGR, whereas the Meta Fair Classifier (MetaC) notably increased accuracy by about 10–17 points over baseline. However, mitigation techniques were generally ineffective at reducing bias at the aggregate level. At the subgroup level, no single technique improved fairness for all racial/ethnic subgroups simultaneously; gains for one subgroup often coincided with harms to another.
- Reweighting (ReW): Did not effectively reduce bias for unprivileged groups, suggesting underrepresentation of successful cases is not the primary bias source.
- Disparate Impact Remover (DIR): Reduced unfairness for Black students and diminished the advantage for Asian students, but worsened unfairness for Hispanic students and, in some notions (Statistical Parity and Equal Opportunity), exacerbated the White group’s advantage. DIR could not achieve statistical parity across subgroups.
- Exponentiated Gradient Reduction (ExGR): Produced little change in privilege/unfairness patterns and increased variability across splits.
- Meta Fair Classifier (MetaC): Reduced all four bias types for Hispanic students but increased unfairness for Black students (e.g., in Statistical Parity and Equalized Odds), again emphasizing the need for subgroup-specific evaluation.

Overall: Common preprocessing and in-processing techniques did not substantially mitigate demographic bias; only MetaC showed partial subgroup-specific improvements, alongside tradeoffs.
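The aggregate-level masking reported in the results can be illustrated numerically: with made-up selection rates, pooling two unprivileged subgroups whose disparities point in opposite directions makes the binarized gap look small even though one subgroup faces a large gap. All numbers below are hypothetical, chosen only to show the arithmetic:

```python
# Hypothetical selection rates (probability of being predicted to attain a
# bachelor's degree) and subgroup sizes -- made up for illustration only.
subgroup_rate = {"White": 0.80, "Asian": 0.80, "Black": 0.55, "Hispanic": 0.95}
subgroup_n = {"White": 600, "Asian": 200, "Black": 100, "Hispanic": 100}

def pooled_rate(groups):
    """Selection rate after pooling several subgroups into one bucket."""
    n = sum(subgroup_n[g] for g in groups)
    return sum(subgroup_rate[g] * subgroup_n[g] for g in groups) / n

# Binarized comparison: privileged (White, Asian) vs. unprivileged (Black, Hispanic).
gap_aggregate = pooled_rate(["White", "Asian"]) - pooled_rate(["Black", "Hispanic"])
# Disaggregated comparison against the Black subgroup alone.
gap_black = pooled_rate(["White", "Asian"]) - subgroup_rate["Black"]
# gap_aggregate ≈ 0.05, while gap_black ≈ 0.25: the binarized view
# understates the disparity faced by Black students fivefold.
```

This is the structural reason the paper insists on disaggregated evaluation: the pooled gap averages away opposite-signed subgroup disparities.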
The study demonstrates that widely used student-success prediction models can reproduce and legitimize racial inequities: they tend to predict success more often for White and Asian students and are more likely to falsely predict failure for successful Black and Hispanic students. These findings directly answer the research questions by documenting substantial biases across multiple fairness notions and showing that leading mitigation techniques yield limited and uneven improvements, particularly when evaluated at the subgroup level. The work underscores practical implications for admissions, course recommendations, and intervention allocation, where biased predictions may restrict opportunities for racially minoritized students. It also highlights the importance of methodological choices—model selection, fairness metric, and mitigation approach—which carry significant consequences for equity. Aggregation into privileged vs. unprivileged can obscure critical subgroup differences; therefore, fairness evaluation and mitigation must be disaggregated to avoid masking harms. The limited effectiveness of current techniques suggests that biases in educational data are rooted in deep systemic inequities rather than mere statistical underrepresentation, calling for more sophisticated approaches and careful, context-aware deployment of predictive models in higher education.
This paper contributes a comprehensive audit of algorithmic fairness in predicting bachelor’s degree attainment using a nationally representative dataset, multiple ML models, and four fairness notions, alongside evaluations of preprocessing and in-processing mitigation methods. Key contributions include evidence of persistent unfairness against Black and Hispanic students, demonstration that subgroup analyses reveal inequities masked by aggregate comparisons, and findings that common mitigation techniques provide limited and uneven benefits. Future research should (1) develop and test more effective mitigation strategies, including combined pre- and in-processing approaches; (2) systematically examine the role of training/testing splits and sample representation; (3) study how feature selection and modeling choices affect fairness; and (4) further disaggregate racial/ethnic categories beyond binary privileged/unprivileged groupings to better reflect students’ diverse experiences and outcomes.
- Sensitive attribute binarization required by some mitigation methods may mask subgroup-specific harms; results show aggregation can conceal important differences.
- The ELS race/ethnicity categories are limited, and the “two or more races” category showed high variability, suggesting caution in interpretation.
- Models showed higher variance in fairness metrics for underrepresented groups, indicating sensitivity to train/test splits and limited robustness for minoritized subpopulations.
- Main results rely on listwise deletion; although multiple imputation was examined, imputation choices can influence both fairness and performance (with SVM showing reduced unfairness variance under imputation).
- Post-processing mitigation techniques were excluded; while this aligns with practical and ethical concerns in education, it limits the scope of evaluated methods.
- The feature set, while commonly used (p=29), may omit relevant variables; unobserved systemic and historical factors likely drive residual bias.
- The sample is restricted to students attending four-year institutions, which may limit generalizability to other postsecondary contexts.