A Review of Bias and Fairness in Artificial Intelligence
R. González-Sendino, E. Serrano, et al.
The evolution of AI has enabled a high degree of autonomy in decision-making across domains, which can be problematic, especially when humans are not in the loop. Automation can amplify bias and create feedback loops, with data often being a primary source of unfair outcomes. Bias is defined here as a systematic tendency in a model to favor one demographic group or individual over another, potentially leading to unfairness. Fairness is the absence of prejudice or favoritism toward an individual or group based on inherent or acquired characteristics. Incorrect predictions do not necessarily imply unfairness; an unfair model is one whose decisions are biased toward a particular group. Because biases cannot always be avoided, mitigation techniques and fairness auditing are needed. This paper presents a systematic review addressing three research questions: (1) Which biases affect fairness? (2) What metrics measure fairness? (3) How are biases mitigated? It differs from prior reviews by jointly covering bias types, fairness metrics, mitigation methods, tools for auditing algorithms, and fair governance guidelines. The paper is organized into background and motivation, search criteria, retrieved works, answers to the three RQs (including a taxonomy of mitigation procedures), discussion, and conclusions with future work.
The review synthesizes literature on bias and fairness across multiple domains and algorithms, identifying where unfairness is prevalent and which learning paradigms are most implicated. Notable application areas include health (the most prominent), recruitment, and education, reflecting the high-stakes nature of decisions in these fields. Frequently discussed algorithms and learning paradigms include deep neural networks, ambient intelligence, NLP, computer vision, GANs, decentralized learning (federated and swarm learning), SVMs, decision trees, adaptive models, and XGBoost. Prior systematic reviews have primarily focused on measurement and mitigation; this paper extends them by also covering auditing tools and fair governance practices. The literature further discusses bias types across the AI lifecycle (human/production bias, data bias, learning bias, and deployment bias) and a range of group and individual fairness metrics. Tools such as AIF360, FairLearn, LFIT, Aequitas, LimeOut, MAML, WIT, and Audit AI are reviewed for auditing and mitigation support.
This is a systematic review aimed at understanding fairness and bias in AI algorithms, structured around three research questions (bias types, fairness metrics, and mitigation methods). Inclusion criteria: peer-reviewed works; primarily in English; published since 2010; sufficient information in the abstract, introduction, and conclusion; addressing fairness in AI; and providing sufficient information about bias in relation to unfairness. Search terms included fairness, artificial intelligence, machine learning, bias, FAT, and FATE; XAI and responsible AI were excluded from the primary queries to reduce noise. The databases searched were ScienceDirect, Scopus, and IEEE. Queries were applied to the title, abstract, or author-specified keywords: SQ1 (('artificial intelligence' or 'machine learning') and ('fair' or 'fat' or 'fate' or 'bias' or 'fairness' or 'unfair')); SQ2 (the same terms as SQ1 with adjusted filters); SQ3 (('artificial intelligence' or 'machine learning') and ('fair' or 'fat' or 'fate' or 'fairness' or 'unfair') and ('bias')). The queries are nested (SQ1 ⊃ SQ2 ⊃ SQ3) to refine toward works covering both fairness and bias, and a date filter restricted results to 2010 onward. PRISMA-style counts: records identified through database searching (n=249); additional records through hand search (n=12); records after duplicates removed (n=214); full-text articles assessed for eligibility (n=145); studies included in the systematic review (n=101). Exclusions were based on insufficient depth on bias, metrics, or mitigation. The review notes strong growth in related publications during 2018–2021, indicating the increasing relevance of the topic.
- Biases affecting fairness can be categorized across the AI lifecycle: Human (production) bias, Data bias, Learning bias, and Deployment bias. Human/production bias includes cognitive and behavioral biases (historical, content production, interactions), with temporal variability; data bias includes sampling/representation issues, sensitive and proxy attributes, and spurious correlations; learning bias includes aggregation and evaluation biases and potential drift in reinforcement scenarios; deployment bias includes context shifts and feedback loops.
- Fairness metrics: Two main paradigms are group fairness (disparate impact) and individual fairness (disparate treatment). The most widely used metrics are group-level measures; individual fairness includes procedural fairness and consistency. Traditional and customized measures include per-group ROC analyses, the standard deviation of errors, skewed error ratios, and a range of parity, equality, and calibration metrics (a minimal computation of common group measures is sketched after this list). Tools to audit fairness include AIF360 (IBM), FairLearn (Microsoft), Aequitas (CMU), and WIT (Google); agnostic toolkits can support both auditing and mitigation.
- Mitigation taxonomy across phases: Pre-training (resampling, fair representation, re-weighting, and other preprocessing methods like disparate impact remover and optimized preprocessing); Training (regularization, adversarial debiasing, and emerging approaches including decentralized learning, fair linear regression, Fair-N, DeepFair, multimodal and fairlet clustering); Post-training (equalized odds, calibrated equalized odds, reject option classification). A Fair Governance category encompasses team (diversity, cross-disciplinary work, training), data (collection, documentation, open access), and model (documentation, openness, XAI) practices.
- Trade-offs are central: fairness vs. accuracy, and constraints that cannot all be satisfied simultaneously. Preprocessing and postprocessing can perturb data distributions; training-phase methods often offer efficient mitigation without altering data.
- Quantitative review outcomes: 101 studies were included. Related publications grew strongly from 2018 to 2021 (e.g., 2018: 28; 2019: 65; 2020: 100; 2021: 115). PRISMA counts: 249 database records plus 12 from hand searching; 214 after deduplication; 145 assessed at full text; 101 included.
- Identified gaps: Current metrics are often problem/dataset/method-specific, limiting generalization; metrics do not automatically detect which variables cause unfairness; auditing entire datasets can be computationally expensive; removing sensitive features alone is insufficient due to proxies.
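To make the group-level measures above concrete, here is a minimal sketch in plain NumPy of three commonly reported quantities: the statistical parity difference, the disparate impact ratio, and the equal opportunity (true positive rate) difference. The toy arrays, the 0/1 group encoding, and the helper names are illustrative assumptions rather than code from the reviewed paper; toolkits such as AIF360 and FairLearn provide equivalent, more complete implementations.

```python
import numpy as np

# Toy, hypothetical data: binary labels, binary predictions, and a binary
# sensitive attribute (0 = unprivileged group, 1 = privileged group).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1, 0, 0])
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

def selection_rate(pred, mask):
    """Share of positive predictions within a group: P(y_hat = 1 | group)."""
    return pred[mask].mean()

def true_positive_rate(true, pred, mask):
    """Recall within a group: P(y_hat = 1 | y = 1, group)."""
    positives = mask & (true == 1)
    return pred[positives].mean()

sr_unpriv = selection_rate(y_pred, group == 0)
sr_priv = selection_rate(y_pred, group == 1)

# Group fairness (disparate impact) measures.
statistical_parity_difference = sr_unpriv - sr_priv   # 0 under demographic parity
disparate_impact_ratio = sr_unpriv / sr_priv          # 1 is ideal; 0.8 is a common rule of thumb
equal_opportunity_difference = (true_positive_rate(y_true, y_pred, group == 0)
                                - true_positive_rate(y_true, y_pred, group == 1))

print(statistical_parity_difference, disparate_impact_ratio, equal_opportunity_difference)
```

Note that the privileged and unprivileged groups must be specified up front, which is exactly the limitation raised later: these metrics quantify a disparity but do not discover which variables cause it.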
The review's discussion synthesizes findings for each RQ. For RQ1 (bias types), human/production bias is the critical origin of unfairness, as historical data generation encodes cognitive and behavioral biases that later propagate through data, learning, and deployment. For RQ2 (metrics), the most widespread measures are group-based; emerging metrics are often customized and lack generalizability. Notably, current metrics do not identify sensitive variables or causal drivers of unfairness and may impose high computational costs when applied broadly. For RQ3 (mitigation), the pre-training and training phases host the majority of solutions, with training-phase methods (regularization and adversarial debiasing) offering efficient mitigation while balancing accuracy. Post-training adjustments are less popular and considered a last resort. Beyond algorithmic techniques, fair governance practices across team, data, and model dimensions are essential to reduce unfairness systematically. Overall, the results underscore the need for fairness-by-design approaches, better generalized metrics, automated sensitive-feature detection, and robust auditing practices.
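As an illustration of the pre-training mitigation methods mentioned above (re-weighting in particular), the following is a minimal sketch of reweighing in the style of Kamiran and Calders, which assigns each training instance the weight P(S = s) * P(Y = y) / P(S = s, Y = y) so that, after weighting, the sensitive attribute S and the label Y appear statistically independent. The function and variable names are assumptions for illustration, not the paper's code; AIF360 ships a Reweighing preprocessor built on the same idea.

```python
import numpy as np

def reweighing_weights(sensitive, labels):
    """Reweighing: weight each instance by P(S = s) * P(Y = y) / P(S = s, Y = y),
    removing the observed dependence between sensitive attribute S and label Y."""
    sensitive = np.asarray(sensitive)
    labels = np.asarray(labels)
    weights = np.ones(len(labels), dtype=float)
    for s in np.unique(sensitive):
        for y in np.unique(labels):
            mask = (sensitive == s) & (labels == y)
            if mask.any():
                expected = (sensitive == s).mean() * (labels == y).mean()
                observed = mask.mean()
                weights[mask] = expected / observed
    return weights

# Hypothetical usage: the weights can be passed to most scikit-learn
# estimators through the sample_weight argument of fit().
s = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.array([1, 0, 0, 0, 1, 1, 1, 0])
sample_weight = reweighing_weights(s, y)
```

Because the weights only rebalance the training sample without editing feature values, this kind of intervention is cheap to apply; the caveat, noted above, is that the weighted sample no longer reflects the observed joint distribution.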
This systematic review provides: (1) a comprehensive categorization of biases affecting fairness across human/production, data, learning, and deployment phases; (2) a compilation and analysis of fairness metrics and auditing tools; and (3) a novel taxonomy of bias mitigation procedures spanning pre-training, training, post-training, and fair governance practices. The findings highlight the complexity and trade-offs inherent in fairness optimization, the limitations of current metrics in generalization and variable detection, and the importance of governance. Future work should pursue a fairness-by-design standard for AI development, automated detection of feature bias (including sensitive variables and proxies), and an overarching indicator of responsible AI development beyond performance metrics.
- Metrics are often tailored to specific problems, datasets, or methods, limiting generalization and comparability across contexts.
- No current metric automatically detects which variables or values cause unfairness; privileged/unprivileged groups must be predefined.
- Applying fairness metrics across entire datasets can be computationally expensive and not feasible for all variables.
- Preprocessing and postprocessing may perturb the original data distribution, potentially yielding outcomes that do not reflect reality.
- Not all fairness constraints can be satisfied simultaneously, and there is a fundamental trade-off between accuracy and fairness (see the worked numeric check after this list).
- Removing sensitive features alone is insufficient due to proxy attributes and correlations.
- Bias is task-dependent and subjective; domain-specific considerations complicate the interpretation of fairness.
- Complex and opaque models (e.g., deep neural networks) hinder transparency and may conceal biases, complicating detection and mitigation.
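The simultaneity limitation above can be made concrete with a well-known identity (often attributed to Chouldechova): for a binary classifier, FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR), where p is a group's base rate. The short check below, with hypothetical numbers, shows that two groups with different base rates cannot share the same PPV, FNR, and FPR at once, so calibration-style and error-rate parity constraints conflict whenever prevalence differs.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate forced by a group's base rate once PPV and FNR are fixed:
    FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR)."""
    return (base_rate / (1.0 - base_rate)) * ((1.0 - ppv) / ppv) * (1.0 - fnr)

# Two hypothetical groups with identical PPV and FNR but different base rates:
# the identity forces their false positive rates apart.
print(implied_fpr(base_rate=0.3, ppv=0.8, fnr=0.2))  # ~0.086
print(implied_fpr(base_rate=0.6, ppv=0.8, fnr=0.2))  # ~0.300
```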