Algorithmic bias denotes systematic disparities in machine-learning system outputs that disadvantage individuals or groups along protected attributes such as race, sex, age, disability, sexual orientation, and nationality. Fairness denotes the family of formal criteria, design practices, and audits aimed at characterising and reducing such disparities. The field is sometimes called FairML or algorithmic fairness, and overlaps substantially with discrimination law, the philosophy of justice, and welfare economics.
Sources of bias
Bias enters ML systems at multiple stages:
Sampling bias: training data are unrepresentative of the deployment population (illustrated, together with aggregation bias, in the sketch after this list).
Historical bias: training data reflect past discrimination (e.g. résumés selected by biased recruiters).
Measurement bias: labels are noisier or systematically wrong for some groups (e.g. health-care cost as a proxy for health need; Obermeyer et al. 2019).
Aggregation bias: a single model fits the majority well and the minority poorly.
Deployment bias: the model is used in contexts other than those it was trained for.
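The interaction of sampling and aggregation bias is easy to reproduce in a toy setting. The sketch below (illustrative only; the data are synthetic and the setup is our own, not from any cited study) trains one logistic-regression model on a sample dominated by group A and evaluates it on balanced held-out samples from both groups:

```python
# Toy illustration (synthetic data, our own setup, not from any cited
# study): one logistic-regression model is trained on a sample dominated
# by group A, with group B under-sampled and following an opposite
# feature-label relationship.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, sign):
    """Synthetic group whose label depends on x with a group-specific sign."""
    x = rng.normal(size=(n, 1))
    y = (sign * x[:, 0] + rng.normal(scale=0.3, size=n) > 0).astype(int)
    return x, y

xa, ya = make_group(1000, sign=+1.0)   # majority group A
xb, yb = make_group(50, sign=-1.0)     # under-sampled group B

model = LogisticRegression().fit(np.vstack([xa, xb]), np.concatenate([ya, yb]))

# Evaluate on balanced held-out samples from each group:
# high accuracy on A, near-chance or worse on B.
for name, sign in [("A", +1.0), ("B", -1.0)]:
    xt, yt = make_group(2000, sign)
    print(f"group {name}: accuracy = {model.score(xt, yt):.2f}")
```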
Formal fairness criteria
Multiple, mutually inconsistent criteria have been proposed (the first three are computed in the sketch after this list):
Demographic parity: selection rate equal across groups.
Equal opportunity (Hardt, Price, Srebro 2016): true-positive rate equal across groups.
Equalised odds: both true-positive and false-positive rates equal across groups.
Predictive parity / calibration: for each predicted score, the actual outcome rate is the same across groups.
Counterfactual fairness (Kusner et al. 2017): predictions invariant under a counterfactual change of the protected attribute.
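The first three criteria reduce to comparing simple conditional rates. A minimal sketch (the function names and interface are our own, not a standard API), given binary predictions, true labels, and a group indicator:

```python
# Minimal sketch (our own interface, not a standard API): gaps for
# demographic parity, equal opportunity, and equalised odds, from
# binary predictions y_hat, true labels y, and a group indicator g.
import numpy as np

def group_rates(y, y_hat, g, group):
    """Selection rate, TPR, and FPR within one group."""
    m = g == group
    sel = y_hat[m].mean()               # P(Y_hat = 1 | G = group)
    tpr = y_hat[m & (y == 1)].mean()    # P(Y_hat = 1 | Y = 1, G = group)
    fpr = y_hat[m & (y == 0)].mean()    # P(Y_hat = 1 | Y = 0, G = group)
    return sel, tpr, fpr

def fairness_gaps(y, y_hat, g):
    sel_a, tpr_a, fpr_a = group_rates(y, y_hat, g, 0)
    sel_b, tpr_b, fpr_b = group_rates(y, y_hat, g, 1)
    return {
        "demographic_parity_gap": abs(sel_a - sel_b),
        "equal_opportunity_gap": abs(tpr_a - tpr_b),
        # Equalised odds requires both TPR and FPR to match.
        "equalised_odds_gap": max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)),
    }
```

Exact equality rarely holds on finite data, so audits typically report these gaps against a tolerance rather than testing strict equality.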
The Chouldechova and Kleinberg-Mullainathan-Raghavan impossibility results (2016–2017) showed that calibration and equalised odds cannot in general hold simultaneously when base rates differ across groups, a fundamental tension at the heart of contemporary fairness debates (see the ProPublica/COMPAS recidivism controversy).
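One way to see the tension is an identity from Chouldechova (2017) tying a group's false-positive rate to its base rate p, positive predictive value, and false-negative rate:

```latex
% Chouldechova (2017): within each group, for any binary classifier,
\[
  \mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot\left(1-\mathrm{FNR}\right)
\]
% where p is the group's base rate (prevalence).
```

If PPV is equal across groups (predictive parity) and FNR is equal (one half of equalised odds), then differing base rates p force the false-positive rates apart, except for a perfect classifier; hence both criteria cannot hold at once.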
Mitigations
Pre-processing: reweight or re-sample training data; remove or transform features.
In-processing: add fairness constraints or regularisers to the training objective.
Post-processing: adjust decision thresholds per group (sketched after this list).
Audit and disclosure: model cards, datasheets, ongoing fairness monitoring.
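As an illustration of the post-processing family, here is a minimal sketch in the spirit of Hardt, Price, and Srebro (2016), though not their exact algorithm: pick a separate threshold per group so that each group's true-positive rate lands near a common target.

```python
# Minimal post-processing sketch (in the spirit of Hardt et al. 2016,
# not their exact algorithm): choose one threshold per group so each
# group's true-positive rate lands near a common target.
# Assumes scores lie in [0, 1].
import numpy as np

def per_group_thresholds(scores, y, g, target_tpr=0.8):
    """Return {group: threshold} approximately equalising TPR across groups."""
    grid = np.linspace(0.0, 1.0, 101)
    thresholds = {}
    for group in np.unique(g):
        pos = scores[(g == group) & (y == 1)]   # scores of actual positives
        # TPR at threshold t = fraction of positive-class scores >= t.
        tprs = np.array([(pos >= t).mean() for t in grid])
        thresholds[group] = grid[np.argmin(np.abs(tprs - target_tpr))]
    return thresholds

# Usage sketch: decisions = scores >= np.vectorize(thresholds.get)(g)
```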
Status
As of 2026, fairness obligations are codified in the EU AI Act for high-risk systems, in NYC Local Law 144 for automated hiring tools, in the Colorado AI Act for consequential decisions, and elsewhere. The field has matured from purely technical fairness criteria towards socio-technical approaches that include affected-community participation, contextual integrity, and substantive evaluation of whether a system should exist at all. Recent attention focuses on fairness in LLMs: disparities in refusal rates, in dialect handling (e.g. African American Vernacular English), and in image-generation defaults.
References
Barocas, Hardt, Narayanan (2019). Fairness and Machine Learning: Limitations and Opportunities. Open textbook, updated online.
Hardt, Price, Srebro (2016). Equality of Opportunity in Supervised Learning.
Obermeyer et al. (2019). Dissecting racial bias in an algorithm used to manage the health of populations.
Buolamwini, Gebru (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.
Kusner, Loftus, Russell, Silva (2017). Counterfactual Fairness.
Chouldechova (2017). Fair prediction with disparate impact: a study of bias in recidivism prediction instruments.
Kleinberg, Mullainathan, Raghavan (2016). Inherent Trade-Offs in the Fair Determination of Risk Scores.
Related terms: RLHF, Red-Teaming (LLMs), Evaluations / Capability Evaluations
Discussed in:
- Chapter 14: Generative Models, Bias and fairness