
🤖 AI Ethics

AI Bias Types

Why This Matters

When you're tested on AI ethics, you're not just being asked to define bias—you're being asked to demonstrate that you understand where bias enters AI systems, why it persists, and how different bias types compound to create unfair outcomes. Exam questions frequently require you to trace a discriminatory AI decision back to its root cause, whether that's flawed training data, problematic algorithm design, or human overreliance on automated outputs. Understanding the mechanism behind each bias type is what separates surface-level memorization from genuine comprehension.

These bias types don't exist in isolation. A single AI system can exhibit historical bias embedded in its training data, algorithmic bias in how it weights features, and automation bias in how humans interpret its outputs. The most challenging exam questions—especially FRQs—will ask you to identify multiple bias types operating simultaneously and explain their interaction. Don't just memorize definitions; know what stage of the AI pipeline each bias affects and what real-world harms it produces.


Biases Rooted in Training Data

The foundation of any AI system is the data it learns from. When that data reflects historical inequalities, excludes certain populations, or captures the world inaccurately, the resulting model inherits those flaws as features, not bugs.

Historical Bias

  • Reflects past prejudices and inequalities embedded in data collected during discriminatory eras—even "accurate" historical data can perpetuate harm
  • Amplification effect means AI systems don't just reproduce historical bias; they can intensify it through feedback loops (simulated in the sketch after this list)
  • Challenges remediation efforts because removing historical bias requires active intervention, not just collecting more data
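
To make the feedback-loop claim concrete, here is a minimal Python sketch with hypothetical numbers: two neighborhoods have the same true incident rate, but the one with more past records keeps attracting attention, and therefore keeps generating new records.

```python
import random

random.seed(0)

# Two neighborhoods with the SAME true incident rate, but neighborhood A
# starts with more recorded incidents because of past over-policing.
TRUE_RATE = 0.10  # hypothetical; identical in both neighborhoods
recorded = {"A": 60, "B": 40}

for year in range(6):
    # Patrols go wherever the records say incidents are highest, so only
    # the already-overrepresented neighborhood generates new records.
    target = max(recorded, key=recorded.get)
    new_records = sum(random.random() < TRUE_RATE for _ in range(100))
    recorded[target] += new_records
    share_a = recorded["A"] / sum(recorded.values())
    print(f"year {year}: share of all records from A = {share_a:.2f}")
```

Even though the neighborhoods are statistically identical, A's share of the records climbs every round: the system's "evidence" is a product of its own past decisions.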

Representation Bias

  • Underrepresentation of marginalized groups leads to AI systems that perform poorly for those populations—facial recognition accuracy gaps are a prime example (a per-group audit is sketched after this list)
  • Misrepresentation occurs when groups are included but characterized through stereotypical or limited data points
  • Demands diverse datasets as a minimum ethical standard, though diversity alone doesn't guarantee fairness
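
Underperformance on an underrepresented group often hides inside a healthy aggregate metric. A minimal sketch of the per-group audit (labels and results are hypothetical; in practice they come from a held-out, labeled test set):

```python
from collections import defaultdict

# (group, model_was_correct) pairs from a hypothetical labeled test set.
results = ([("group_a", True)] * 8 + [("group_a", False)] * 1
           + [("group_b", True)] * 1 + [("group_b", False)] * 2)

counts = defaultdict(lambda: [0, 0])  # group -> [correct, total]
for group, correct in results:
    counts[group][0] += int(correct)
    counts[group][1] += 1

overall = sum(correct for _, correct in results) / len(results)
print(f"overall accuracy: {overall:.0%}")  # 75% -- looks acceptable in aggregate
for group, (right, total) in sorted(counts.items()):
    print(f"{group}: {right / total:.0%} on {total} examples")  # 89% vs. 33%
```

Disaggregating by group is the audit step; the aggregate number alone cannot reveal a representation problem.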

Data Bias

  • Unrepresentative or skewed training data produces models that generalize poorly to real-world populations
  • Societal biases become encoded when data reflects existing discrimination in hiring, lending, or policing practices
  • Undermines reliability of AI predictions across all downstream applications

Compare: Historical bias vs. Representation bias—both involve problematic training data, but historical bias stems from when data was collected (reflecting past discrimination), while representation bias concerns who is included or excluded. An FRQ might ask you to identify which bias type explains why an AI performs differently across demographic groups.


Biases in Data Collection Methods

Even with good intentions, how data is gathered can introduce systematic errors. These biases emerge before the algorithm ever sees the data.

Sampling Bias

  • Non-representative samples occur when training data doesn't reflect the true population distribution
  • Overgeneralization risk means conclusions drawn from biased samples may not apply to excluded groups
  • Common in demographic studies where convenience sampling or accessibility issues skew who gets included (simulated in the sketch below)
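
A quick simulation of the convenience-sampling effect (the response model and all numbers are hypothetical; the shape of the result is the point): the population's ages are uniform, but an online survey reaches younger people more often, so the sample mean drifts well below the population mean.

```python
import random
import statistics

random.seed(1)

# True population: ages 18-80, uniformly distributed (hypothetical).
population = [random.randint(18, 80) for _ in range(10_000)]

# Convenience sample: an online survey whose response rate falls with age.
sample = [age for age in population if random.random() < max(0.05, 1 - age / 60)]

print("population mean age:", round(statistics.mean(population), 1))
print("sample mean age:    ", round(statistics.mean(sample), 1))
```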

Selection Bias

  • Systematic exclusion of certain individuals or groups from data collection—often invisible to researchers
  • Distorts population-level findings because missing data isn't random; it reflects structural barriers (see the sketch after this list)
  • Impacts fairness when AI applications make decisions about groups who weren't represented in training
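
Selection bias often enters through a routine cleaning step. In this hypothetical sketch, records with a missing income field are dropped; because the field is missing far more often for rural records, the "clean" dataset quietly excludes that group:

```python
# Hypothetical records: income is missing far more often for rural entries
# because of structural gaps in how the data was gathered.
raw = ([{"group": "urban", "income": 50_000}] * 90 + [{"group": "urban"}] * 10
       + [{"group": "rural", "income": 35_000}] * 40 + [{"group": "rural"}] * 60)

# The routine cleaning step: drop any record with a missing field.
clean = [r for r in raw if "income" in r]

for name, rows in (("raw", raw), ("clean", clean)):
    rural = sum(r["group"] == "rural" for r in rows)
    print(f"{name}: {rural / len(rows):.0%} rural out of {len(rows)} records")
```

Note the contrast with sampling bias: here the sample was drawn correctly, but a structural pattern in who has complete records does the excluding.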

Measurement Bias

  • Flawed collection tools or methods produce inaccurate data that AI systems treat as ground truth
  • Proxy variables can introduce bias when direct measurement is impossible—using zip codes as proxies for race, for example (a proxy check is sketched after this list)
  • Requires rigorous validation through testing across diverse conditions and populations
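
One practical validation step is to measure how well each "neutral" feature predicts the protected attribute: if knowing the feature almost determines group membership, a model can discriminate without ever seeing the attribute itself. A minimal sketch with hypothetical data:

```python
from collections import Counter

# (zip_code, group) pairs from a hypothetical dataset.
rows = ([("zip_1", "group_a")] * 9 + [("zip_1", "group_b")] * 1
        + [("zip_2", "group_b")] * 8 + [("zip_2", "group_a")] * 2)

for zip_code in ("zip_1", "zip_2"):
    dist = Counter(group for z, group in rows if z == zip_code)
    top_group, n = dist.most_common(1)[0]
    # If this is near 100%, the zip code is effectively a group label.
    print(f"{zip_code}: predicts {top_group} {n / sum(dist.values()):.0%} of the time")
```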

Compare: Sampling bias vs. Selection bias—sampling bias results from how a sample is drawn (methodology), while selection bias results from who gets systematically excluded (often due to structural factors). Both produce non-representative data, but the causes and solutions differ.


Biases in Algorithm Design and Outputs

Even with perfect data, the choices made in building and deploying algorithms can introduce or amplify unfairness. The algorithm itself is a site of ethical decision-making.

Algorithmic Bias

  • Systematically prejudiced results emerge from flawed assumptions in model architecture, feature selection, or optimization targets
  • Design choices matter because decisions about which variables to include and how to weight them encode values (see the threshold sketch after this list)
  • High-stakes applications in hiring, lending, and criminal justice make algorithmic bias a civil rights concern
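
Even a single design decision, such as where to set a decision threshold, encodes a value judgment. In this hypothetical sketch, one global cutoff looks reasonable in aggregate but produces starkly different approval rates across groups; whether to audit for that disparity is itself a design choice:

```python
# (group, model_score) pairs -- hypothetical scores from some trained model.
applicants = [
    ("group_a", 0.9), ("group_a", 0.8), ("group_a", 0.7), ("group_a", 0.4),
    ("group_b", 0.6), ("group_b", 0.5), ("group_b", 0.4), ("group_b", 0.2),
]

THRESHOLD = 0.65  # a single global cutoff chosen to look reasonable "on average"

for group in ("group_a", "group_b"):
    scores = [s for g, s in applicants if g == group]
    approved = sum(s >= THRESHOLD for s in scores)
    print(f"{group}: approval rate {approved / len(scores):.0%}")  # 75% vs. 0%
```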

Reporting Bias

  • Selective reporting of results based on desired outcomes distorts understanding of AI system performance
  • Publication bias means failed or problematic AI applications often go unreported
  • Undermines accountability by preventing accurate assessment of AI harms and limitations

Compare: Algorithmic bias vs. Data bias—algorithmic bias originates in the model's design and logic, while data bias originates in the training information. A biased algorithm can produce unfair outcomes even with representative data, and vice versa. If an FRQ asks about interventions, specify whether you're addressing the algorithm, the data, or both.


Human-AI Interaction Biases

These biases emerge not from the AI system itself, but from how humans build, interpret, and rely on automated systems. They highlight the irreducibly human dimensions of AI ethics.

Confirmation Bias

  • Favoring information that confirms existing beliefs influences which data gets collected and how results are interpreted
  • Affects development when AI teams unconsciously design systems that validate their assumptions
  • Shapes interpretation of AI outputs, leading users to accept results that match expectations while dismissing contradictory evidence

Automation Bias

  • Over-reliance on automated systems leads humans to defer to AI even when contradictory information is available
  • Critical in high-stakes domains like healthcare diagnostics and criminal sentencing, where blind trust can cause serious harm
  • Underscores need for human oversight and training that encourages appropriate skepticism toward AI recommendations

Compare: Confirmation bias vs. Automation bias—confirmation bias affects how humans build and interpret AI systems, while automation bias affects how humans defer to AI outputs. Both involve cognitive shortcuts, but confirmation bias operates throughout development while automation bias operates at the point of deployment and use.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Training data problems | Historical bias, Representation bias, Data bias |
| Data collection flaws | Sampling bias, Selection bias, Measurement bias |
| Algorithm design issues | Algorithmic bias, Reporting bias |
| Human-AI interaction | Confirmation bias, Automation bias |
| Perpetuates past discrimination | Historical bias, Data bias |
| Excludes or underrepresents groups | Representation bias, Sampling bias, Selection bias |
| Requires human oversight solutions | Automation bias, Confirmation bias |
| Affects high-stakes decisions | Algorithmic bias, Automation bias, Historical bias |

Self-Check Questions

  1. A facial recognition system performs well on light-skinned faces but poorly on dark-skinned faces because the training dataset contained mostly light-skinned individuals. Which two bias types best explain this outcome, and how do they differ?

  2. An AI hiring tool was trained on a company's historical hiring decisions, which favored male candidates. A recruiter notices the tool ranks women lower but approves its recommendations anyway. Identify the bias types present at each stage of this scenario.

  3. Compare and contrast sampling bias and selection bias. How might each produce a non-representative training dataset, and what different interventions would address each?

  4. A healthcare AI trained on data from urban hospitals makes inaccurate predictions for rural patients. Is this primarily a measurement bias, representation bias, or algorithmic bias problem? Defend your answer.

  5. If an FRQ asks you to explain how a single AI system can exhibit multiple bias types simultaneously, which three bias types would you choose to demonstrate the interaction between data, algorithm, and human factors?