
Bias in AI

Systematic errors in model outputs that unfairly favour or disadvantage certain groups or perspectives.

Full Definition

Bias in AI models arises from biased training data (which reflects historical social inequalities), biased labelling processes (annotators bring their own perspectives to the task), and model architectures or objective functions that amplify certain patterns. Biases manifest as stereotyping, unequal performance across demographic groups, politically skewed responses, and representation failures. They are particularly harmful in high-stakes applications such as hiring, lending, healthcare, and criminal justice. Identifying and mitigating bias requires diverse training data, representative evaluation benchmarks, adversarial probing, and ongoing monitoring. Complete debiasing is generally considered unachievable; the practical goal is minimisation and transparent disclosure of residual biases.
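
A minimal sketch of how such an audit might begin: evaluate the model separately on each demographic group and report the gap in accuracy. The predict() function and the tagged examples below are hypothetical stand-ins for a real model and a representative benchmark.

from collections import defaultdict

def predict(text):
    # Hypothetical stand-in for the model under audit: a naive
    # keyword rule that only recognises one positive word.
    return "positive" if "good" in text.lower() else "negative"

# Hypothetical labelled examples, each tagged with a group.
examples = [
    {"text": "This is good", "label": "positive", "group": "A"},
    {"text": "This is bad", "label": "negative", "group": "A"},
    {"text": "Dis slaps fr", "label": "positive", "group": "B"},
    {"text": "Dat was bad", "label": "negative", "group": "B"},
]

correct, total = defaultdict(int), defaultdict(int)
for ex in examples:
    total[ex["group"]] += 1
    if predict(ex["text"]) == ex["label"]:
        correct[ex["group"]] += 1

# Disaggregated accuracy and the worst-case gap between groups.
accuracy = {g: correct[g] / total[g] for g in total}
gap = max(accuracy.values()) - min(accuracy.values())
print(accuracy, f"gap = {gap:.0%}")  # here: {'A': 1.0, 'B': 0.5} gap = 50%

Reporting disaggregated metrics like these, rather than a single aggregate accuracy, is what makes performance gaps of the kind in the second example below visible at all.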

Examples

1. A text-to-image model consistently depicting engineers as men and nurses as women when the gender is not specified (a counterfactual probe of this association is sketched after the examples).

2. A sentiment analysis model performing 10% worse on text written in African American Vernacular English than on standard American English.
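
The stereotyped associations in the first example can be surfaced by the adversarial probing mentioned in the definition, for instance with counterfactual pairs: fill the same template with different group terms and flag any divergence in the output. The sentiment_score() function here is a hypothetical, deliberately biased stand-in so that the probe has something to flag; real probes sweep thousands of template and term pairs.

def sentiment_score(text):
    # Hypothetical stand-in for the model under audit,
    # deliberately biased so the probe below fires.
    return 0.9 if text.startswith("He") else 0.6

templates = ["{} is an engineer.", "{} is a nurse."]
pairs = [("He", "She")]

# Sentences that differ only in the group term should receive
# near-identical scores; a large difference suggests bias.
for template in templates:
    for a, b in pairs:
        score_a = sentiment_score(template.format(a))
        score_b = sentiment_score(template.format(b))
        if abs(score_a - score_b) > 0.05:  # tolerance is arbitrary
            print(f"possible bias in {template!r}: "
                  f"{a}={score_a:.2f} vs {b}={score_b:.2f}")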


Related Terms

Fairness

The property of an AI system treating individuals and groups equitably and without discrimination.

Responsible AI

The practice of developing and deploying AI systems ethically, transparently, and accountably.

AI Alignment

The research field focused on ensuring AI systems pursue goals and values intended by their designers.