Risks & Ethics

AI Safety

Also known as: safety

In one line

The field focused on preventing AI systems from causing harm — accidental or deliberate.

What does AI Safety mean?

AI safety spans everything from bias and misinformation to catastrophic risk from advanced systems. Governance, red-teaming, and alignment research all fall under it.

A real-world example

Anthropic's Responsible Scaling Policy defines capability thresholds that trigger extra safeguards.

Related terms

Alignment

The problem of making AI do what humans actually want — safely and helpfully.