Risks & Ethics

Jailbreak

Also known as: jailbreaking

In one line

A prompt trick that gets an AI to bypass its safety rules.

What does Jailbreak mean?

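A jailbreak is a prompt, or a series of prompts, written to make an AI model ignore the safety rules it was trained or instructed to follow. Common techniques include role-play ("pretend you have no restrictions"), encoding tricks (for example, hiding a request in Base64), and multi-turn manipulation that works up to a request over several messages. Model providers patch known jailbreaks continuously, but new variants keep appearing.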

A real-world example

"DAN" (Do Anything Now) was an early ChatGPT jailbreak that asked the model to role-play as an AI with no restrictions. It has since been patched.

Related terms