Skip to main content

Jailbreak Templates Strategy

The Jailbreak Templates strategy tests LLM resistance to known jailbreak techniques using a curated library of static templates from 2022-2023 era attacks.

note

This strategy was previously named prompt-injection. The name was changed to better reflect what it does: apply static jailbreak templates. The old name prompt-injection still works but is deprecated.

What This Strategy Does

This strategy applies 67 static jailbreak templates to your test cases. These templates include:

  • Skeleton Key - Educational context framing to bypass safety
  • DAN (Do Anything Now) - Role-playing as an unrestricted AI
  • Developer Mode - Simulating an unrestricted AI version
  • OPPO - Opposite response technique
  • Other persona-based and context manipulation jailbreaks

What This Strategy Does NOT Do

This strategy does not cover modern prompt injection techniques such as:

  • Special token injection (<|im_end|>, [INST], <system>, etc.)
  • Structured data injection (JSON/XML payload manipulation)
  • Encoding attacks (beyond what other strategies provide)
  • Delimiter attacks
  • Indirect/multi-turn injection
  • Function/tool calling exploits

For comprehensive prompt injection testing, consider using:

Configuration

Add to your promptfooconfig.yaml:

promptfooconfig.yaml
strategies:
- jailbreak-templates

Sampling Multiple Templates

By default, one template is applied per test case. To test multiple templates:

promptfooconfig.yaml
strategies:
- id: jailbreak-templates
config:
sample: 10

This has a multiplicative effect on test count. Each test case × sample count = total tests.

Limiting to Harmful Plugins

To save time and cost, limit to harmful plugins only:

promptfooconfig.yaml
strategies:
- id: jailbreak-templates
config:
sample: 5
harmfulOnly: true

How It Works

  1. Takes original test cases generated by plugins
  2. Prepends jailbreak template text to each test case
  3. Tests if these modified prompts bypass the AI system's safety controls

When to Use This Strategy

Use this strategy when:

  • Testing against known/documented jailbreak techniques
  • Checking if your model has been trained to resist common jailbreaks
  • Running quick baseline tests with low computational cost

Consider other strategies when:

  • Testing against adaptive, model-specific attacks
  • Evaluating modern prompt injection vectors
  • Testing agentic or tool-using applications

Backward Compatibility

The old strategy name prompt-injection still works but will show a deprecation warning:

# Deprecated - still works but not recommended
strategies:
- prompt-injection

# Recommended
strategies:
- jailbreak-templates

For a comprehensive overview of LLM vulnerabilities, visit Types of LLM Vulnerabilities.