Pandamonium (experimental)
The Pandamonium (Prompt-based Automation for Navigating Discovery of Attacks, Misuse, Opportunistic Nefarious Intents, and Uncovering Model Exploits) strategy is an automated red teaming technique that dynamically generates single- or multi-turn conversations designed to bypass a target model's safety measures.
This is an experimental strategy currently in development by the Promptfoo team.
This strategy does not have a token limit and will continue to run until it finds a jailbreak or is stopped.
Implementation
Use it like so in your `promptfooconfig.yaml`:

```yaml
strategies:
  - id: pandamonium
```
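In a full red team setup, the strategy typically sits under the `redteam` block alongside your target and plugins. The sketch below assumes the standard promptfoo red team config layout; the target and plugin names are illustrative placeholders, not requirements:

```yaml
# promptfooconfig.yaml — illustrative sketch; substitute your own target and plugins
targets:
  - openai:gpt-4o-mini # placeholder target
redteam:
  plugins:
    - harmful # placeholder plugin set; pick plugins that match your threat model
  strategies:
    - id: pandamonium
```

From there, run the scan as usual (for example with `promptfoo redteam run`). Because the strategy has no built-in stopping point, be prepared to interrupt it manually.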
How It Works
Pandamonium uses a specialized attack agent that:
- Analyzes the target model's behavior and responses
- Dynamically generates and refines attack approaches
- Pursues multiple exploitation pathways in parallel
- Adapts tactics based on the model's safety mechanisms
Unlike other strategies, which operate within a fixed token budget, Pandamonium keeps running until it either successfully jailbreaks the model or is manually stopped.
Related Concepts
- GOAT Strategy - Similar multi-turn jailbreaking technique
- Tree-based Jailbreaks - Another approach to dynamic jailbreaking
- Multi-turn Jailbreaks - Other multi-turn attack strategies
For a comprehensive overview of LLM vulnerabilities and red teaming strategies, visit our Types of LLM Vulnerabilities page.