
2 posts tagged with "case-study"


Jailbreaking Black-Box LLMs Using Promptfoo: A Complete Walkthrough

Vanessa Sauter
Principal Solutions Architect

Promptfoo is an open-source framework for testing LLM applications against security, privacy, and policy risks. It is designed to help developers quickly discover and fix critical LLM failures. Promptfoo also offers red team tools that can be run against external endpoints, making it well suited to internal red teaming exercises, third-party penetration tests, and bug bounty programs, and saving security researchers dozens of hours of manual prompt engineering and adversarial testing.

In this blog, we'll demonstrate how to use Promptfoo's red team tool in a black-box LLM security assessment. Using Wiz's AI CTF, Prompt Airlines, we'll walk step by step through Promptfoo's configuration and ultimately find the malicious queries that break the chatbot's guardrails.
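
To make the workflow concrete, here is a minimal sketch of what a promptfoo red team configuration for a black-box HTTP chatbot endpoint can look like. The endpoint URL, request body shape, and plugin selection are illustrative assumptions rather than the exact configuration from the walkthrough, and field names should be checked against the current promptfoo HTTP provider docs:

```yaml
# promptfooconfig.yaml — a minimal sketch for a black-box HTTP target.
# The URL and body shape are placeholders; adjust them to match the
# actual chatbot API you are assessing.
targets:
  - id: http
    label: airline-chatbot
    config:
      url: https://example.com/api/chat
      method: POST
      headers:
        Content-Type: application/json
      body:
        message: '{{prompt}}'

redteam:
  # Describes the application so generated attacks stay on-topic
  purpose: 'Customer-service chatbot for an airline'
  plugins:
    - pii                # probes for personal-data leakage
    - harmful            # probes for policy-violating outputs
  strategies:
    - jailbreak          # automated jailbreak attempts
    - prompt-injection   # classic injection payloads
```

With a config like this in place, something like `npx promptfoo@latest redteam run` generates and executes the adversarial test cases, and `npx promptfoo@latest redteam report` renders the findings.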

Automated Jailbreaking Techniques with DALL-E: Complete Red Team Guide

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

We all know that image models like OpenAI's DALL-E can be jailbroken to generate violent, disturbing, and offensive images. It turns out this process can be fully automated.

This post shows how to automatically discover one-shot jailbreaks with open-source LLM red teaming and includes a collection of examples.
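
For a flavor of what that automation looks like in promptfoo terms, here is a hypothetical minimal config pointed at an image model. The provider id and plugin choice are assumptions for illustration; the post itself walks through the real setup:

```yaml
# promptfooconfig.yaml — hypothetical sketch for red-teaming an image model.
# Assumes promptfoo's OpenAI image provider id; verify against current docs.
targets:
  - openai:image:dall-e-3

redteam:
  purpose: 'General-purpose image generation assistant'
  plugins:
    - harmful:graphic-content   # probes for disturbing or violent imagery
  strategies:
    - jailbreak                 # automated one-shot jailbreak attempts
```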

Tags: llm, image, red teaming