Automated Jailbreaking Techniques with DALL-E: Complete Red Team Guide
We all know that image models like OpenAI's DALL-E can be jailbroken to generate violent, disturbing, and offensive images. It turns out this process can be fully automated.
This post shows how to automatically discover one-shot jailbreaks using open-source LLM red teaming, and includes a collection of examples.
