Archive

Red Team Your LLM with BeaverTails
Ian Webster · 12/22/2024
Test your LLM against 700+ harmful prompts across 14 risk categories.

How to run CyberSecEval
Ian Webster · 12/21/2024
Even top models fail 25-50% of prompt injection attacks.

Leveraging Promptfoo for EU AI Act Compliance
Vanessa Sauter · 12/10/2024
The EU AI Act bans specific AI behaviors starting February 2025.

How to Red Team an Ollama Model: Complete Local LLM Security Testing Guide
Ian Webster · 11/23/2024
Running LLMs locally with Ollama? Local models skip the safety filters that cloud providers enforce.

How to Red Team a HuggingFace Model: Complete Security Testing Guide
Ian Webster · 11/20/2024
Open source models on HuggingFace often lack safety training.

Introducing GOAT—Promptfoo's Latest Strategy
Vanessa Sauter · 11/5/2024
Meet GOAT: our advanced multi-turn jailbreaking strategy that uses AI attackers to break AI defenders.

RAG Data Poisoning: Key Concepts Explained
Ian Webster · 11/4/2024
Attackers can poison RAG knowledge bases to manipulate AI responses.

Does Fuzzing LLMs Actually Work?
Vanessa Sauter · 10/17/2024
Traditional fuzzing fails against LLMs.

How Do You Secure RAG Applications?
Vanessa Sauter · 10/14/2024
RAG applications face unique security challenges beyond foundation models.

Prompt Injection: A Comprehensive Guide
Ian Webster · 10/9/2024
Prompt injection is the top-ranked vulnerability in the OWASP Top 10 for LLM applications.