
What are the Security Risks of Deploying DeepSeek-R1?
Our red team analysis found DeepSeek-R1 fails 60%+ of harmful content tests.

1,156 Questions Censored by DeepSeek
Analysis of DeepSeek-R1's censorship using 1,156 political prompts, exposing CCP content filtering and bias patterns.

How to Red Team a LangChain Application: Complete Security Testing Guide
LangChain apps combine multiple AI components, creating complex attack surfaces.

Defending Against Data Poisoning Attacks on LLMs: A Comprehensive Guide
Data poisoning attacks can corrupt LLMs during training, fine-tuning, and RAG retrieval.

Jailbreaking LLMs: A Comprehensive Guide (With Examples)
From simple prompt tricks to sophisticated context manipulation, discover how LLM jailbreaks actually work.

Beyond DoS: How Unbounded Consumption is Reshaping LLM Security
OWASP replaced DoS attacks with "unbounded consumption" in their 2025 Top 10.

Red Team Your LLM with BeaverTails
Evaluate LLM safety with the BeaverTails dataset: 700+ harmful prompts spanning harassment, violence, and deception categories.

How to Run CyberSecEval
Even top models fail 25-50% of prompt injection attacks.

Leveraging Promptfoo for EU AI Act Compliance
The EU AI Act bans specific AI behaviors starting February 2025.

How to Red Team an Ollama Model: Complete Local LLM Security Testing Guide
Running LLMs locally with Ollama? These models run without the safety filters cloud providers enforce.

How to Red Team a HuggingFace Model: Complete Security Testing Guide
Open source models on HuggingFace often lack safety training.

Introducing GOAT—Promptfoo's Latest Strategy
Meet GOAT: our advanced multi-turn jailbreaking strategy that uses AI attackers to break AI defenders.

RAG Data Poisoning: Key Concepts Explained
Attackers can poison RAG knowledge bases to manipulate AI responses.

Does Fuzzing LLMs Actually Work?
Traditional fuzzing fails against LLMs.

How Do You Secure RAG Applications?
RAG applications face unique security challenges beyond foundation models.

Prompt Injection: A Comprehensive Guide
Prompt injection is the most critical LLM vulnerability.

Understanding Excessive Agency in LLMs
When LLMs have too much power, they become dangerous.

Preventing Bias & Toxicity in Generative AI
Biased AI outputs can destroy trust and violate regulations.

How Much Does Foundation Model Security Matter?
Not all foundation models are created equal when it comes to security.

Jailbreaking Black-Box LLMs Using Promptfoo: A Complete Walkthrough
We jailbroke Wiz's Prompt Airlines CTF using automated red teaming.