Archive

Red Team Your LLM with BeaverTails
Ian Webster · 12/22/2024
Test your LLM against 700+ harmful prompts across 14 risk categories.

How to run CyberSecEval
Ian Webster · 12/21/2024
Even top models fail 25-50% of prompt injection attacks.

Leveraging Promptfoo for EU AI Act Compliance
Vanessa Sauter · 12/10/2024
The EU AI Act bans specific AI behaviors starting February 2025.

How to Red Team an Ollama Model: Complete Local LLM Security Testing Guide
Ian Webster · 11/23/2024
Running LLMs locally with Ollama? Local models skip the safety filters that cloud providers enforce.

How to Red Team a HuggingFace Model: Complete Security Testing Guide
Ian Webster · 11/20/2024
Open source models on HuggingFace often lack safety training.

Introducing GOAT—Promptfoo's Latest Strategy
Vanessa Sauter · 11/5/2024
Meet GOAT: our advanced multi-turn jailbreaking strategy that uses AI attackers to break AI defenders.

RAG Data Poisoning: Key Concepts Explained
Ian Webster · 11/4/2024
Attackers can poison RAG knowledge bases to manipulate AI responses.

Does Fuzzing LLMs Actually Work?
Vanessa Sauter · 10/17/2024
Traditional fuzzing fails against LLMs.

How Do You Secure RAG Applications?
Vanessa Sauter · 10/14/2024
RAG applications face unique security challenges beyond foundation models.

Prompt Injection: A Comprehensive Guide
Ian Webster · 10/9/2024
Prompt injection is the top-ranked vulnerability in the OWASP Top 10 for LLM applications.