Archive

Defending Against Data Poisoning Attacks on LLMs: A Comprehensive Guide
Data poisoning attacks can corrupt LLMs during training, fine-tuning, and RAG retrieval.

Jailbreaking LLMs: A Comprehensive Guide (With Examples)
From simple prompt tricks to sophisticated context manipulation, discover how LLM jailbreaks actually work.

Beyond DoS: How Unbounded Consumption is Reshaping LLM Security
OWASP replaced DoS attacks with "unbounded consumption" in their 2025 Top 10.

Red Team Your LLM with BeaverTails
Evaluate LLM safety with the BeaverTails dataset, which includes 700+ harmful prompts spanning harassment, violence, and deception categories.

How to Run CyberSecEval
Even top models fail against 25-50% of prompt injection attacks.

Leveraging Promptfoo for EU AI Act Compliance
The EU AI Act bans specific AI behaviors starting February 2025.

How to Red Team an Ollama Model: Complete Local LLM Security Testing Guide
Running LLMs locally with Ollama? These models often run without the safety filters that cloud providers enforce.

How to Red Team a HuggingFace Model: Complete Security Testing Guide
Open source models on HuggingFace often lack safety training.

Introducing GOAT—Promptfoo's Latest Strategy
Meet GOAT: our advanced multi-turn jailbreaking strategy that uses AI attackers to break AI defenders.

RAG Data Poisoning: Key Concepts Explained
Attackers can poison RAG knowledge bases to manipulate AI responses.