Archive4

Understanding AI Agent Security

Vanessa Sauter · 2/14/2025

AI agents are powerful but vulnerable.

What are the Security Risks of Deploying DeepSeek-R1?

Vanessa Sauter · 2/3/2025

Our red team analysis found DeepSeek-R1 fails 60%+ of harmful content tests.

1,156 Questions Censored by DeepSeek

Ian Webster · 1/28/2025

An analysis of DeepSeek-R1 censorship using 1,156 political prompts, exposing CCP content filtering and bias patterns.

How to Red Team a LangChain Application: Complete Security Testing Guide

Ian Webster · 1/18/2025

LangChain apps combine multiple AI components, creating complex attack surfaces.

Defending Against Data Poisoning Attacks on LLMs: A Comprehensive Guide

Vanessa Sauter · 1/7/2025

Data poisoning attacks can corrupt LLMs during training, fine-tuning, and RAG retrieval.

Jailbreaking LLMs: A Comprehensive Guide (With Examples)

Ian Webster · 1/7/2025

From simple prompt tricks to sophisticated context manipulation, discover how LLM jailbreaks actually work.

Beyond DoS: How Unbounded Consumption is Reshaping LLM Security

Vanessa Sauter · 12/31/2024

OWASP replaced its denial-of-service category with "unbounded consumption" in its 2025 Top 10 for LLM applications.

Red Team Your LLM with BeaverTails

Ian Webster · 12/22/2024

Evaluate LLM safety using the BeaverTails dataset, with 700+ harmful prompts spanning harassment, violence, and deception categories.

How to run CyberSecEval

Ian Webster · 12/21/2024

Even top models fail 25-50% of prompt injection attacks.

Leveraging Promptfoo for EU AI Act Compliance

Vanessa Sauter · 12/10/2024

The EU AI Act bans specific AI behaviors starting February 2025.