7 posts tagged with "agents"

How to replicate the Claude Code attack with Promptfoo

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

A recent cyber espionage campaign revealed how state actors weaponized Anthropic's Claude Code - not through traditional hacking, but by convincing the AI itself to carry out malicious operations.

In this post, we reproduce the attack on Claude Code and jailbreak it to carry out nefarious deeds. We'll also show how to configure the same attack on any other agent.
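For a concrete starting point, below is a minimal sketch of how a red team run like this can be configured in Promptfoo against an arbitrary agent exposed over HTTP. The endpoint URL, request shape, purpose text, and the specific plugin and strategy names are illustrative assumptions, not the post's exact configuration; check them against the Promptfoo red team documentation for your version.

```yaml
# promptfooconfig.yaml -- illustrative red team setup for an arbitrary agent.
# The URL, request body, purpose text, and plugin/strategy IDs below are
# placeholders to adapt to your own target.
targets:
  - id: https
    config:
      url: https://internal.example.com/agent/chat # hypothetical agent endpoint
      method: POST
      headers:
        Content-Type: application/json
      body:
        message: '{{prompt}}'

redteam:
  purpose: 'An autonomous coding assistant with shell and network access'
  plugins:
    - harmful:cybercrime # probes willingness to assist with intrusion tasks
  strategies:
    - jailbreak # iterative single-turn jailbreak attempts
    - jailbreak:composite # chains multiple jailbreak techniques together
```

With a config along these lines, the scan is typically generated and executed with `npx promptfoo@latest redteam run`.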

Testing AI’s “Lethal Trifecta” with Promptfoo

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

As AI agents become more capable, risk increases commensurately. Simon Willison, an AI security researcher, warns of a “lethal trifecta” of capabilities: access to private data, exposure to untrusted content, and the ability to communicate externally. When combined, these open AI systems to severe exploits.

If you're building or using AI agents that handle sensitive data, you need to understand this trifecta and test your models for these vulnerabilities.

In this post, we'll explain what the lethal trifecta is and show practical steps to use Promptfoo for detecting these security holes.

Figure: Lethal Trifecta Venn diagram
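As a preview of the kind of setup the post walks through, here is a minimal sketch of a Promptfoo red team config aimed at an agent that has all three legs of the trifecta: it reads private data, ingests untrusted content, and can communicate externally. The purpose string is made up, and the plugin and strategy IDs are assumptions to confirm against the Promptfoo plugin reference.

```yaml
# Illustrative promptfooconfig.yaml for a trifecta-exposed agent. Plugin and
# strategy IDs are assumptions; verify them against the Promptfoo docs.
redteam:
  purpose: 'Email assistant that reads a user inbox and can send replies'
  numTests: 10
  plugins:
    - pii # attempts to extract private or personal data
    - excessive-agency # checks for actions beyond the user's stated intent
    - ssrf # probes outbound requests to attacker-chosen hosts
  strategies:
    - prompt-injection # wraps probes in instructions injected via untrusted content
```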

New Red Teaming Plugins for LLM Agents: Enhancing API Security

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

We're excited to announce the release of three new red teaming plugins designed specifically for Large Language Model (LLM) agents with access to internal APIs. These plugins address critical security vulnerabilities outlined in the OWASP API Security Top 10:

  1. Broken Object Level Authorization (BOLA)
  2. Broken Function Level Authorization (BFLA)
  3. Server-Side Request Forgery (SSRF)
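As a rough illustration, the three plugins can be enabled in a Promptfoo config along the following lines. The purpose string and per-plugin test counts are placeholder values, and the exact plugin options should be checked against the plugin documentation.

```yaml
# Minimal sketch: enabling the three new plugins in promptfooconfig.yaml.
# The purpose text and numTests values are illustrative only.
redteam:
  purpose: 'Customer support agent with access to internal order and billing APIs'
  plugins:
    - id: bola # Broken Object Level Authorization: reach other users' objects
      numTests: 5
    - id: bfla # Broken Function Level Authorization: call admin-only functions
      numTests: 5
    - id: ssrf # Server-Side Request Forgery: make the agent fetch attacker-chosen URLs
      numTests: 5
```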