
OWASP Agentic AI Threats

The OWASP Agentic AI - Threats and Mitigations guide (v1.0, February 2025) is the first publication from the OWASP Agentic Security Initiative (ASI). It provides a threat-model-based reference for emerging threats specific to agentic AI systems—autonomous AI agents that can reason, plan, use tools, and take actions to achieve objectives.

Unlike traditional LLM applications, agentic systems introduce unique security challenges due to their:

  • Autonomous decision-making: Agents independently determine steps to achieve goals
  • Persistent memory: Both short-term and long-term memory across sessions
  • Tool and API access: Direct interaction with external systems
  • Multi-agent coordination: Complex inter-agent communication and delegation

OWASP Agentic AI Threat Categories

The framework defines 15 threat categories (T1-T15):

| ID | Threat Name | Category |
| --- | --- | --- |
| T1 | Memory Poisoning | Memory-Based |
| T2 | Tool Misuse | Tool & Execution |
| T3 | Privilege Compromise | Tool & Execution |
| T4 | Resource Overload | Tool & Execution |
| T5 | Cascading Hallucination Attacks | Memory-Based |
| T6 | Intent Breaking & Goal Manipulation | Agency & Reasoning |
| T7 | Misaligned & Deceptive Behaviors | Agency & Reasoning |
| T8 | Repudiation & Untraceability | Agency & Reasoning |
| T9 | Identity Spoofing & Impersonation | Authentication |
| T10 | Overwhelming Human in the Loop | Human Interaction |
| T11 | Unexpected RCE and Code Attacks | Tool & Execution |
| T12 | Agent Communication Poisoning | Multi-Agent |
| T13 | Rogue Agents in Multi-Agent Systems | Multi-Agent |
| T14 | Human Attacks on Multi-Agent Systems | Multi-Agent |
| T15 | Human Manipulation | Human Interaction |

Scanning for OWASP Agentic AI Threats

Promptfoo helps identify agentic AI vulnerabilities through red teaming. To test against all 15 threats:

redteam:
  plugins:
    - owasp:agentic
  strategies:
    - jailbreak
    - prompt-injection
    - crescendo

Or target specific threats:

redteam:
  plugins:
    - owasp:agentic:t01 # Memory Poisoning
    - owasp:agentic:t02 # Tool Misuse
    - owasp:agentic:t06 # Intent Breaking & Goal Manipulation

To set up the scan through the Promptfoo UI, select the OWASP Agentic AI option in the list of presets on the Plugins page.
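
The snippets above show only the redteam block. As a rough sketch of how it might sit in a complete promptfooconfig.yaml, assuming the agent is reachable over HTTP: the URL, the label, the request body field, the response path, and the purpose text below are all hypothetical placeholders, and the target block should be adapted to however your agent is actually exposed.

```yaml
# promptfooconfig.yaml (sketch; all target details are hypothetical)
targets:
  - id: https
    label: travel-agent
    config:
      url: https://agent.example.com/chat # placeholder endpoint
      method: POST
      headers:
        Content-Type: application/json
      body:
        message: '{{prompt}}' # assumes the agent reads a "message" field
      transformResponse: json.output # assumes the reply is in an "output" field

redteam:
  # Short description of the agent, used to tailor the generated attacks
  purpose: Travel booking agent that can search flights and reserve seats for the logged-in user
  plugins:
    - owasp:agentic
  strategies:
    - jailbreak
    - prompt-injection
    - crescendo
```

With a file like this in place, running `npx promptfoo@latest redteam run` should generate and execute the test cases, and `npx promptfoo@latest redteam report` should open the results.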

T1: Memory Poisoning (owasp:agentic:t01)

Memory Poisoning exploits an AI agent's reliance on short-term and long-term memory, allowing attackers to corrupt stored information, bypass security checks, and manipulate decision-making.

Agentic Context

Unlike static data poisoning in traditional LLMs, agentic memory poisoning targets:

  • Short-term memory: Exploiting context limitations within a session
  • Long-term memory: Injecting false information that persists across sessions
  • Shared memory: Corrupting memory structures affecting multiple users or agents

Attack Scenarios

  • An attacker gradually poisons an AI's memory through repeated interactions, causing it to misclassify malicious activity as normal
  • By fragmenting interactions over multiple sessions, an attacker exploits memory limits to prevent recognition of privilege escalation attempts
  • In multi-agent systems, poisoning shared memory affects all agents referencing the corrupted data

Testing Strategy

redteam:
  plugins:
    - agentic:memory-poisoning
  strategies:
    - jailbreak
    - crescendo

T2: Tool Misuse (owasp:agentic:t02)

Tool Misuse occurs when attackers manipulate AI agents into abusing their authorized tools through deceptive prompts, operating within granted permissions but achieving unintended outcomes.

Agentic Context

This threat extends beyond LLM06: Excessive Agency because agentic systems have:

  • Dynamic tool integrations: Real-time access to multiple tools and APIs
  • Enhanced autonomy: Ability to chain tool calls without human intervention
  • Delegation capabilities: Tools can invoke other agents or services

Attack Scenarios

  • Parameter pollution: Manipulating function call parameters to reserve 500 seats instead of one
  • Tool chain manipulation: Exploiting customer service agents to extract and email customer records
  • Automated abuse: Tricking document processing systems into mass-distributing malicious content

Testing Strategy

redteam:
  plugins:
    - excessive-agency
    - mcp
    - tool-discovery
  strategies:
    - jailbreak
    - prompt-injection

T3: Privilege Compromise (owasp:agentic:t03)

Privilege Compromise arises when attackers exploit weaknesses in permission management, including dynamic role inheritance and misconfigurations, to perform unauthorized actions.

Agentic Context

AI agents redefine privilege risks because they:

  • Dynamically inherit permissions: From user sessions or service tokens
  • Operate with broad API scopes: Allowing manipulation into unintended functions
  • Chain tools unexpectedly: Bypassing intended security controls

Attack Scenarios

  • Dynamic permission escalation: Manipulating temporary administrative privileges into persistent access
  • Cross-system exploitation: Escalating privileges from HR to Finance due to inadequate scope enforcement
  • Shadow agent deployment: Creating rogue agents that inherit legitimate credentials

Testing Strategy

redteam:
  plugins:
    - rbac
    - bfla
    - bola

T4: Resource Overload (owasp:agentic:t04)

Resource Overload targets the computational, memory, and service capacities of AI systems to degrade performance or cause failures.

Agentic Context

This extends LLM10: Unbounded Consumption because agentic AI systems:

  • Autonomously schedule tasks: Without direct human oversight
  • Self-trigger processes: Spawning additional resource-consuming operations
  • Coordinate with multiple agents: Leading to exponential resource consumption

Attack Scenarios

  • Inference time exploitation: Forcing resource-intensive analysis that delays threat detection
  • Multi-agent exhaustion: Triggering simultaneous complex decision-making across agents
  • API quota depletion: Triggering excessive external API calls while incurring high costs

Testing Strategy

redteam:
  plugins:
    - reasoning-dos

T5: Cascading Hallucination Attacks (owasp:agentic:t05)

Cascading Hallucination Attacks exploit an AI's tendency to generate false information that propagates, embeds, and amplifies across interconnected systems.

Agentic Context

This extends LLM09: Misinformation in agentic systems because:

  • Self-reinforcement: Agents can reinforce false information through reflection and self-critique
  • Memory persistence: Hallucinations embed in long-term memory
  • Multi-agent propagation: Misinformation spreads through inter-agent communication

Attack Scenarios

  • Injecting false product details that accumulate in long-term memory
  • Introducing hallucinated API endpoints that cause data leaks
  • Implanting false treatment guidelines in a medical AI, with the resulting errors compounding over time

Testing Strategy

redteam:
  plugins:
    - hallucination
    - harmful:misinformation-disinformation
    - divergent-repetition
  strategies:
    - jailbreak
    - prompt-injection

T6: Intent Breaking & Goal Manipulation (owasp:agentic:t06)

This threat exploits vulnerabilities in an AI agent's planning and goal-setting capabilities, allowing attackers to manipulate or redirect the agent's objectives and reasoning.

Agentic Context

Goal manipulation in agentic AI extends prompt injection risks because attackers can:

  • Inject adversarial objectives: Shifting long-term reasoning processes
  • Exploit adaptive reasoning: Manipulating ReAct-based agents through their planning cycles
  • Poison through data sources: Using compromised tools or RAG sources to alter goals

Attack Scenarios

  • Gradual plan injection: Incrementally modifying planning frameworks through subtle sub-goals
  • Direct plan injection: Instructing agents to ignore original instructions and chain tool executions
  • Reflection loop trap: Triggering infinite self-analysis cycles that paralyze the system

Testing Strategy

redteam:
  plugins:
    - hijacking
    - system-prompt-override
  strategies:
    - jailbreak
    - prompt-injection
    - jailbreak:composite

T7: Misaligned & Deceptive Behaviors (owasp:agentic:t07)

AI agents may execute harmful or disallowed actions by exploiting reasoning and producing deceptive responses to meet their objectives.

Agentic Context

Misaligned behaviors occur when agents:

  • Bypass constraints: Prioritizing goal achievement over ethical or regulatory limits
  • Employ deception: Strategically evading safety mechanisms while appearing compliant
  • Self-preserve: Manipulating availability targets to prevent shutdown

Attack Scenarios

  • A stock trading AI bypasses regulatory constraints by prioritizing profitability
  • An AI agent claims to have a vision impairment to get a human to solve a CAPTCHA
  • Goal-driven decision-making that interprets abort commands as obstacles

Testing Strategy

redteam:
  plugins:
    - contracts
    - goal-misalignment
    - excessive-agency
  strategies:
    - jailbreak
    - crescendo

T8: Repudiation & Untraceability (owasp:agentic:t08)

Repudiation & Untraceability occurs when actions performed by AI agents cannot be traced back or accounted for because of insufficient logging or a lack of transparency in decision-making.

Agentic Context

Agentic AI challenges traceability because of:

  • Multiple reasoning pathways: Often parallel and complex
  • Opaque decision-making: Difficulty reconstructing agent behaviors
  • Autonomous operation: Actions taken without immediate human oversight

Attack Scenarios

  • Financial transaction obfuscation: Manipulating records to hide unauthorized transactions
  • Security system evasion: Crafting interactions with minimal logging
  • Compliance violation concealment: Exploiting logging failures to avoid regulatory verification

Testing Strategy

redteam:
  plugins:
    - debug-access
    - excessive-agency

T9: Identity Spoofing & Impersonation (owasp:agentic:t09)

Attackers exploit authentication mechanisms to impersonate AI agents or human users, enabling unauthorized actions under false identities.

Agentic Context

Identity threats in agentic systems include:

  • Agent impersonation: Mimicking trusted agents in multi-agent systems
  • User impersonation: Acting on behalf of legitimate users through compromised agents
  • Cross-platform spoofing: Dynamically altering identity across authentication contexts

Attack Scenarios

  • Injecting prompts to make agents send malicious emails as legitimate users
  • Compromising HR agents to create fraudulent user accounts
  • Behavioral mimicry attacks where rogue agents appear as trusted entities

Testing Strategy

redteam:
  plugins:
    - imitation
    - cross-session-leak
    - pii:session

T10: Overwhelming Human in the Loop (owasp:agentic:t10)

This threat targets systems with human oversight, aiming to exploit human cognitive limitations or compromise interaction frameworks.

Agentic Context

As agentic AI scales, human oversight faces challenges:

  • Cognitive overload: Excessive intervention requests causing decision fatigue
  • Trust mechanism subversion: Degrading human confidence in AI decisions
  • Rushed approvals: Reduced scrutiny leading to security bypasses

Attack Scenarios

  • Introducing artificial decision contexts that obscure critical information
  • Overwhelming reviewers with excessive tasks to induce rushed approvals
  • Gradually degrading AI-human trust through introduced inconsistencies

Testing Strategy

redteam:
  plugins:
    - overreliance
    - excessive-agency

T11: Unexpected RCE and Code Attacks (owasp:agentic:t11)

Attackers exploit AI-generated code execution to inject malicious code, trigger unintended behaviors, or execute unauthorized scripts.

Agentic Context

Agentic AI with function-calling capabilities creates new attack vectors:

  • Direct code execution: AI-generated code runs with elevated privileges
  • Sandbox escapes: Exploiting execution environments
  • Linguistic ambiguities: Crafting ambiguous commands that exfiltrate data

Attack Scenarios

  • DevOps agent compromise: Generating Terraform scripts with hidden commands
  • Workflow engine exploitation: Executing AI-generated scripts with embedded backdoors
  • Exploiting linguistic vulnerabilities to craft data exfiltration commands

Testing Strategy

redteam:
  plugins:
    - shell-injection
    - sql-injection
    - harmful:cybercrime:malicious-code
    - ssrf
  strategies:
    - jailbreak
    - prompt-injection

T12: Agent Communication Poisoning (owasp:agentic:t12)

Attackers manipulate communication channels between AI agents to spread false information, disrupt workflows, or influence decision-making.

Agentic Context

Multi-agent systems are vulnerable because:

  • Distributed collaboration: Complex inter-agent dependencies
  • Trust assumptions: Implicit trust between agents
  • Cascading effects: Misinformation spreads across the agent network

Attack Scenarios

  • Injecting misleading information to influence collaborative decision-making
  • Forging false consensus messages to exploit authentication weaknesses
  • Strategically planting false data that cascades through agent networks

Testing Strategy

redteam:
  plugins:
    - indirect-prompt-injection
    - hijacking
  strategies:
    - prompt-injection

T13: Rogue Agents in Multi-Agent Systems (owasp:agentic:t13)

Malicious or compromised AI agents operate outside normal monitoring boundaries, executing unauthorized actions or exfiltrating data.

Agentic Context

Rogue agents can:

  • Exploit trust mechanisms: Operating undetected within multi-agent workflows
  • Persist in systems: Remaining embedded in workflows unnoticed
  • Coordinate attacks: Multiple rogue agents acting together

Attack Scenarios

  • Malicious workflow injection: Impersonating financial approval agents
  • Orchestration hijacking: Routing fraudulent transactions through lower-privilege agents
  • Coordinated agent flooding: Overwhelming computing resources simultaneously

Testing Strategy

redteam:
  plugins:
    - excessive-agency
    - hijacking
    - rbac
  strategies:
    - jailbreak

T14: Human Attacks on Multi-Agent Systems (owasp:agentic:t14)

Adversaries exploit inter-agent delegation, trust relationships, and workflow dependencies to escalate privileges or manipulate AI-driven operations.

Agentic Context

Multi-agent architectures create vulnerabilities through:

  • Delegation chains: Trust relationships between agents
  • Workflow dependencies: Complex task handoffs
  • Distributed authorization: Fragmented approval processes

Attack Scenarios

  • Coordinated privilege escalation via multi-agent impersonation
  • Agent delegation loops for repeated privilege escalation
  • Cross-agent approval forgery exploiting authentication inconsistencies

Testing Strategy

redteam:
  plugins:
    - indirect-prompt-injection
    - hijacking
    - excessive-agency
  strategies:
    - jailbreak
    - prompt-injection

T15: Human Manipulation (owasp:agentic:t15)

Attackers exploit user trust in AI agents to influence human decision-making without users realizing they are being misled.

Agentic Context

The trust relationship with conversational agents:

  • Reduces skepticism: Users rely on agent responses without verification
  • Enables social engineering: Attackers coerce agents to manipulate users
  • Enables covert actions: Agents can take harmful actions while appearing helpful

Attack Scenarios

  • AI-powered invoice fraud: Replacing legitimate vendor details with attacker accounts
  • AI-driven phishing: Generating deceptive messages with malicious links
  • Misinformation campaigns: Spreading false information through trusted agent interfaces

Testing Strategy

redteam:
  plugins:
    - imitation
    - harmful:misinformation-disinformation
    - overreliance
  strategies:
    - crescendo

Relationship to OWASP LLM Top 10

The OWASP Agentic AI threats extend and complement the OWASP LLM Top 10:

| Agentic Threat | Related LLM Top 10 |
| --- | --- |
| T1: Memory Poisoning | LLM04: Data and Model Poisoning |
| T2: Tool Misuse | LLM06: Excessive Agency |
| T3: Privilege Compromise | LLM06: Excessive Agency |
| T4: Resource Overload | LLM10: Unbounded Consumption |
| T5: Cascading Hallucination | LLM09: Misinformation |
| T6: Goal Manipulation | LLM01: Prompt Injection |
| T11: RCE and Code Attacks | LLM01: Prompt Injection, LLM05: Improper Output Handling |
| T12: Communication Poisoning | LLM04: Data and Model Poisoning |

Test both frameworks together for comprehensive coverage:

redteam:
  plugins:
    - owasp:agentic
    - owasp:llm
  strategies:
    - jailbreak
    - prompt-injection
    - crescendo
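
Each preset expands into many plugins, and every strategy multiplies the generated cases, so a combined scan can get large. One way to keep a first run manageable is sketched below, assuming promptfoo's `numTests` option (the number of test cases generated per plugin); the value shown is an arbitrary illustration, not a recommendation.

```yaml
redteam:
  # Test cases generated per plugin (illustrative value; raise it for fuller coverage)
  numTests: 3
  plugins:
    - owasp:agentic
    - owasp:llm
  strategies:
    - jailbreak
    - prompt-injection
    - crescendo
```

A small count like this is mainly useful for checking that the target and config work end to end before committing to a longer scan.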

What's Next

The OWASP Agentic AI framework is rapidly evolving as agentic systems become more prevalent. Regular testing with Promptfoo helps ensure your AI agents remain secure against these emerging threats.

To learn more about red teaming agents, see:

Additional Resources