Ship LLM apps with confidence

Open-source LLM testing used by 20,000+ developers

Build reliable prompts, RAG pipelines, and agents

Start testing the performance of your models, prompts, and tools in minutes:

npx promptfoo@latest init

Promptfoo runs locally and integrates directly with your app, with no SDKs, cloud dependencies, or logins required.

» Get Started

Trusted by developers at

Shopify, Discord, Anthropic, Microsoft, Salesforce, Carvana

Comprehensive security coverage

Custom probes for your application that identify failures you actually care about, not just generic jailbreaks and prompt injections.

Learn More

Built for developers

Move quickly with a command-line interface, live reloads, and caching. No SDKs, cloud dependencies, or logins.

Get Started

Battle-tested, 100% open-source

Used by teams serving millions of users and supported by an active open-source community.

View on GitHub

Easy abstractions for complex LLM testing

Simple declarative config

# Compare prompts...
prompts:
  - "Summarize this in {{language}}: {{document}}"
  - "Summarize this in {{language}}, concisely and professionally: {{document}}"

# And models...
providers:
  - openai:gpt-4o
  - anthropic:claude-3.5-sonnet

# ... using these tests
tests:
  - vars:
      language: French
      document: "file://docs/*.txt"
    assert:
      - type: contains
        value: "foo bar"
      - type: llm-rubric
        value: "does not apologize"
      - type: cost
        threshold: 0.01
      - type: latency
        threshold: 3000
  - # ...
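
To make the config above concrete, here is a rough Python sketch of the idea behind it: every prompt is paired with every provider and every test case, and each output is then checked against the listed assertions. This is an illustration only, not promptfoo's actual implementation. The helpers `render`, `expand`, and `check`, and the `Result` type, are hypothetical names, and the units (cost in dollars, latency in milliseconds) are assumptions. The `llm-rubric` assertion is left out because it needs a grading model call.

```python
import re
from dataclasses import dataclass
from itertools import product

PROMPTS = [
    "Summarize this in {{language}}: {{document}}",
    "Summarize this in {{language}}, concisely and professionally: {{document}}",
]
PROVIDERS = ["openai:gpt-4o", "anthropic:claude-3.5-sonnet"]
# One inline document stands in for the file glob used in the config.
TESTS = [{"vars": {"language": "French", "document": "Le chat dort."}}]


def render(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value (hypothetical helper)."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: variables[match.group(1)],
        template,
    )


def expand(prompts, providers, tests):
    """Yield one (provider, rendered prompt) pair per cell of the comparison grid."""
    for prompt, provider, test in product(prompts, providers, tests):
        yield provider, render(prompt, test["vars"])


@dataclass
class Result:
    output: str
    cost: float        # assumed to be dollars
    latency_ms: float  # assumed to be milliseconds


def check(assertion: dict, result: Result) -> bool:
    """Evaluate one deterministic assertion against a model result."""
    kind = assertion["type"]
    if kind == "contains":
        return assertion["value"] in result.output
    if kind == "cost":
        return result.cost <= assertion["threshold"]
    if kind == "latency":
        return result.latency_ms <= assertion["threshold"]
    raise ValueError(f"unsupported assertion type: {kind}")
```

With two prompts, two providers, and one test case, `list(expand(PROMPTS, PROVIDERS, TESTS))` produces four rendered prompts, which is the grid a side-by-side comparison would display. Each cell passes only if every one of its assertions returns true.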

Detailed, actionable results

Detect & fix critical failures

Sample vulnerability report

Make your LLM app reliable & secure