Prompts, tests, and outputs
Configure how promptfoo evaluates your LLM applications.
Detailed Documentation
For comprehensive guides, see the dedicated pages:
- Prompts - Configure what you send to LLMs
- Test Cases - Set up evaluation scenarios
- Output Formats - Save and analyze results
Quick Start
promptfooconfig.yaml:

```yaml
# Define your prompts
prompts:
  - 'Translate to {{language}}: {{text}}'

# Configure test cases
tests:
  - vars:
      language: French
      text: Hello world
    assert:
      - type: contains
        value: Bonjour
```

Then run the evaluation with `promptfoo eval`.
Core Concepts

Prompts
Define what you send to your LLMs - from simple strings to complex conversations.
Common patterns:

Text prompts:

```yaml
prompts:
  - 'Summarize this: {{content}}'
  - file://prompts/customer_service.txt
```
Chat conversations:

```yaml
prompts:
  - file://prompts/chat.json
```
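A chat prompt file contains an OpenAI-style array of messages, with variables templated into the message content. A minimal sketch of what `prompts/chat.json` might look like (the message wording here is illustrative):

```json
[
  { "role": "system", "content": "You are a helpful assistant." },
  { "role": "user", "content": "{{question}}" }
]
```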
Dynamic prompts:

```yaml
prompts:
  - file://generate_prompt.js
  - file://create_prompt.py
```
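A JavaScript prompt file exports a function that builds the prompt at runtime. A minimal sketch of `generate_prompt.js`, assuming the `{ vars, provider }` context that promptfoo passes to prompt functions:

```js
// generate_prompt.js - assembles the prompt from test variables
module.exports = async function ({ vars, provider }) {
  // Return a plain string, or a message array for chat-style models
  return `Translate to ${vars.language}: ${vars.text}`;
};
```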
Test Cases
Configure evaluation scenarios with variables and assertions.
Common patterns:

Inline tests:

```yaml
tests:
  - vars:
      question: "What's 2+2?"
    assert:
      - type: equals
        value: '4'
```
CSV test data:

```yaml
tests: file://test_cases.csv
```
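In the CSV, each column becomes a test variable, and special `__expected` columns become assertions. A sketch of what `test_cases.csv` might contain (the column names besides `__expected` are illustrative):

```csv
language,text,__expected
French,Hello world,contains:Bonjour
Spanish,Good morning,contains:Buenos
```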
HuggingFace datasets:

```yaml
tests: huggingface://datasets/rajpurkar/squad
```
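Query parameters let you narrow the import, for example to a particular split (a sketch, assuming the loader accepts a `split` parameter):

```yaml
tests: huggingface://datasets/rajpurkar/squad?split=validation
```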
Dynamic generation:

```yaml
tests: file://generate_tests.js
```
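A test generator exports a function that returns an array of test cases. A minimal sketch of `generate_tests.js` under that assumption:

```js
// generate_tests.js - produces test cases programmatically
module.exports = async function () {
  const languages = ['French', 'German', 'Spanish'];
  return languages.map((language) => ({
    vars: { language, text: 'Hello world' },
    assert: [{ type: 'llm-rubric', value: `Response is written in ${language}` }],
  }));
};
```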
Learn more about test cases →
Output Formats
Save and analyze your evaluation results.
Available formats:

```bash
# Visual report
promptfoo eval --output results.html

# Data analysis
promptfoo eval --output results.json

# Spreadsheet
promptfoo eval --output results.csv
```
Complete Example
Here's a real-world example that combines multiple features:
promptfooconfig.yaml:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: Customer service chatbot evaluation

prompts:
  # Simple text prompt
  - 'You are a helpful customer service agent. {{query}}'
  # Chat conversation format
  - file://prompts/chat_conversation.json
  # Dynamic prompt with logic
  - file://prompts/generate_prompt.js

providers:
  - openai:gpt-4.1-mini
  - anthropic:claude-3-haiku

tests:
  # Inline test cases
  - vars:
      query: 'I need to return a product'
    assert:
      - type: contains
        value: 'return policy'
      - type: llm-rubric
        value: 'Response is helpful and professional'
  # Load more tests from CSV
  - file://test_scenarios.csv

# Save results
outputPath: evaluations/customer_service_results.html
```
Quick Reference

Supported File Formats
| Format | Prompts | Tests | Use Case |
|---|---|---|---|
| .txt | ✅ | ❌ | Simple text prompts |
| .json | ✅ | ✅ | Chat conversations, structured data |
| .yaml | ✅ | ✅ | Complex configurations |
| .csv | ✅ | ✅ | Bulk data, multiple variants |
| .js/.ts | ✅ | ✅ | Dynamic generation with logic |
| .py | ✅ | ✅ | Python-based generation |
| .md | ✅ | ❌ | Markdown-formatted prompts |
| .j2 | ✅ | ❌ | Jinja2 templates |
| HuggingFace datasets | ❌ | ✅ | Import from existing datasets |
Variable Syntax
Variables use Nunjucks templating:
```yaml
# Basic substitution
prompt: "Hello {{name}}"

# Filters
prompt: "URGENT: {{message | upper}}"

# Conditionals
prompt: "{% if premium %}Premium support: {% endif %}{{query}}"
```
File References
All file paths are relative to the config file:
```yaml
# Single file
prompts:
  - file://prompts/main.txt

# Multiple files with glob
tests:
  - file://tests/*.yaml

# Specific function
prompts:
  - file://generate.js:createPrompt
```
Wildcards like `path/to/prompts/**/*.py:func_name` are also supported.
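When you reference a named export like `generate.js:createPrompt`, that specific function is the one invoked. A minimal sketch, assuming the same prompt-function context as above:

```js
// generate.js - a named prompt function, referenced as file://generate.js:createPrompt
module.exports.createPrompt = async function ({ vars }) {
  return `Answer concisely: ${vars.question}`;
};
```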
Next Steps
- Prompts - Deep dive into prompt configuration
- Test Cases - Learn about test scenarios and assertions
- HuggingFace Datasets - Import test cases from existing datasets
- Output Formats - Understand evaluation results
- Expected Outputs - Configure assertions
- Configuration Reference - All configuration options