# Python
Promptfoo is written in TypeScript and runs via Node.js, but it has first-class Python support. You can use Python for any part of your eval pipeline without writing JavaScript.
Use Python for:
- Providers: call custom models, wrap APIs, run Hugging Face/PyTorch
- Assertions: validate outputs with custom scoring logic
- Test generators: load test cases from databases, APIs, or generate them programmatically
- Prompts: build prompts dynamically based on test variables
- Framework integrations: test LangChain, LangGraph, CrewAI, and other agent frameworks
The `file://` prefix tells promptfoo to execute a Python function; promptfoo detects your Python installation automatically. A single config can use Python at every step:
```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
prompts:
  - file://prompts.py:create_prompt # Python generates the prompt
providers:
  - file://provider.py # Python calls the model
tests:
  - file://tests.py:generate_tests # Python generates test cases
defaultTest:
  assert:
    - type: python # Python validates the output
      value: file://assert.py:check
```
For example, a minimal `provider.py`:

```python
from openai import OpenAI

client = OpenAI()

def call_api(prompt, options, context):
    response = client.responses.create(
        model="gpt-5.1-mini",
        input=prompt,
    )
    return {"output": response.output_text}
```

To scaffold a runnable version of this setup:

```sh
npx promptfoo@latest init --example python-provider
```
## Providers

Use `file://` to reference a Python file:

```yaml
providers:
  - file://provider.py # Uses call_api() by default
  - file://provider.py:custom_function # Specify a function name
```
Your function receives three arguments and returns a dict:
```python
def call_api(prompt, options, context):  # or: async def call_api(...)
    # prompt: string or JSON-encoded messages
    # options: {"config": {...}} from the YAML provider block
    # context: {"vars": {...}} from the test case
    return {
        "output": "response text",
        # Optional:
        "tokenUsage": {"total": 100, "prompt": 20, "completion": 80},
        "cost": 0.001,
    }
```
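Anything you set under `config:` for a provider is passed through in `options`. A minimal sketch, where `system_prompt` is a hypothetical key of our own, not a built-in setting:

```yaml
providers:
  - id: file://provider.py
    config:
      system_prompt: 'Answer in one sentence.' # hypothetical key, passed through as-is
```

```python
def call_api(prompt, options, context):
    # Read the hypothetical system_prompt key defined in the YAML above
    system_prompt = options.get("config", {}).get("system_prompt", "")
    # ...pass system_prompt and prompt to your model here...
    return {"output": f"[{system_prompt}] {prompt}"}  # placeholder response
```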
## Assertions

Use `type: python` to run custom validation:

```yaml
assert:
  # Inline expression (returns bool or float 0-1)
  - type: python
    value: "'keyword' in output.lower()"
  # External file
  - type: python
    value: file://assert.py
```
For external files, define a get_assert function:
```python
def get_assert(output, context):
    # Return bool, float (0-1), or a detailed result
    return {
        "pass": True,
        "score": 0.9,
        "reason": "Meets criteria",
    }
```
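As with providers, `context` carries the test case, so `context["vars"]` gives the assertion access to the test's variables. A sketch that scores against a hypothetical `expected` var, named `check` to match the `file://assert.py:check` reference in the overview config:

```python
def check(output, context):
    # Hypothetical: each test case defines an "expected" var to look for
    expected = context["vars"].get("expected", "")
    found = expected.lower() in output.lower()
    return {
        "pass": found,
        "score": 1.0 if found else 0.0,
        "reason": f"{expected!r} {'found' if found else 'missing'} in output",
    }
```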
## Test Generators

Load or generate test cases from Python:

```yaml
tests:
  - file://tests.py:generate_tests
```
```python
def generate_tests(config=None):
    # Load from a database, API, files, etc.
    return [
        {"vars": {"input": "test 1"}, "assert": [{"type": "contains", "value": "expected"}]},
        {"vars": {"input": "test 2"}},
    ]
```
Pass configuration from YAML:

```yaml
tests:
  - path: file://tests.py:generate_tests
    config:
      max_cases: 100
      category: 'safety'
```
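On the Python side, that `config` block arrives as the function's argument. A sketch assuming the keys above, with stub data standing in for a real datasource:

```python
def generate_tests(config=None):
    config = config or {}
    max_cases = config.get("max_cases", 10)
    category = config.get("category", "general")
    # Stub cases; in practice, query your database or API here
    return [
        {"vars": {"input": f"{category} case {i}"}}
        for i in range(max_cases)
    ]
```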
## Prompts

Build prompts dynamically:

```yaml
prompts:
  - file://prompts.py:create_prompt
```
```python
def create_prompt(context):
    # Return a string or chat messages
    return [
        {"role": "system", "content": "You are an expert."},
        {"role": "user", "content": f"Explain {context['vars']['topic']}"},
    ]
```
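Returning a plain string works too; promptfoo uses it as the prompt text. A minimal variant of the function above:

```python
def create_prompt(context):
    # Plain-string variant of the chat-message prompt above
    return f"Explain {context['vars']['topic']} to a beginner."
```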
## Framework Integrations
Test Python agent frameworks by wrapping them as providers:
| Framework | Example | Guide |
|---|---|---|
| LangGraph | `langgraph` | Evaluate LangGraph agents |
| LangChain | `langchain-python` | Test LLM chains |
| CrewAI | `crewai` | Evaluate CrewAI agents |
| OpenAI Agents | `openai-agents` | Multi-turn agent workflows |
| PydanticAI | `pydantic-ai` | Type-safe agents with Pydantic |
| Google ADK | `google-adk-example` | Google Agent Development Kit |
| Strands Agents | `strands-agents` | AWS open-source agent framework |
To get started with any example:

```sh
npx promptfoo@latest init --example langgraph
```
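Whichever framework you choose, the wrapper has the same shape: run the agent inside `call_api` and return its final text as `output`. A sketch with LangChain, assuming the `langchain-openai` package is installed (see the examples above for complete setups):

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini")

def call_api(prompt, options, context):
    # Run the chain/agent and surface its final answer to promptfoo
    result = llm.invoke(prompt)
    return {"output": result.content}
```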
## Jupyter / Colab

Install promptfoo, write a config, and run the eval, each in its own cell (`%%writefile` must be the first line of its cell):

```python
!npm install -g promptfoo
```

```python
%%writefile promptfooconfig.yaml
prompts:
  - "Explain {{topic}}"
providers:
  - openai:gpt-4.1-mini
tests:
  - vars:
      topic: machine learning
```

```python
!npx promptfoo eval
```
## Configuration

### Python Path

Set a custom Python executable:

```sh
export PROMPTFOO_PYTHON=/path/to/python3
```
Or configure per-provider in YAML:
```yaml
providers:
  - id: file://provider.py
    config:
      pythonExecutable: ./venv/bin/python
```
### Module Paths

Add directories to the Python path:

```sh
export PYTHONPATH=/path/to/modules:$PYTHONPATH
```
## Debugging

Enable debug output to see Python execution details:

```sh
LOG_LEVEL=debug npx promptfoo eval
```
## Troubleshooting
See Python provider troubleshooting for common issues like Python not found, module import errors, and timeout problems.