OpenAI Agents
Test multi-turn agentic workflows built with the @openai/agents SDK. This provider lets you evaluate agents that use tools, hand off between specialists, and handle complex multi-step tasks.
Prerequisites
- OpenAI Agents SDK installed: npm install @openai/agents
- OpenAI API key: Set the OPENAI_API_KEY environment variable
- Agent definition (inline or in a TypeScript/JavaScript file)
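As a rough illustration of the API key prerequisite above, the following sketch checks for a missing or blank variable before an evaluation is started. The helper `missingPrerequisites` and the `Env` type are hypothetical names introduced here; they are not part of promptfoo or the @openai/agents SDK.

```typescript
// Hypothetical pre-flight check; not part of promptfoo or @openai/agents.
type Env = Record<string, string | undefined>;

// Returns the names of required variables that are missing or blank.
function missingPrerequisites(
  env: Env,
  required: string[] = ['OPENAI_API_KEY'],
): string[] {
  return required.filter((name) => (env[name] ?? '').trim() === '');
}

// Example with an explicit object; in Node you would pass process.env instead.
const missing = missingPrerequisites({ OPENAI_API_KEY: '' });
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```

Passing the environment in as an argument keeps the check easy to test without touching the real process environment.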
Basic Configuration
providers:
  - id: openai:agents:my-agent
    config:
      agent:
        name: Customer Support Agent
        model: gpt-5-mini
        instructions: You are a helpful customer support agent.
Full Configuration Options
All available configuration options:
providers:
  - id: openai:agents:support-agent
    config:
      # Agent Definition (required)
      # Inline agent definition
      agent:
        name: Support Agent
        model: gpt-5-mini
        instructions: |
          You are a customer support agent. Help users with their questions.
          Use tools when needed to look up information.
        temperature: 0.7
      # Or load from file (use instead of the inline definition)
      # agent: file://./agents/support-agent.ts

      # Tools Configuration
      # Inline tools array
      tools:
        - name: lookup_order
          description: Look up order status by order ID
          parameters:
            type: object
            properties:
              order_id:
                type: string
                description: The order ID to look up
            required: [order_id]
          execute: |
            async function(args) {
              return { status: 'shipped', tracking: 'ABC123' };
            }
      # Or load from file (use instead of the inline array)
      # tools: file://./tools/support-tools.ts

      # Handoffs Configuration
      # Allows the agent to transfer to specialized agents
      handoffs:
        - agent:
            name: Billing Agent
            model: gpt-5-mini
            instructions: Handle billing and payment questions.
          description: Transfer to billing specialist for payment issues
      # Or load from file (use instead of the inline array)
      # handoffs: file://./handoffs/support-handoffs.ts

      # Execution Options
      maxTurns: 10 # Maximum conversation turns (default: 10)
      model: gpt-5 # Override the model specified in the agent definition

      # Model Settings
      modelSettings:
        temperature: 0.7
        topP: 0.9
        maxTokens: 2000

      # Tracing (OpenTelemetry)
      tracing: true # Enable OTLP tracing
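The file-based tools option could look roughly like the sketch below, which mirrors the inline lookup_order tool from the configuration above. This is a minimal sketch under stated assumptions: the `ToolDefinition` and `OrderStatus` types are defined locally for illustration and are not SDK or promptfoo types, and the exact export shape the loader expects is not confirmed by this page.

```typescript
// Sketch of ./tools/support-tools.ts. ToolDefinition is a local, hypothetical
// type that mirrors the inline YAML shape; it is not an SDK or promptfoo type.
interface ToolDefinition<Args, Result> {
  name: string;
  description: string;
  parameters: {
    type: 'object';
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
  execute: (args: Args) => Promise<Result>;
}

interface OrderStatus {
  status: string;
  tracking: string;
}

const lookupOrder: ToolDefinition<{ order_id: string }, OrderStatus> = {
  name: 'lookup_order',
  description: 'Look up order status by order ID',
  parameters: {
    type: 'object',
    properties: {
      order_id: { type: 'string', description: 'The order ID to look up' },
    },
    required: ['order_id'],
  },
  // Stubbed result, same as the inline example; a real tool would query an order system.
  execute: async (_args) => ({ status: 'shipped', tracking: 'ABC123' }),
};

// Assumption: the file exposes an array of tool definitions.
// In the actual file this would likely be exported, e.g. `export default supportTools;`
const supportTools = [lookupOrder];
```

Keeping tools in a typed file lets the compiler check that each `execute` function matches its declared arguments, which the inline string form cannot do.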
Features
Multi-Turn Conversations
Agents handle multi-turn workflows automatically:
tests:
  - description: Multi-step research task
    vars:
      query: Research the latest AI models and create a comparison table
    assert:
      - type: contains
        value: comparison
      - type: javascript
        value: output.includes('GPT') && output.includes('Claude')
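To try the two assertions above against sample text before running a full evaluation, the checks can be written as standalone functions. This is only an approximation: the function names are hypothetical, `output` stands in for the agent's final text response, and the contains check is assumed here to be a case-sensitive substring match.

```typescript
// Standalone approximations of the two assertions, for local experimentation.
// `output` stands for the agent's final text response.

// Approximates `type: contains` with `value: comparison`
// (assumed to be a case-sensitive substring match).
function containsComparison(output: string): boolean {
  return output.includes('comparison');
}

// Same expression as the `type: javascript` assertion.
function mentionsBothModels(output: string): boolean {
  return output.includes('GPT') && output.includes('Claude');
}

const sample = 'Here is a comparison table of GPT and Claude models.';
console.log(containsComparison(sample) && mentionsBothModels(sample)); // true
```

Because both checks are plain substring matches, a response that writes "Comparison" with a capital letter, or names only one model family, would fail.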