Output Formats
Save and analyze your evaluation results in various formats.
Quick Start
# Interactive web viewer (default)
promptfoo eval
# Save as HTML report
promptfoo eval --output results.html
# Export as JSON for further processing
promptfoo eval --output results.json
# Create CSV for spreadsheet analysis
promptfoo eval --output results.csv
# Generate XML for integration with other tools
promptfoo eval --output results.xml
Available Formats
HTML Report
Generate a visual, shareable report:
promptfoo eval --output report.html
Features:
- Interactive table with sorting and filtering
- Side-by-side output comparison
- Pass/fail statistics
- Shareable standalone file
Use when: Presenting results to stakeholders or reviewing outputs visually.
JSON Output
Export complete evaluation data:
promptfoo eval --output results.json
Structure:
{
  "version": 3,
  "timestamp": "2024-01-15T10:30:00Z",
  "results": {
    "prompts": [...],
    "providers": [...],
    "outputs": [...],
    "stats": {...}
  }
}
Use when: Integrating with other tools or performing custom analysis.
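For a quick look at the summary without writing a full script, a few lines of Python will do. A minimal sketch, assuming the structure shown above:
import json

with open('results.json', encoding='utf-8') as f:
    data = json.load(f)

# Summary statistics, per the structure shown above
print(data['results']['stats'])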
CSV Export
Create spreadsheet-compatible data:
promptfoo eval --output results.csv
Columns include:
- Test variables
- Prompt used
- Model outputs
- Pass/fail status
- Latency
- Token usage
Use when: Analyzing results in Excel, Google Sheets, or data science tools (see CSV Analysis with Pandas below).
YAML Format
Human-readable structured data:
promptfoo eval --output results.yaml
Use when: Reviewing results in a text editor or version control.
JSONL Format
Each line contains one JSON result:
promptfoo eval --output results.jsonl
{"testIdx": 0, "promptIdx": 0, "output": "Response 1", "success": true, "score": 1.0}
{"testIdx": 1, "promptIdx": 0, "output": "Response 2", "success": true, "score": 0.9}
Use when: Working with very large evaluations or when a single JSON export fails with memory errors.
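Because each record sits on its own line, you can stream the file one result at a time instead of parsing it whole. A minimal sketch, assuming the fields shown in the example lines above:
import json

# Memory use stays flat regardless of file size
with open('results.jsonl', encoding='utf-8') as f:
    for line in f:
        record = json.loads(line)
        if not record['success']:
            print(f"test {record['testIdx']} failed (score {record['score']})")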
XML Format
Structured data for enterprise integrations:
promptfoo eval --output results.xml
Structure:
<promptfoo>
  <evalId>abc-123-def</evalId>
  <results>
    <version>3</version>
    <timestamp>2024-01-15T10:30:00Z</timestamp>
    <prompts>...</prompts>
    <providers>...</providers>
    <outputs>...</outputs>
    <stats>...</stats>
  </results>
  <config>...</config>
  <shareableUrl>...</shareableUrl>
</promptfoo>
Use when: Integrating with enterprise systems or other XML-based workflows.
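The XML can be read with Python's standard library. A minimal sketch against the structure shown above:
import xml.etree.ElementTree as ET

root = ET.parse('results.xml').getroot()  # the <promptfoo> element

# Top-level fields, per the structure shown above
print(root.findtext('evalId'))
print(root.findtext('results/timestamp'))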
Configuration Options
Setting Output Path in Config
# Specify default output file
outputPath: evaluations/latest_results.html
prompts:
  - '...'
tests:
  - '...'
Multiple Output Formats
Generate multiple formats simultaneously:
# Command line
promptfoo eval --output results.html --output results.json
# Or run once per format (slower: each run re-executes the evaluation,
# and outputs may differ between runs unless caching is enabled)
promptfoo eval --output results.json && \
promptfoo eval --output results.csv
Output Contents
Standard Fields
All formats include:
| Field | Description |
|---|---|
| timestamp | When the evaluation ran |
| prompts | Prompts used in the evaluation |
| providers | LLM providers tested |
| tests | Test cases with variables |
| outputs | Raw LLM responses |
| results | Pass/fail status for each assertion |
| stats | Summary statistics |
Detailed Metrics
When available, outputs include the following; the sketch after this list shows one way to aggregate them:
- Latency: Response time in milliseconds
- Token Usage: Input/output token counts
- Cost: Estimated API costs
- Error Details: Failure reasons and stack traces
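One way to roll these metrics up is from the JSON export. This is a minimal sketch only: the metric field names (cost, latencyMs) are assumptions and may differ in your promptfoo version, so check your file first.
import json

with open('results.json', encoding='utf-8') as f:
    outputs = json.load(f)['results']['outputs']

# Field names below are assumptions -- verify them against your export
total_cost = sum(o.get('cost') or 0 for o in outputs)
latencies = [o['latencyMs'] for o in outputs if 'latencyMs' in o]
avg_latency = sum(latencies) / len(latencies) if latencies else 0
print(f'total cost: ${total_cost:.4f}, mean latency: {avg_latency:.0f} ms')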
Analyzing Results
JSON Processing Example
const fs = require('fs');

// Load results (structure shown under JSON Output above)
const results = JSON.parse(fs.readFileSync('results.json', 'utf8'));

// Tally pass/fail counts per provider
const providerStats = {};
results.results.outputs.forEach((output) => {
  // provider may be an object in some schema versions; use output.provider.id if so
  const provider = output.provider;
  if (!providerStats[provider]) {
    providerStats[provider] = { pass: 0, fail: 0 };
  }
  if (output.success) {
    providerStats[provider].pass++;
  } else {
    providerStats[provider].fail++;
  }
});

console.log('Pass/fail counts by provider:', providerStats);
CSV Analysis with Pandas
import pandas as pd

# Load results
df = pd.read_csv('results.csv')

# Column names vary with your config and promptfoo version;
# inspect df.columns and adjust the names below to match
summary = df.groupby('provider').agg({
    'pass': 'mean',     # pass rate per provider
    'latency': 'mean',  # average latency
    'cost': 'sum',      # total estimated cost
})
print(summary)
Best Practices
1. Organize Output Files
project/
├── promptfooconfig.yaml
├── evaluations/
│   ├── 2024-01-15-baseline.html
│   ├── 2024-01-16-improved.html
│   └── comparison.json
2. Use Descriptive Filenames
# Include date and experiment name
promptfoo eval --output "results/$(date +%Y%m%d)-gpt4-temperature-test.html"
3. Version Control Considerations
# .gitignore
# Exclude large output files
evaluations/*.html
evaluations/*.json
# But keep summary reports
!evaluations/summary-*.csv
4. Automate Report Generation
#!/bin/bash
# run_evaluation.sh
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
promptfoo eval \
  --output "reports/${TIMESTAMP}-full.json" \
  --output "reports/${TIMESTAMP}-summary.html"
Sharing Results
Web Viewer
The default web viewer (promptfoo view) provides:
- Real-time updates during evaluation
- Interactive exploration
- Local-only (no data sent externally)
Sharing HTML Reports
HTML outputs are self-contained:
# Generate report
promptfoo eval --output team-review.html
# Share via email, Slack, etc.
# No external dependencies required
Promptfoo Share
For collaborative review:
# Share results with your team
promptfoo share
Creates a shareable link with:
- Read-only access
- Commenting capabilities
- No setup required for viewers
Troubleshooting
Large Output Files
For extensive evaluations:
# Limit output size
outputPath: results.json
sharing:
  # Exclude raw outputs from file
  includeRawOutputs: false
Encoding Issues
Ensure proper encoding for international content:
# Explicitly set encoding
LANG=en_US.UTF-8 promptfoo eval --output results.csv
Performance Tips
- Use JSONL for large datasets - streams line by line, avoiding memory issues
- Use JSON for standard datasets - complete data structure
- Generate HTML for presentations - best visual format
- Use CSV for data analysis - Excel/Sheets compatible
Related Documentation
- Configuration Reference - All output options
- Integrations - Using outputs with other tools
- Command Line Guide - CLI options