Output Formats

Save and analyze your evaluation results in various formats.

Quick Start

# Interactive web viewer (default)
promptfoo eval

# Save as HTML report
promptfoo eval --output results.html

# Export as JSON for further processing
promptfoo eval --output results.json

# Create CSV for spreadsheet analysis
promptfoo eval --output results.csv

Available Formats

HTML Report

Generate a visual, shareable report:

promptfoo eval --output report.html

Features:

  • Interactive table with sorting and filtering
  • Side-by-side output comparison
  • Pass/fail statistics
  • Shareable standalone file

Use when: Presenting results to stakeholders or reviewing outputs visually.

JSON Output

Export complete evaluation data:

promptfoo eval --output results.json

Structure:

{
  "version": 3,
  "timestamp": "2024-01-15T10:30:00Z",
  "results": {
    "prompts": [...],
    "providers": [...],
    "outputs": [...],
    "stats": {...}
  }
}

Use when: Integrating with other tools or performing custom analysis.
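
As a quick sanity check of the structure above, here is a minimal Python sketch; the field names follow the example JSON shown here and may differ between promptfoo versions:

import json

# Load the exported results
with open('results.json', encoding='utf-8') as f:
    data = json.load(f)

print('Schema version:', data.get('version'))
print('Run timestamp:', data.get('timestamp'))
print('Summary stats:', data.get('results', {}).get('stats'))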

CSV Export

Create spreadsheet-compatible data:

promptfoo eval --output results.csv

Columns include:

  • Test variables
  • Prompt used
  • Model outputs
  • Pass/fail status
  • Latency
  • Token usage

Use when: Analyzing results in Excel, Google Sheets, or data science tools.

YAML Format

Human-readable structured data:

promptfoo eval --output results.yaml

Use when: Reviewing results in a text editor or version control.
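
The YAML export can be read back with PyYAML; this is a minimal sketch and assumes the same top-level keys as the JSON example above:

import yaml  # third-party: pip install pyyaml

# Load the YAML export
with open('results.yaml', encoding='utf-8') as f:
    data = yaml.safe_load(f)

print(data.get('results', {}).get('stats'))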

Configuration Options

Setting Output Path in Config

promptfooconfig.yaml
# Specify default output file
outputPath: evaluations/latest_results.html

prompts:
  - '...'
tests:
  - '...'

Multiple Output Formats

Generate multiple formats simultaneously:

# Command line
promptfoo eval --output results.html --output results.json

# Or use shell commands
promptfoo eval --output results.json && \
promptfoo eval --output results.csv

Output Contents

Standard Fields

All formats include:

  • timestamp: When the evaluation ran
  • prompts: Prompts used in the evaluation
  • providers: LLM providers tested
  • tests: Test cases with variables
  • outputs: Raw LLM responses
  • results: Pass/fail for each assertion
  • stats: Summary statistics

Detailed Metrics

When available, outputs include:

  • Latency: Response time in milliseconds
  • Token Usage: Input/output token counts
  • Cost: Estimated API costs
  • Error Details: Failure reasons and stack traces
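
These metrics can be pulled directly from a JSON export; the sketch below is a rough example and assumes each entry under results.outputs carries latency, cost, and pass fields as in the examples in this guide:

import json

with open('results.json', encoding='utf-8') as f:
    outputs = json.load(f)['results']['outputs']

# Field names are assumptions; adjust to your promptfoo version's schema
latencies = [o.get('latency', 0) for o in outputs]
total_cost = sum(o.get('cost', 0) for o in outputs)

print('Average latency (ms):', sum(latencies) / max(len(latencies), 1))
print('Estimated total cost ($):', round(total_cost, 4))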

Analyzing Results

JSON Processing Example

const fs = require('fs');

// Load results
const results = JSON.parse(fs.readFileSync('results.json', 'utf8'));

// Analyze pass rates by provider
const providerStats = {};
results.results.outputs.forEach((output) => {
  const provider = output.provider;
  if (!providerStats[provider]) {
    providerStats[provider] = { pass: 0, fail: 0 };
  }

  if (output.pass) {
    providerStats[provider].pass++;
  } else {
    providerStats[provider].fail++;
  }
});

console.log('Pass rates by provider:', providerStats);

CSV Analysis with Pandas

import pandas as pd

# Load results
df = pd.read_csv('results.csv')

# Group by provider and calculate metrics
summary = df.groupby('provider').agg({
    'pass': 'mean',
    'latency': 'mean',
    'cost': 'sum'
})

print(summary)

Best Practices

1. Organize Output Files

project/
├── promptfooconfig.yaml
├── evaluations/
│   ├── 2024-01-15-baseline.html
│   ├── 2024-01-16-improved.html
│   └── comparison.json

2. Use Descriptive Filenames

# Include date and experiment name
promptfoo eval --output "results/$(date +%Y%m%d)-gpt4-temperature-test.html"

3. Version Control Considerations

# .gitignore
# Exclude large output files
evaluations/*.html
evaluations/*.json

# But keep summary reports
!evaluations/summary-*.csv

4. Automate Report Generation

#!/bin/bash
# run_evaluation.sh

TIMESTAMP=$(date +%Y%m%d-%H%M%S)
promptfoo eval \
--output "reports/${TIMESTAMP}-full.json" \
--output "reports/${TIMESTAMP}-summary.html"

Sharing Results

Web Viewer

The default web viewer (promptfoo view) provides:

  • Real-time updates during evaluation
  • Interactive exploration
  • Local-only (no data sent externally)

Sharing HTML Reports

HTML outputs are self-contained:

# Generate report
promptfoo eval --output team-review.html

# Share via email, Slack, etc.
# No external dependencies required

Promptfoo Share

For collaborative review:

# Share results with your team
promptfoo share

Creates a shareable link with:

  • Read-only access
  • Commenting capabilities
  • No setup required for viewers

Troubleshooting

Large Output Files

For extensive evaluations:

# Limit output size
outputPath: results.json
sharing:
  # Exclude raw outputs from file
  includeRawOutputs: false

Encoding Issues

Ensure proper encoding for international content:

# Explicitly set encoding
LANG=en_US.UTF-8 promptfoo eval --output results.csv
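
When loading the CSV programmatically, set the encoding explicitly as well; a small pandas sketch (utf-8-sig also tolerates a byte-order mark if the file has been round-tripped through Excel):

import pandas as pd

df = pd.read_csv('results.csv', encoding='utf-8-sig')
print(df.head())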

Performance Tips

  1. Use JSON for large datasets - Most efficient format
  2. Generate HTML for presentations - Best visual format
  3. Use CSV for data analysis - Easy Excel/Sheets integration
  4. Stream outputs for huge evaluations - Process results incrementally (see the sketch below)
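
For tip 4, one option is an incremental JSON parser such as ijson (a third-party library); the results.outputs path follows the structure shown earlier and is an assumption about your export:

import ijson  # third-party: pip install ijson

pass_count = 0
fail_count = 0

with open('results.json', 'rb') as f:
    # Iterate over outputs one at a time instead of loading the whole file
    for output in ijson.items(f, 'results.outputs.item'):
        if output.get('pass'):
            pass_count += 1
        else:
            fail_count += 1

print('pass:', pass_count, 'fail:', fail_count)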