Langfuse integration

Langfuse is an open-source LLM engineering platform that includes collaborative prompt management, tracing, and evaluation capabilities.

Setup

Install the langfuse SDK:
```
npm install langfuse
```

Set the required environment variables:

export LANGFUSE_PUBLIC_KEY="your-public-key"
export LANGFUSE_SECRET_KEY="your-secret-key"
export LANGFUSE_HOST="https://cloud.langfuse.com"  # or your self-hosted URL

Using Langfuse prompts

Use the langfuse:// prefix in your promptfoo configuration to reference prompts managed in Langfuse.

Prompt formats

You can reference prompts by version or label using two different syntaxes:

1. Explicit @ syntax (recommended for clarity)

# By label
langfuse://prompt-name@label:type

# Examples
langfuse://my-prompt@production        # Text prompt with production label
langfuse://chat-prompt@staging:chat    # Chat prompt with staging label

2. Auto-detection with : syntax

# By version or label (auto-detected)
langfuse://prompt-name:version-or-label:type

The parser automatically detects:

Numeric values → treated as versions (e.g., 1, 2, 3)
String values → treated as labels (e.g., production, staging, latest)

Where:

prompt-name: The name of your prompt in Langfuse
version: Specific version number (e.g., 1, 2, 3)
label: Label assigned to a prompt version (e.g., production, staging, latest)
type: Either text or chat (defaults to text if omitted)

Examples

prompts:
  # Explicit @ syntax for labels (recommended)
  - 'langfuse://my-prompt@production' # Production label, text prompt
  - 'langfuse://chat-prompt@staging:chat' # Staging label, chat prompt
  - 'langfuse://my-prompt@latest:text' # Latest label, text prompt

  # Auto-detection with : syntax
  - 'langfuse://my-prompt:production' # String → treated as label
  - 'langfuse://chat-prompt:staging:chat' # String → treated as label
  - 'langfuse://my-prompt:latest' # "latest" → treated as label

  # Version references (numeric values only)
  - 'langfuse://my-prompt:3:text' # Numeric → version 3
  - 'langfuse://chat-prompt:2:chat' # Numeric → version 2

providers:
  - openai:gpt-4o-mini

tests:
  - vars:
      user_query: 'What is the capital of France?'
      context: 'European geography'

Variable substitution

Variables from your promptfoo test cases are automatically passed to Langfuse prompts. If your Langfuse prompt contains variables like {{user_query}} or {{context}}, they will be replaced with the corresponding values from your test cases.

Label-based deployment

Using labels is recommended for production scenarios as it allows you to:

Deploy new prompt versions without changing your promptfoo configuration
Use different prompts for different environments (production, staging, development)
A/B test different prompt versions
Roll back to previous versions quickly in Langfuse

Common label patterns:

production - Current production version
staging - Testing before production
latest - Most recently created version
experiment-a, experiment-b - A/B testing
tenant-xyz - Multi-tenant scenarios

Best practices

Use labels instead of version numbers for production deployments to avoid hardcoding version numbers in your config
Use descriptive prompt names that clearly indicate their purpose
Test prompts in staging before promoting them to production
Version control your promptfoo configs even though prompts are managed in Langfuse

Limitations

While prompt IDs containing @ symbols are supported, we recommend avoiding them for clarity. The parser looks for the last @ followed by a label pattern to distinguish between the prompt ID and label.
If you need to use @ in your label names, consider using a different naming convention.

Setup​

Using Langfuse prompts​

Prompt formats​

1. Explicit @ syntax (recommended for clarity)​

2. Auto-detection with : syntax​

Examples​

Variable substitution​

Label-based deployment​

Best practices​

Limitations​