Cerebras
This provider enables you to use Cerebras models through their Inference API.
Cerebras offers an OpenAI-compatible API for various large language models including Llama models, DeepSeek, and more. You can use it as a drop-in replacement for applications currently using the OpenAI API chat endpoints.
Setup
Generate an API key from the Cerebras platform. Then set the CEREBRAS_API_KEY environment variable or pass it via the apiKey configuration field.
export CEREBRAS_API_KEY=your_api_key_here
Or in your config:
providers:
  - id: cerebras:llama3.1-8b
    config:
      apiKey: your_api_key_here
Provider Format
The Cerebras provider uses a simple format:
- cerebras:<model name> - Uses the chat completion interface for all models
Available Models
The Cerebras Inference API officially supports these models:
- llama-4-scout-17b-16e-instruct - Llama 4 Scout 17B model with 16 expert MoE
- llama3.1-8b - Llama 3.1 8B model
- llama-3.3-70b - Llama 3.3 70B model
- deepSeek-r1-distill-llama-70B - (private preview)
To get the current list of available models, use the /models endpoint:
curl https://api.cerebras.ai/v1/models -H "Authorization: Bearer your_api_key_here"
Parameters
The provider accepts standard OpenAI chat parameters:
- temperature - Controls randomness (0.0 to 1.5)
- max_completion_tokens - Maximum number of tokens to generate
- top_p - Nucleus sampling parameter
- stop - Sequences where the API will stop generating further tokens
- seed - Seed for deterministic generation
- response_format - Controls the format of the model response (e.g., for JSON output)
- logprobs - Whether to return log probabilities of the output tokens
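As a rough sketch of how several of these parameters might be combined in one provider entry, consider the fragment below. The specific values (0.2, 256, 0.9, 'END', 42) are illustrative assumptions, not recommended settings, and the fragment assumes the parameters sit under config in the same way as apiKey.

```yaml
providers:
  - id: cerebras:llama3.1-8b
    config:
      temperature: 0.2 # illustrative value; the listed range is 0.0 to 1.5
      max_completion_tokens: 256 # illustrative cap on generated tokens
      top_p: 0.9 # illustrative nucleus sampling value
      stop: ['END'] # hypothetical stop sequence
      seed: 42 # arbitrary seed for more repeatable output
```

Fixing the seed and lowering the temperature is one way to make repeated evaluation runs easier to compare.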
Advanced Capabilities
Structured Outputs
Cerebras models support structured outputs with JSON schema enforcement to ensure your AI-generated responses follow a consistent, predictable format. This makes it easier to build reliable applications that can process AI outputs programmatically.
To use structured outputs, set the response_format parameter to include a JSON schema:
providers:
  - id: cerebras:llama-4-scout-17b-16e-instruct
    config:
      response_format:
        type: 'json_schema'
        json_schema:
          name: 'movie_schema'
          strict: true
          schema:
            type: 'object'
            properties:
              title: { 'type': 'string' }
              director: { 'type': 'string' }
              year: { 'type': 'integer' }
            required: ['title', 'director', 'year']
            additionalProperties: false
Alternatively, you can use simple JSON mode by setting response_format to {"type": "json_object"}.
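A minimal sketch of JSON mode in a full configuration might look like the following. The prompt wording and the movie variable are hypothetical, and the example assumes promptfoo's is-json assertion is a suitable check that the response parses as JSON.

```yaml
prompts:
  - 'Return a JSON object with the fields "title" and "year" for the movie {{movie}}.'
providers:
  - id: cerebras:llama3.1-8b
    config:
      response_format:
        type: 'json_object' # simple JSON mode, no schema enforcement
tests:
  - vars:
      movie: Inception # hypothetical test input
    assert:
      - type: is-json # passes only if the output parses as JSON
```

Unlike the json_schema form, this mode does not constrain which fields appear, so the prompt itself has to describe the desired shape.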
Tool Use
Cerebras models support tool use (function calling), enabling LLMs to programmatically execute specific tasks. To use this feature, define the tools the model can use:
providers:
  - id: cerebras:llama-4-scout-17b-16e-instruct
    config:
      tools:
        - type: 'function'
          function:
            name: 'calculate'
            description: 'A calculator that can perform basic arithmetic operations'
            parameters:
              type: 'object'
              properties:
                expression:
                  type: 'string'
                  description: 'The mathematical expression to evaluate'
              required: ['expression']
            strict: true
When using tool calling, you'll need to process the model's response and handle any tool calls it makes, then provide the results back to the model for the final response.
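One way to check the first step of that flow in an evaluation is sketched below. It assumes promptfoo's is-valid-openai-tools-call assertion accepts this provider's output because the API is OpenAI-compatible. The prompt text and the expression variable are hypothetical.

```yaml
prompts:
  - 'Use the calculate tool to work out {{expression}}.'
providers:
  - id: cerebras:llama-4-scout-17b-16e-instruct
    config:
      tools:
        - type: 'function'
          function:
            name: 'calculate'
            description: 'A calculator that can perform basic arithmetic operations'
            parameters:
              type: 'object'
              properties:
                expression:
                  type: 'string'
                  description: 'The mathematical expression to evaluate'
              required: ['expression']
tests:
  - vars:
      expression: '12 * (3 + 4)' # hypothetical test input
    assert:
      # Assumed to validate the emitted tool call against the tools schema above
      - type: is-valid-openai-tools-call
```

This sketch only inspects the model's initial tool call. Executing the tool and sending its result back to the model for a final answer is left to your application code.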
Example Configuration
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: Cerebras model evaluation
prompts:
  - You are an expert in {{topic}}. Explain {{question}} in simple terms.
providers:
  - id: cerebras:llama3.1-8b
    config:
      temperature: 0.7
      max_completion_tokens: 1024
  - id: cerebras:llama-3.3-70b
    config:
      temperature: 0.7
      max_completion_tokens: 1024
tests:
  - vars:
      topic: quantum computing
      # Phrased to fit the prompt template: "Explain {{question}} in simple terms."
      question: quantum entanglement
    assert:
      - type: contains-any
        value: ['entangled', 'correlated', 'quantum state']
  - vars:
      topic: machine learning
      question: the difference between supervised and unsupervised learning
    assert:
      - type: contains
        value: 'labeled data'
See Also
- OpenAI Provider - Compatible API format used by Cerebras
- Configuration Reference - Full configuration options for providers
- Cerebras API Documentation - Official API reference
- Cerebras Structured Outputs Guide - Learn more about JSON schema enforcement
- Cerebras Tool Use Guide - Learn more about tool calling capabilities