
Together AI

Together AI provides access to open-source models through an API compatible with OpenAI's interface.

OpenAI Compatibility

Together AI's API is compatible with OpenAI's, so every parameter supported by promptfoo's OpenAI provider also works with the Together AI provider.
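For example, OpenAI-style sampling parameters can be set directly in the provider config; the values below are purely illustrative:

config:
  temperature: 0.7
  top_p: 0.9
  presence_penalty: 0.1
  frequency_penalty: 0.1
  stop: ['###']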

Basic Configuration

Configure a Together AI model in your promptfoo configuration:

promptfooconfig.yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: togetherai:meta-llama/Llama-3.3-70B-Instruct-Turbo
    config:
      temperature: 0.7

The provider requires an API key stored in the TOGETHER_API_KEY environment variable.
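If the key needs to live somewhere other than the environment (for example, a per-provider override), the OpenAI-compatible provider options should also accept an apiKey field in the config. The following is a sketch that assumes this override is available for Together AI; verify it against your promptfoo version:

providers:
  - id: togetherai:meta-llama/Llama-3.3-70B-Instruct-Turbo
    config:
      # Assumption: apiKey overrides TOGETHER_API_KEY, as with the OpenAI provider
      apiKey: your-together-api-key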

Key Features

Max Tokens Configuration

config:
  max_tokens: 4096

Function Calling

config:
  tools:
    - type: function
      function:
        name: get_weather
        description: Get the current weather
        parameters:
          type: object
          properties:
            location:
              type: string
              description: City and state
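A complete evaluation exercising this tool definition might look like the sketch below; the prompt, test variables, and the is-valid-openai-tools-call assertion are illustrative and assume the model returns OpenAI-style tool calls:

prompts:
  - 'What is the weather in {{location}}?'

providers:
  - id: togetherai:meta-llama/Llama-3.3-70B-Instruct-Turbo
    config:
      tools:
        - type: function
          function:
            name: get_weather
            description: Get the current weather
            parameters:
              type: object
              properties:
                location:
                  type: string
                  description: City and state

tests:
  - vars:
      location: San Francisco, CA
    assert:
      - type: is-valid-openai-tools-call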

JSON Mode

config:
  response_format: { type: 'json_object' }
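JSON mode pairs naturally with promptfoo's is-json assertion; the prompt and test below are illustrative:

prompts:
  - 'Return a JSON object with the keys "city" and "population" for {{city}}.'

providers:
  - id: togetherai:meta-llama/Llama-3.3-70B-Instruct-Turbo
    config:
      response_format: { type: 'json_object' }

tests:
  - vars:
      city: Tokyo
    assert:
      - type: is-json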

Popular Models

Together AI offers over 200 models. Here are some of the most popular models by category:

Llama 4 Models

  • Llama 4 Maverick: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 (524,288 context length, FP8)
  • Llama 4 Scout: meta-llama/Llama-4-Scout-17B-16E-Instruct (327,680 context length, FP16)

DeepSeek Models

  • DeepSeek R1: deepseek-ai/DeepSeek-R1 (128,000 context length, FP8)
  • DeepSeek R1 Distill Llama 70B: deepseek-ai/DeepSeek-R1-Distill-Llama-70B (131,072 context length, FP16)
  • DeepSeek R1 Distill Qwen 14B: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B (131,072 context length, FP16)
  • DeepSeek V3: deepseek-ai/DeepSeek-V3 (16,384 context length, FP8)

Llama 3 Models

  • Llama 3.3 70B Instruct Turbo: meta-llama/Llama-3.3-70B-Instruct-Turbo (131,072 context length, FP8)
  • Llama 3.1 70B Instruct Turbo: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo (131,072 context length, FP8)
  • Llama 3.1 405B Instruct Turbo: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo (130,815 context length, FP8)
  • Llama 3.1 8B Instruct Turbo: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo (131,072 context length, FP8)
  • Llama 3.2 3B Instruct Turbo: meta-llama/Llama-3.2-3B-Instruct-Turbo (131,072 context length, FP16)

Mixtral Models

  • Mixtral-8x7B Instruct: mistralai/Mixtral-8x7B-Instruct-v0.1 (32,768 context length, FP16)
  • Mixtral-8x22B Instruct: mistralai/Mixtral-8x22B-Instruct-v0.1 (65,536 context length, FP16)
  • Mistral Small 3 Instruct (24B): mistralai/Mistral-Small-24B-Instruct-2501 (32,768 context length, FP16)

Qwen Models

  • Qwen 2.5 72B Instruct Turbo: Qwen/Qwen2.5-72B-Instruct-Turbo (32,768 context length, FP8)
  • Qwen 2.5 7B Instruct Turbo: Qwen/Qwen2.5-7B-Instruct-Turbo (32,768 context length, FP8)
  • Qwen 2.5 Coder 32B Instruct: Qwen/Qwen2.5-Coder-32B-Instruct (32,768 context length, FP16)
  • QwQ-32B: Qwen/QwQ-32B (32,768 context length, FP16)

Vision Models

  • Llama 3.2 Vision: meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo (131,072 context length, FP16)
  • Qwen 2.5 Vision Language 72B: Qwen/Qwen2.5-VL-72B-Instruct (32,768 context length, FP8)
  • Qwen 2 VL 72B: Qwen/Qwen2-VL-72B-Instruct (32,768 context length, FP16)

Free Endpoints

Together AI offers free endpoints for several models, with reduced rate limits (see the example configuration after this list):

  • meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
  • meta-llama/Llama-Vision-Free
  • deepseek-ai/DeepSeek-R1-Distill-Llama-70B-Free
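The free endpoints are useful for smoke-testing a configuration before switching to the paid Turbo endpoints; a minimal provider entry looks like this:

providers:
  - id: togetherai:meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
    config:
      temperature: 0.7
      max_tokens: 512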

For a complete list of all 200+ available models and their specifications, refer to the Together AI Models page.

Example Configuration

promptfooconfig.yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: togetherai:meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
    config:
      temperature: 0.7
      max_tokens: 4096

  - id: togetherai:deepseek-ai/DeepSeek-R1
    config:
      temperature: 0.0
      response_format: { type: 'json_object' }
      tools:
        - type: function
          function:
            name: get_weather
            description: Get weather information
            parameters:
              type: object
              properties:
                location: { type: 'string' }
                unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
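To turn this into a runnable eval, add prompts and tests alongside the providers. The addition below is illustrative; the prompt asks for JSON so the provider with response_format set behaves as expected:

prompts:
  - 'Return a JSON object describing the current weather in {{location}}.'

tests:
  - vars:
      location: Austin, TX

Running promptfoo eval against this file then compares the two models side by side.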

For more information, refer to the Together AI documentation.