Docker Model Runner

Docker Model Runner makes it easy to manage, run, and deploy AI models using Docker. Designed for developers, Docker Model Runner streamlines the process of pulling, running, and serving large language models (LLMs) and other AI models directly from Docker Hub or any OCI-compliant registry.

Quick Start

  1. Enable Docker Model Runner in Docker Desktop or Docker Engine per https://docs.docker.com/ai/model-runner/#enable-docker-model-runner.
  2. Use the Docker Model Runner CLI to pull ai/llama3.2:3B-Q4_K_M:
docker model pull ai/llama3.2:3B-Q4_K_M
  3. Test your setup with a working example:
npx promptfoo@latest eval -c https://raw.githubusercontent.com/promptfoo/promptfoo/main/examples/docker/promptfooconfig.comparison.simple.yaml

For an eval that compares several models using llm-rubric and similar assertions, see https://raw.githubusercontent.com/promptfoo/promptfoo/main/examples/docker/promptfooconfig.comparison.advanced.yaml.

Models

docker:chat:<model_name>
docker:completion:<model_name>
docker:embeddings:<model_name>
docker:embedding:<model_name> # Alias for embeddings
docker:<model_name> # Defaults to chat

Note: Both docker:embedding: and docker:embeddings: prefixes are supported for embedding models and will work identically.

For a list of curated models on Docker Hub, visit the Docker Hub Models page.
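Putting the prefixes together, a provider list might look like the following sketch (the embedding model name is illustrative; substitute any embedding-capable model from Docker Hub):

providers:
  # Chat completion (also the default when no mode prefix is given)
  - id: docker:chat:ai/llama3.2:3B-Q4_K_M
  # Embeddings (docker:embedding: works identically)
  - id: docker:embeddings:ai/mxbai-embed-large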

Hugging Face Models

Docker Model Runner can pull supported models from Hugging Face (i.e., models in GGUF format). For a list of supported models on Hugging Face, visit this HF search page.

docker:chat:hf.co/<model_name>
docker:completion:hf.co/<model_name>
docker:embeddings:hf.co/<model_name>
docker:embedding:hf.co/<model_name> # Alias for embeddings
docker:hf.co/<model_name> # Defaults to chat
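For example, a config referencing a GGUF model hosted on Hugging Face might look like this (the repository name is illustrative; use any GGUF model supported by Docker Model Runner):

providers:
  - id: docker:chat:hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
    config:
      temperature: 0.7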

Configuration

Configure the provider in your promptfoo configuration file:

promptfooconfig.yaml

# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: docker:ai/smollm3:Q4_K_M
    config:
      temperature: 0.7

Configuration Options

Supported environment variables:

  • DOCKER_MODEL_RUNNER_BASE_URL - (optional) protocol, host name, and port. Defaults to http://localhost:12434. Set to http://model-runner.docker.internal when promptfoo itself runs inside a container.
  • DOCKER_MODEL_RUNNER_API_KEY - (optional) API key passed as the Bearer token in the Authorization header when calling the API. Defaults to dmr to satisfy OpenAI API validation (the value is not checked by Docker Model Runner).
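How these settings resolve can be sketched as follows; this is a minimal illustration of the documented defaults, not promptfoo's actual source:

```python
import os

# Environment variables override the documented defaults.
base_url = os.environ.get("DOCKER_MODEL_RUNNER_BASE_URL", "http://localhost:12434")
api_key = os.environ.get("DOCKER_MODEL_RUNNER_API_KEY", "dmr")

# The key is sent as a Bearer token; Docker Model Runner ignores its value,
# but the OpenAI-compatible client layer requires one to be present.
headers = {"Authorization": f"Bearer {api_key}"}

print(base_url)
print(headers)
```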

Standard OpenAI parameters are supported:

Parameter          Description
temperature        Controls randomness (0.0 to 2.0)
max_tokens         Maximum number of tokens to generate
top_p              Nucleus sampling parameter
frequency_penalty  Penalizes frequent tokens
presence_penalty   Penalizes new tokens based on presence
stop               Sequences where the API will stop generating
stream             Enable streaming responses
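A config exercising several of these parameters might look like this (the values are illustrative, not recommendations):

providers:
  - id: docker:ai/smollm3:Q4_K_M
    config:
      temperature: 0.7
      max_tokens: 512
      top_p: 0.9
      frequency_penalty: 0.2
      presence_penalty: 0.0
      stop: ["\n\n"]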

Notes

  • To conserve system resources, consider running evaluations serially with promptfoo eval -j 1.