## Usage

### Environment variables
To use Ollama with tools that expect the Anthropic API (like Claude Code), set the appropriate environment variables.

### Simple /v1/messages example
basic.py
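A minimal sketch of what `basic.py` might look like, using only the Python standard library (it assumes Ollama's default port 11434; the `x-api-key` value is arbitrary, since the key is accepted but not validated):

```python
# basic.py — minimal /v1/messages call sketched with the standard library.
import json
import urllib.request

OLLAMA_MESSAGES_URL = "http://localhost:11434/v1/messages"


def build_request(model, prompt, max_tokens=512):
    """Build an Anthropic-style messages payload."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }


def send(payload):
    """POST the payload and return the decoded JSON response."""
    req = urllib.request.Request(
        OLLAMA_MESSAGES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": "ollama",  # accepted but not validated
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# With a server running:
#   reply = send(build_request("qwen3-coder", "Why is the sky blue?"))
#   print(reply["content"][0]["text"])
```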
### Streaming example
streaming.py
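A sketch of `streaming.py`: set `"stream": true` in the request and parse the server-sent events by hand (same local server assumption as `basic.py`; the event names follow the Anthropic streaming protocol listed under Streaming events below):

```python
# streaming.py — consume the /v1/messages SSE stream with the standard library.
import json
import urllib.request


def iter_sse_events(lines):
    """Yield (event, data) pairs from an iterable of SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.decode() if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and event:
            # Blank line terminates one SSE event.
            yield event, json.loads("\n".join(data))
            event, data = None, []


def stream(model, prompt):
    """Print text deltas as they arrive (requires a running server)."""
    payload = {
        "model": model,
        "max_tokens": 512,
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        "http://localhost:11434/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for event, data in iter_sse_events(resp):
            if event == "content_block_delta" and data["delta"]["type"] == "text_delta":
                print(data["delta"]["text"], end="", flush=True)
```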
### Tool calling example
tools.py
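A sketch of `tools.py`, using a hypothetical `get_weather` tool. When the model stops with `stop_reason` of `tool_use`, the client runs the tool locally and sends a `tool_result` block back in a follow-up user message:

```python
# tools.py — tool-calling round trip (get_weather is a hypothetical tool).
TOOLS = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]


def find_tool_use(content):
    """Return the first tool_use block in a response's content, if any."""
    return next((b for b in content if b.get("type") == "tool_use"), None)


def tool_result_message(tool_use_block, result):
    """Build the user message that carries a tool_result back to the model."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_block["id"],
                "content": result,
            }
        ],
    }


# Sketch of the loop, POSTing payloads to /v1/messages as in basic.py:
#   payload = {"model": "qwen3-coder", "max_tokens": 512, "tools": TOOLS,
#              "messages": [{"role": "user", "content": "Weather in Paris?"}]}
#   reply = send(payload)
#   block = find_tool_use(reply["content"])
#   if block is not None:
#       result = "18°C and sunny"  # run the real tool here
#       payload["messages"].append({"role": "assistant", "content": reply["content"]})
#       payload["messages"].append(tool_result_message(block, result))
#       final = send(payload)
```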
## Using with Claude Code

Claude Code can be configured to use Ollama as its backend.

### Recommended models

For coding use cases, models like `glm-4.7`, `minimax-m2.1`, and `qwen3-coder` are recommended.
Download a model before use:
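For example, to fetch the recommended coding model:

```shell
ollama pull qwen3-coder
```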
Note: `qwen3-coder` is a 30B parameter model that requires at least 24 GB of VRAM to run smoothly; longer context lengths require more.
### Quick setup

### Manual setup
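A sketch of a manual setup, assuming Claude Code's standard `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` overrides and Ollama's default port:

```shell
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama   # accepted but not validated
claude
```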
Set the environment variables and run Claude Code.

## Endpoints
### /v1/messages

#### Supported features
- Messages
- Streaming
- System prompts
- Multi-turn conversations
- Vision (images)
- Tools (function calling)
- Tool results
- Thinking/extended thinking
#### Supported request fields

- `model`
- `max_tokens`
- `messages`
  - Text content
  - Image content (base64)
  - Array of content blocks
  - `tool_use` blocks
  - `tool_result` blocks
  - `thinking` blocks
- `system` (string or array)
- `stream`
- `temperature`
- `top_p`
- `top_k`
- `stop_sequences`
- `tools`
- `thinking`
- `tool_choice`
- `metadata`
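For illustration, a request body combining several of these fields might look like this (values are arbitrary):

```python
# A /v1/messages payload exercising several supported request fields.
payload = {
    "model": "qwen3-coder",
    "max_tokens": 1024,
    "system": "You are a concise assistant.",
    "messages": [
        {"role": "user", "content": "Summarize what SSE is in one sentence."},
    ],
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "stop_sequences": ["END"],
    "stream": False,
}
```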
#### Supported response fields

- `id`
- `type`
- `role`
- `model`
- `content` (text, tool_use, thinking blocks)
- `stop_reason` (end_turn, max_tokens, tool_use)
- `usage` (input_tokens, output_tokens)
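Since `content` mixes block types, a small helper can, for example, collect just the text blocks:

```python
def extract_text(content):
    """Concatenate the text blocks in a response's content array,
    skipping tool_use and thinking blocks."""
    return "".join(b["text"] for b in content if b.get("type") == "text")
```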
#### Streaming events

- `message_start`
- `content_block_start`
- `content_block_delta` (text_delta, input_json_delta, thinking_delta)
- `content_block_stop`
- `message_delta`
- `message_stop`
- `ping`
- `error`
## Models

Ollama supports both local and cloud models.

### Local models

Pull a local model before use:

- `qwen3-coder`: Excellent for coding tasks
- `gpt-oss:20b`: Strong general-purpose model
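For example:

```shell
ollama pull qwen3-coder
ollama pull gpt-oss:20b
```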
### Cloud models

Cloud models are available immediately without pulling:

- `glm-4.7:cloud`: High-performance cloud model
- `minimax-m2.1:cloud`: Fast cloud model
### Default model names

For tooling that relies on default Anthropic model names such as `claude-3-5-sonnet`, use `ollama cp` to copy an existing model to the expected name; tools can then reference the copy in the `model` field.
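For example, to alias `qwen3-coder` to a default Claude name:

```shell
ollama cp qwen3-coder claude-3-5-sonnet
```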
## Differences from the Anthropic API

### Behavior differences
- API key is accepted but not validated
- `anthropic-version` header is accepted but not used
- Token counts are approximations based on the underlying model's tokenizer
### Not supported

The following Anthropic API features are not currently supported:

| Feature | Description |
|---|---|
| `/v1/messages/count_tokens` | Token counting endpoint |
| `tool_choice` | Forcing specific tool use or disabling tools |
| `metadata` | Request metadata (`user_id`) |
| Prompt caching | `cache_control` blocks for caching prefixes |
| Batches API | `/v1/messages/batches` for async batch processing |
| Citations | `citations` content blocks |
| PDF support | `document` content blocks with PDF files |
| Server-sent errors | `error` events during streaming (errors return HTTP status) |
### Partial support

| Feature | Status |
|---|---|
| Image content | Base64 images supported; URL images not supported |
| Extended thinking | Basic support; `budget_tokens` accepted but not enforced |

