This quickstart will walk you through running your first model with Ollama. To get started, download Ollama for macOS, Windows, or Linux.

Run a model

Open a terminal and run the command:
ollama run gemma3
See the Ollama model library for a full list of available models.
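Besides the interactive terminal session, Ollama serves a local REST API (on `localhost:11434` by default) that your own programs can call. As a minimal sketch, this builds a request for the `/api/generate` endpoint using only the standard library; the prompt text is purely illustrative:

```python
import json
import urllib.request

# Ollama's local server listens on http://localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

# "stream": False asks for a single JSON response instead of a token stream.
payload = {
    "model": "gemma3",
    "prompt": "Why is the sky blue?",  # illustrative prompt
    "stream": False,
}

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# To actually send it (requires Ollama running locally):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

The same request can be made with `curl` by POSTing the JSON body to the URL above.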

Coding

For coding use cases, we recommend the glm-4.7-flash model. Note: this model requires 23 GB of VRAM at its 64,000-token context length.
ollama pull glm-4.7-flash
Alternatively, you can use a more powerful cloud model (with full context length):
ollama pull glm-4.7:cloud
Use ollama launch to quickly set up a coding tool with Ollama models:
ollama launch

Supported integrations

  • OpenCode - Open-source coding assistant
  • Claude Code - Anthropic’s agentic coding tool
  • Codex - OpenAI’s coding assistant
  • Droid - Factory’s AI coding agent
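These tools typically talk to models through an OpenAI-compatible chat API, which Ollama also exposes locally under `localhost:11434/v1`. As a sketch of what such a tool sends under the hood, here is a chat-completions request built with the standard library; the message content is illustrative:

```python
import json
import urllib.request

# Ollama exposes an OpenAI-compatible API under /v1.
CHAT_URL = "http://localhost:11434/v1/chat/completions"

payload = {
    "model": "glm-4.7-flash",
    "messages": [
        # Illustrative coding request, in the OpenAI chat message format.
        {"role": "user", "content": "Write a function that reverses a string."},
    ],
}

req = urllib.request.Request(
    CHAT_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With Ollama running, urllib.request.urlopen(req) returns an
# OpenAI-style response; the reply text is at choices[0]["message"]["content"].
```

Because the endpoint follows the OpenAI wire format, any client or tool that lets you override the API base URL can be pointed at Ollama this way.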

Launch with a specific model

ollama launch claude --model glm-4.7-flash

Configure without launching

ollama launch claude --config