Run a model
- CLI
- cURL
- Python
- JavaScript
Open a terminal and run the ollama run command:
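A minimal CLI invocation looks like the following; the model name gemma3 is illustrative — substitute any model from the Ollama library (it is pulled automatically on first run):

```shell
# Start an interactive chat with a local model (model name is illustrative).
# Passing a prompt as an argument runs a single one-shot generation instead.
ollama run gemma3 "Why is the sky blue?"
```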
Coding
For coding use cases, we recommend the glm-4.7-flash model.
Note: this model requires 23 GB of VRAM at a 64,000-token context length.
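The same model can be exercised over the local REST API; as a sketch, assuming the standard Ollama server on its default port 11434, with num_ctx raising the context window to the 64,000 tokens noted above (the prompt is illustrative):

```shell
# One-shot, non-streaming generation against the local Ollama server.
# options.num_ctx sets the context window; 64000 matches the note above.
curl http://localhost:11434/api/generate -d '{
  "model": "glm-4.7-flash",
  "prompt": "Write a binary search in Python.",
  "options": { "num_ctx": 64000 },
  "stream": false
}'
```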
Run ollama launch to quickly set up a coding tool with Ollama models:
Supported integrations
- OpenCode - Open-source coding assistant
- Claude Code - Anthropic’s agentic coding tool
- Codex - OpenAI’s coding assistant
- Droid - Factory’s AI coding agent
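As a sketch, assuming ollama launch accepts the integration name as an argument (the exact argument form is an assumption — check ollama launch with no arguments or its help output for the supported names):

```shell
# Launch a supported coding tool preconfigured to use local Ollama models.
# "opencode" is one of the integrations listed above; the argument form is assumed.
ollama launch opencode
```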