Compare commits

...

1 Commits

Author SHA1 Message Date
Parth Sareen
5267d31d56 docs: ollama launch (#13852) 2026-01-23 23:18:50 -08:00
11 changed files with 291 additions and 66 deletions

View File

@@ -4,16 +4,6 @@ title: Anthropic compatibility
Ollama provides compatibility with the [Anthropic Messages API](https://docs.anthropic.com/en/api/messages) to help connect existing applications to Ollama, including tools like Claude Code.
## Recommended models
For coding use cases, models like `glm-4.7:cloud`, `minimax-m2.1:cloud`, and `qwen3-coder` are recommended.
Pull a model before use:
```shell
ollama pull qwen3-coder
ollama pull glm-4.7:cloud
```
## Usage
### Environment variables
@@ -22,8 +12,8 @@ To use Ollama with tools that expect the Anthropic API (like Claude Code), set t
```shell
export ANTHROPIC_AUTH_TOKEN=ollama # required but ignored
export ANTHROPIC_API_KEY="" # required but ignored
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_API_KEY=ollama # required but ignored
```
### Simple `/v1/messages` example
@@ -245,10 +235,41 @@ curl -X POST http://localhost:11434/v1/messages \
## Using with Claude Code
[Claude Code](https://code.claude.com/docs/en/overview) can be configured to use Ollama as its backend:
[Claude Code](https://code.claude.com/docs/en/overview) can be configured to use Ollama as its backend.
### Recommended models
For coding use cases, models like `glm-4.7`, `minimax-m2.1`, and `qwen3-coder` are recommended.
Download a model before use:
```shell
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_API_KEY=ollama claude --model qwen3-coder
ollama pull qwen3-coder
```
> Note: Qwen 3 coder is a 30B parameter model requiring at least 24GB of VRAM to run smoothly. More is required for longer context lengths.
```shell
ollama pull glm-4.7:cloud
```
### Quick setup
```shell
ollama launch claude
```
This will prompt you to select a model, configure Claude Code automatically, and launch it. To configure without launching:
```shell
ollama launch claude --config
```
### Manual setup
Set the environment variables and run Claude Code:
```shell
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_API_KEY="" claude --model qwen3-coder
```
Or set the environment variables in your shell profile:
@@ -256,19 +277,13 @@ Or set the environment variables in your shell profile:
```shell
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_API_KEY=ollama
export ANTHROPIC_API_KEY=""
```
Then run Claude Code with any Ollama model:
```shell
# Local models
claude --model qwen3-coder
claude --model gpt-oss:20b
# Cloud models
claude --model glm-4.7:cloud
claude --model minimax-m2.1:cloud
```
## Endpoints

View File

@@ -8,6 +8,47 @@ title: CLI Reference
ollama run gemma3
```
### Launch integrations
```
ollama launch
```
Configure and launch external applications to use Ollama models. This provides an interactive way to set up and start integrations with supported apps.
#### Supported integrations
- **OpenCode** - Open-source coding assistant
- **Claude Code** - Anthropic's agentic coding tool
- **Codex** - OpenAI's coding assistant
- **Droid** - Factory's AI coding agent
#### Examples
Launch an integration interactively:
```
ollama launch
```
Launch a specific integration:
```
ollama launch claude
```
Launch with a specific model:
```
ollama launch claude --model qwen3-coder
```
Configure without launching:
```
ollama launch droid --config
```
#### Multiline input
For multiline input, you can wrap text with `"""`:

View File

@@ -3,8 +3,6 @@ title: Cloud
sidebarTitle: Cloud
---
<Info>Ollama's cloud is currently in preview.</Info>
## Cloud Models
Ollama's cloud models are a new kind of model in Ollama that can run without a powerful GPU. Instead, cloud models are automatically offloaded to Ollama's cloud service while offering the same capabilities as local models, making it possible to keep using your local tools while running larger models that wouldn't fit on a personal computer.

View File

@@ -8,7 +8,7 @@ Context length is the maximum number of tokens that the model has access to in m
The default context length in Ollama is 4096 tokens.
</Note>
Tasks which require large context like web search, agents, and coding tools should be set to at least 32000 tokens.
Tasks which require large context like web search, agents, and coding tools should be set to at least 64000 tokens.
## Setting context length
@@ -24,7 +24,7 @@ Change the slider in the Ollama app under settings to your desired context lengt
### CLI
If editing the context length for Ollama is not possible, the context length can also be updated when serving Ollama.
```
OLLAMA_CONTEXT_LENGTH=32000 ollama serve
OLLAMA_CONTEXT_LENGTH=64000 ollama serve
```
### Check allocated context length and model offloading

View File

@@ -102,18 +102,19 @@
"group": "Integrations",
"pages": [
"/integrations/claude-code",
"/integrations/vscode",
"/integrations/jetbrains",
"/integrations/codex",
"/integrations/cline",
"/integrations/codex",
"/integrations/droid",
"/integrations/goose",
"/integrations/zed",
"/integrations/roo-code",
"/integrations/jetbrains",
"/integrations/marimo",
"/integrations/n8n",
"/integrations/xcode",
"/integrations/onyx",
"/integrations/marimo"
"/integrations/opencode",
"/integrations/roo-code",
"/integrations/vscode",
"/integrations/xcode",
"/integrations/zed"
]
},
{

View File

@@ -9,7 +9,7 @@ sidebarTitle: Welcome
<CardGroup cols={2}>
<Card title="Quickstart" icon="rocket" href="/quickstart">
Get up and running with your first model
Get up and running with your first model or integrate Ollama with your favorite tools
</Card>
<Card
title="Download Ollama"

View File

@@ -4,7 +4,7 @@ title: Claude Code
Claude Code is Anthropic's agentic coding tool that can read, modify, and execute code in your working directory.
Open models can be used with Claude Code through Ollama's Anthropic-compatible API, enabling you to use models such as `qwen3-coder`, `gpt-oss:20b`, or other models.
Open models can be used with Claude Code through Ollama's Anthropic-compatible API, enabling you to use models such as `glm-4.7`, `qwen3-coder`, `gpt-oss`.
![Claude Code with Ollama](https://files.ollama.com/claude-code.png)
@@ -26,12 +26,27 @@ irm https://claude.ai/install.ps1 | iex
## Usage with Ollama
### Quick setup
```shell
ollama launch claude
```
To configure without launching:
```shell
ollama launch claude --config
```
### Manual setup
Claude Code connects to Ollama using the Anthropic-compatible API.
1. Set the environment variables:
```shell
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434
```
@@ -44,35 +59,17 @@ claude --model gpt-oss:20b
Or run with environment variables inline:
```shell
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 claude --model gpt-oss:20b
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_API_KEY="" claude --model qwen3-coder
```
**Note:** Claude Code requires a large context window. We recommend at least 32K tokens. See the [context length documentation](/context-length) for how to adjust context length in Ollama.
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) on ollama.com
2. Set the environment variables:
```shell
export ANTHROPIC_BASE_URL=https://ollama.com
export ANTHROPIC_API_KEY=<your-api-key>
```
3. Run Claude Code with a cloud model:
```shell
claude --model glm-4.7:cloud
```
**Note:** Claude Code requires a large context window. We recommend at least 64k tokens. See the [context length documentation](/context-length) for how to adjust context length in Ollama.
## Recommended Models
### Cloud models
- `glm-4.7:cloud` - High-performance cloud model
- `minimax-m2.1:cloud` - Fast cloud model
- `qwen3-coder:480b` - Large coding model
- `qwen3-coder`
- `glm-4.7`
- `gpt-oss:20b`
- `gpt-oss:120b`
Cloud models are also available at [ollama.com/search?c=cloud](https://ollama.com/search?c=cloud).
### Local models
- `qwen3-coder` - Excellent for coding tasks
- `gpt-oss:20b` - Strong general-purpose model
- `gpt-oss:120b` - Larger general-purpose model for more complex tasks

View File

@@ -13,7 +13,21 @@ npm install -g @openai/codex
## Usage with Ollama
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 64k tokens.</Note>
### Quick setup
```
ollama launch codex
```
To configure without launching:
```shell
ollama launch codex --config
```
### Manual setup
To use `codex` with Ollama, use the `--oss` flag:

View File

@@ -11,10 +11,24 @@ Install the [Droid CLI](https://factory.ai/):
curl -fsSL https://app.factory.ai/cli | sh
```
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 64k tokens. See [Context length](/context-length) for more information.</Note>
## Usage with Ollama
### Quick setup
```bash
ollama launch droid
```
To configure without launching:
```shell
ollama launch droid --config
```
### Manual setup
Add a local configuration block to `~/.factory/config.json`:
```json
@@ -73,4 +87,4 @@ Add the cloud configuration block to `~/.factory/config.json`:
}
```
Run `droid` in a new terminal to load the new settings.
Run `droid` in a new terminal to load the new settings.

View File

@@ -0,0 +1,106 @@
---
title: OpenCode
---
OpenCode is an open-source AI coding assistant that runs in your terminal.
## Install
Install the [OpenCode CLI](https://opencode.ai):
```bash
curl -fsSL https://opencode.ai/install.sh | bash
```
<Note>OpenCode requires a larger context window. It is recommended to use a context window of at least 64k tokens. See [Context length](/context-length) for more information.</Note>
## Usage with Ollama
### Quick setup
```bash
ollama launch opencode
```
To configure without launching:
```shell
ollama launch opencode --config
```
### Manual setup
Add a configuration block to `~/.config/opencode/opencode.json`:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen3-coder": {
"name": "qwen3-coder"
}
}
}
}
}
```
## Cloud Models
`glm-4.7:cloud` is the recommended model for use with OpenCode.
Add the cloud configuration to `~/.config/opencode/opencode.json`:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"glm-4.7:cloud": {
"name": "glm-4.7:cloud"
}
}
}
}
}
```
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
2. Update `~/.config/opencode/opencode.json` to point to ollama.com:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama Cloud",
"options": {
"baseURL": "https://ollama.com/v1"
},
"models": {
"glm-4.7:cloud": {
"name": "glm-4.7:cloud"
}
}
}
}
}
```
Run `opencode` in a new terminal to load the new settings.

View File

@@ -18,13 +18,13 @@ This quickstart will walk your through running your first model with Ollama. To
<Tab title="CLI">
Open a terminal and run the command:
```
```sh
ollama run gemma3
```
</Tab>
<Tab title="cURL">
```
```sh
ollama pull gemma3
```
@@ -45,13 +45,13 @@ This quickstart will walk your through running your first model with Ollama. To
<Tab title="Python">
Start by downloading a model:
```
```sh
ollama pull gemma3
```
Then install Ollama's Python library:
```
```sh
pip install ollama
```
@@ -101,3 +101,42 @@ This quickstart will walk your through running your first model with Ollama. To
</Tabs>
See a full list of available models [here](https://ollama.com/models).
## Coding
For coding use cases, we recommend using the `glm-4.7-flash` model.
Note: this model requires 23 GB of VRAM with 64000 tokens context length.
```sh
ollama pull glm-4.7-flash
```
Alternatively, you can use a more powerful cloud model (with full context length):
```sh
ollama pull glm-4.7:cloud
```
Use `ollama launch` to quickly set up a coding tool with Ollama models:
```sh
ollama launch
```
### Supported integrations
- [OpenCode](/integrations/opencode) - Open-source coding assistant
- [Claude Code](/integrations/claude-code) - Anthropic's agentic coding tool
- [Codex](/integrations/codex) - OpenAI's coding assistant
- [Droid](/integrations/droid) - Factory's AI coding agent
### Launch with a specific model
```sh
ollama launch claude --model glm-4.7-flash
```
### Configure without launching
```sh
ollama launch claude --config
```