Compare commits


1 Commit

jmorganca · 8b4410633d · 2026-01-22 14:09:58 -08:00

Add image generation documentation

- Add image generation capability page with API usage examples
- Add image-generation to docs.json navigation
- Update openapi.yaml with image generation request/response fields
  - Request: width, height, steps
  - Response: image, completed, total
7 changed files with 264 additions and 132 deletions

View File

@@ -0,0 +1,205 @@
---
title: Image Generation
---
<Warning>
Image generation is experimental and currently only available on macOS. This feature may change in future versions.
</Warning>
Image generation models create images from text prompts. Ollama supports diffusion-based image generation models through both its native API and its OpenAI-compatible endpoints.
## Usage
<Tabs>
<Tab title="CLI">
```shell
ollama run x/z-image-turbo "a sunset over mountains"
```
The generated image will be saved to the current directory.
</Tab>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
"model": "x/z-image-turbo",
"prompt": "a sunset over mountains",
"stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama
import base64
response = ollama.generate(
model='x/z-image-turbo',
prompt='a sunset over mountains',
)
# Save the generated image
with open('output.png', 'wb') as f:
f.write(base64.b64decode(response['image']))
print('Image saved to output.png')
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'
import { writeFileSync } from 'fs'
const response = await ollama.generate({
model: 'x/z-image-turbo',
prompt: 'a sunset over mountains',
})
// Save the generated image
const imageBuffer = Buffer.from(response.image, 'base64')
writeFileSync('output.png', imageBuffer)
console.log('Image saved to output.png')
```
</Tab>
</Tabs>
### Response
The response includes an `image` field containing the base64-encoded image data:
```json
{
"model": "x/z-image-turbo",
"created_at": "2024-01-15T10:30:15.000000Z",
"image": "iVBORw0KGgoAAAANSUhEUg...",
"done": true,
"done_reason": "stop",
"total_duration": 15000000000,
"load_duration": 2000000000
}
```
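The `image` field can also be decoded and written to disk when calling the raw endpoint directly. A minimal sketch using Python's `requests`; the model name and output path are placeholders:
```python
import base64
import requests

# Request the image without streaming so the full result
# arrives in a single JSON response
resp = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'x/z-image-turbo',
        'prompt': 'a sunset over mountains',
        'stream': False,
    },
)
resp.raise_for_status()

# The `image` field holds the base64-encoded image data
with open('output.png', 'wb') as f:
    f.write(base64.b64decode(resp.json()['image']))
```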
## Image dimensions
Customize the output image size using the `width` and `height` parameters:
<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
"model": "x/z-image-turbo",
"prompt": "a portrait of a robot artist",
"width": 768,
"height": 1024,
"stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama
response = ollama.generate(
model='x/z-image-turbo',
prompt='a portrait of a robot artist',
width=768,
height=1024,
)
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'
const response = await ollama.generate({
model: 'x/z-image-turbo',
prompt: 'a portrait of a robot artist',
width: 768,
height: 1024,
})
```
</Tab>
</Tabs>
## Streaming progress
When streaming is enabled (the default), progress updates are sent during image generation:
```json
{
"model": "x/z-image-turbo",
"created_at": "2024-01-15T10:30:00.000000Z",
"completed": 5,
"total": 20,
"done": false
}
```
The `completed` and `total` fields indicate the current progress through the diffusion steps.
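A minimal sketch of consuming these progress events with Python's `requests`, assuming each streamed line is a standalone JSON object as in Ollama's other streaming responses:
```python
import json
import requests

# Stream the response; each non-empty line is one JSON event
with requests.post(
    'http://localhost:11434/api/generate',
    json={'model': 'x/z-image-turbo', 'prompt': 'a sunset over mountains'},
    stream=True,
) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        event = json.loads(line)
        if 'completed' in event and 'total' in event:
            print(f"step {event['completed']}/{event['total']}")
        if event.get('done'):
            # the final event carries the base64-encoded `image` field
            break
```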
## Parameters
| Parameter | Description | Default |
|-----------|-------------|---------|
| `prompt` | Text description of the image to generate | Required |
| `width` | Width of the generated image in pixels | Model default |
| `height` | Height of the generated image in pixels | Model default |
| `steps` | Number of diffusion steps | Model default |
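For example, lowering `steps` trades image fidelity for speed. A sketch passing it alongside the other parameters, assuming the Python client forwards `steps` the same way it forwards `width` and `height` in the examples above:
```python
import ollama

# Fewer diffusion steps: faster generation, lower fidelity
response = ollama.generate(
    model='x/z-image-turbo',
    prompt='a sunset over mountains',
    width=512,
    height=512,
    steps=8,
)
```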
## OpenAI compatibility
Image generation is also available through the OpenAI-compatible `/v1/images/generations` endpoint:
<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/v1/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "x/z-image-turbo",
"prompt": "a sunset over mountains",
"size": "1024x1024",
"response_format": "b64_json"
}'
```
</Tab>
<Tab title="Python">
```python
from openai import OpenAI
client = OpenAI(
base_url='http://localhost:11434/v1/',
api_key='ollama', # required but ignored
)
response = client.images.generate(
model='x/z-image-turbo',
prompt='a sunset over mountains',
size='1024x1024',
response_format='b64_json',
)
print(response.data[0].b64_json[:50] + '...')
```
</Tab>
<Tab title="JavaScript">
```javascript
import OpenAI from 'openai'
const openai = new OpenAI({
baseURL: 'http://localhost:11434/v1/',
apiKey: 'ollama', // required but ignored
})
const response = await openai.images.generate({
model: 'x/z-image-turbo',
prompt: 'a sunset over mountains',
size: '1024x1024',
response_format: 'b64_json',
})
console.log(response.data[0].b64_json.slice(0, 50) + '...')
```
</Tab>
</Tabs>
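As with the native API, the returned `b64_json` payload can be decoded and saved to disk; for instance, continuing the Python example above:
```python
import base64

# `response` is the ImagesResponse returned by client.images.generate(...)
with open('output.png', 'wb') as f:
    f.write(base64.b64decode(response.data[0].b64_json))
```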
See [OpenAI compatibility](/api/openai-compatibility#v1imagesgenerations-experimental) for more details.

View File

@@ -93,6 +93,7 @@
"/capabilities/thinking",
"/capabilities/structured-outputs",
"/capabilities/vision",
"/capabilities/image-generation",
"/capabilities/embeddings",
"/capabilities/tool-calling",
"/capabilities/web-search"

View File

@@ -2,7 +2,7 @@
title: Claude Code
---
Claude Code is Anthropic's agentic coding tool that can read, modify, and execute code in your working directory.
Open models can be used with Claude Code through Ollama's Anthropic-compatible API, enabling you to use models such as `qwen3-coder`, `gpt-oss:20b`, or other models.
@@ -26,16 +26,6 @@ irm https://claude.ai/install.ps1 | iex
## Usage with Ollama
Configure Claude Code to use Ollama:
```shell
ollama config claude
```
This will prompt you to select a model and automatically configure Claude Code to use Ollama.
<Accordion title="Manual Configuration">
Claude Code connects to Ollama using the Anthropic-compatible API.
1. Set the environment variables:
@@ -57,9 +47,7 @@ Or run with environment variables inline:
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 claude --model gpt-oss:20b
```
</Accordion>
<Note>Claude Code requires a large context window. We recommend at least 32K tokens. See the [context length documentation](/context-length) for how to adjust context length in Ollama.</Note>
**Note:** Claude Code requires a large context window. We recommend at least 32K tokens. See the [context length documentation](/context-length) for how to adjust context length in Ollama.
## Connecting to ollama.com
@@ -87,4 +75,4 @@ claude --model glm-4.7:cloud
### Local models
- `qwen3-coder` - Excellent for coding tasks
- `gpt-oss:20b` - Strong general-purpose model
- `gpt-oss:120b` - Larger general-purpose model for more complex tasks

View File

@@ -2,31 +2,22 @@
title: Codex
---
Codex is OpenAI's agentic coding tool for the command line.
## Install
Install the [Codex CLI](https://developers.openai.com/codex/cli/):
```shell
```
npm install -g @openai/codex
```
## Usage with Ollama
Configure Codex to use Ollama:
```shell
ollama config codex
```
This will prompt you to select a model and automatically configure Codex to use Ollama.
<Accordion title="Manual Configuration">
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
To use `codex` with Ollama, use the `--oss` flag:
```shell
```
codex --oss
```
@@ -34,22 +25,20 @@ codex --oss
By default, codex will use the local `gpt-oss:20b` model. However, you can specify a different model with the `-m` flag:
```shell
```
codex --oss -m gpt-oss:120b
```
### Cloud Models
```shell
```
codex --oss -m gpt-oss:120b-cloud
```
</Accordion>
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
## Connecting to ollama.com
Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
To use ollama.com directly, edit your `~/.codex/config.toml` file to point to ollama.com.

View File

@@ -2,7 +2,6 @@
title: Droid
---
Droid is Factory's agentic coding tool for the command line.
## Install
@@ -12,80 +11,66 @@ Install the [Droid CLI](https://factory.ai/):
curl -fsSL https://app.factory.ai/cli | sh
```
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Usage with Ollama
Configure Droid to use Ollama:
```shell
ollama config droid
```
This will prompt you to select models and automatically configure Droid to use Ollama.
<Accordion title="Manual Configuration">
Add a local configuration block to `~/.factory/settings.json`:
Add a local configuration block to `~/.factory/config.json`:
```json
{
"customModels": [
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama]",
"model": "qwen3-coder",
"displayName": "qwen3-coder [Ollama]",
"baseUrl": "http://localhost:11434/v1",
"apiKey": "ollama",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"maxOutputTokens": 32000
"max_tokens": 32000
}
]
}
```
Adjust `maxOutputTokens` based on your model's context length (the automated setup detects this for you).
### Cloud Models
## Cloud Models
`qwen3-coder:480b-cloud` is the recommended model for use with Droid.
Add the cloud configuration block to `~/.factory/settings.json`:
Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"customModels": [
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b-cloud",
"displayName": "qwen3-coder:480b-cloud [Ollama]",
"baseUrl": "http://localhost:11434/v1",
"apiKey": "ollama",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"maxOutputTokens": 128000
"max_tokens": 128000
}
]
}
```
</Accordion>
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
2. Add the cloud configuration block to `~/.factory/settings.json`:
2. Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"customModels": [
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b",
"displayName": "qwen3-coder:480b [Ollama Cloud]",
"baseUrl": "https://ollama.com/v1",
"apiKey": "OLLAMA_API_KEY",
"base_url": "https://ollama.com/v1/",
"api_key": "OLLAMA_API_KEY",
"provider": "generic-chat-completion-api",
"maxOutputTokens": 128000
"max_tokens": 128000
}
]
}
```
Run `droid` in a new terminal to load the new settings.

View File

@@ -1,63 +0,0 @@
---
title: OpenCode
---
OpenCode is an agentic coding tool for the terminal.
## Install
Install [OpenCode](https://opencode.ai):
```shell
curl -fsSL https://opencode.ai/install | bash
```
## Usage with Ollama
Configure OpenCode to use Ollama:
```shell
ollama config opencode
```
This will prompt you to select models and automatically configure OpenCode to use Ollama.
<Accordion title="Manual Configuration">
Add the Ollama provider to `~/.config/opencode/opencode.json`:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama (local)",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen3-coder": {
"name": "qwen3-coder [Ollama]"
}
}
}
}
}
```
</Accordion>
<Note>OpenCode requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Recommended Models
### Cloud models
- `qwen3-coder:480b` - Large coding model
- `glm-4.7:cloud` - High-performance cloud model
- `minimax-m2.1:cloud` - Fast cloud model
### Local models
- `qwen3-coder` - Excellent for coding tasks
- `gpt-oss:20b` - Strong general-purpose model
- `gpt-oss:120b` - Larger general-purpose model for more complex tasks

View File

@@ -117,6 +117,15 @@ components:
top_logprobs:
type: integer
description: Number of most likely tokens to return at each token position when logprobs are enabled
width:
type: integer
description: (Experimental) Width of the generated image in pixels. For image generation models only.
height:
type: integer
description: (Experimental) Height of the generated image in pixels. For image generation models only.
steps:
type: integer
description: (Experimental) Number of diffusion steps. For image generation models only.
GenerateResponse:
type: object
properties:
@@ -161,6 +170,15 @@ components:
items:
$ref: "#/components/schemas/Logprob"
description: Log probability information for the generated tokens when logprobs are enabled
image:
type: string
description: (Experimental) Base64-encoded generated image data. For image generation models only.
completed:
type: integer
description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
total:
type: integer
description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
GenerateStreamEvent:
type: object
properties:
@@ -200,6 +218,15 @@ components:
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
image:
type: string
description: (Experimental) Base64-encoded generated image data. For image generation models only.
completed:
type: integer
description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
total:
type: integer
description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
ChatMessage:
type: object
required: [role, content]