1 Commits

Author SHA1 Message Date
jmorganca
8b4410633d Add image generation documentation
- Add image generation capability page with API usage examples
- Add image-generation to docs.json navigation
- Update openapi.yaml with image generation request/response fields
  - Request: width, height, steps
  - Response: image, completed, total
2026-01-22 14:09:58 -08:00
3 changed files with 233 additions and 0 deletions


@@ -0,0 +1,205 @@
---
title: Image Generation
---
<Warning>
Image generation is experimental and currently only available on macOS. This feature may change in future versions.
</Warning>
Image generation models create images from text prompts. Ollama supports diffusion-based image generation models through both its own API and OpenAI-compatible endpoints.
## Usage
<Tabs>
<Tab title="CLI">
```shell
ollama run x/z-image-turbo "a sunset over mountains"
```
The generated image will be saved to the current directory.
</Tab>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "a sunset over mountains",
  "stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import base64

import ollama

response = ollama.generate(
    model='x/z-image-turbo',
    prompt='a sunset over mountains',
)

# Save the generated image
with open('output.png', 'wb') as f:
    f.write(base64.b64decode(response['image']))
print('Image saved to output.png')
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'
import { writeFileSync } from 'fs'

const response = await ollama.generate({
  model: 'x/z-image-turbo',
  prompt: 'a sunset over mountains',
})

// Save the generated image
const imageBuffer = Buffer.from(response.image, 'base64')
writeFileSync('output.png', imageBuffer)
console.log('Image saved to output.png')
```
</Tab>
</Tabs>
### Response
The response includes an `image` field containing the base64-encoded image data:
```json
{
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:15.000000Z",
  "image": "iVBORw0KGgoAAAANSUhEUg...",
  "done": true,
  "done_reason": "stop",
  "total_duration": 15000000000,
  "load_duration": 2000000000
}
```
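As a minimal sketch of handling this response, the helper below decodes the `image` field and checks for the PNG signature before writing the bytes to disk. The function names `decode_image` and `save_image` are hypothetical, and the PNG check is an assumption based on the sample above, whose `iVBORw0KGgo` prefix is how base64-encoded PNG data begins. The output format is not otherwise stated on this page.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_image(response: dict) -> bytes:
    """Return the raw image bytes from a parsed generate response (hypothetical helper)."""
    encoded = response.get("image")
    if not encoded:
        raise ValueError("response has no 'image' field")
    return base64.b64decode(encoded)


def save_image(response: dict, path: str = "output.png") -> Path:
    """Decode the image and write it to `path`, warning if it does not look like a PNG."""
    data = decode_image(response)
    if not data.startswith(PNG_SIGNATURE):
        # Assumption: PNG output. Other formats would need a different extension.
        print("warning: data does not start with the PNG signature")
    out = Path(path)
    out.write_bytes(data)
    return out
```

Here `response` is the parsed JSON body as a plain `dict`. Progress events that carry no `image` field raise `ValueError` rather than producing an empty file.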
## Image dimensions
Customize the output image size using the `width` and `height` parameters:
<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "a portrait of a robot artist",
  "width": 768,
  "height": 1024,
  "stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama

response = ollama.generate(
    model='x/z-image-turbo',
    prompt='a portrait of a robot artist',
    width=768,
    height=1024,
)
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'

const response = await ollama.generate({
  model: 'x/z-image-turbo',
  prompt: 'a portrait of a robot artist',
  width: 768,
  height: 1024,
})
```
</Tab>
</Tabs>
## Streaming progress
When streaming is enabled (the default), progress updates are sent during image generation:
```json
{
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:00.000000Z",
  "completed": 5,
  "total": 20,
  "done": false
}
```
The `completed` and `total` fields indicate the current progress through the diffusion steps.
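One way to turn these events into a progress readout is sketched below. It assumes the stream arrives as newline-delimited JSON, one event per line, and the helper name `iter_progress` is hypothetical. Events without a `total` field are skipped.

```python
import json
from typing import Iterable, Iterator


def iter_progress(lines: Iterable[str]) -> Iterator[float]:
    """Yield the fraction of completed diffusion steps for each progress event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between events
        event = json.loads(line)
        total = event.get("total")
        if total:  # skip events that carry no step counts
            yield event.get("completed", 0) / total


# Sample events shaped like the progress update shown above.
sample = [
    '{"model": "x/z-image-turbo", "completed": 5, "total": 20, "done": false}',
    '{"model": "x/z-image-turbo", "completed": 20, "total": 20, "done": false}',
]
for fraction in iter_progress(sample):
    print(f"{fraction:.0%}")  # prints 25%, then 100%
```

The same generator can be fed the lines of a streaming HTTP response instead of the in-memory sample.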
## Parameters
| Parameter | Description | Default |
|-----------|-------------|---------|
| `prompt` | Text description of the image to generate | Required |
| `width` | Width of the generated image in pixels | Model default |
| `height` | Height of the generated image in pixels | Model default |
| `steps` | Number of diffusion steps | Model default |
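The sketch below shows one way to assemble a request body from these parameters using only the standard library. Optional fields are left out when unset so that the model defaults in the table apply. The helper names `build_payload` and `generate` are hypothetical, and the positive-integer check is an assumption rather than a documented constraint.

```python
import json
import urllib.request
from typing import Optional


def build_payload(model: str, prompt: str, width: Optional[int] = None,
                  height: Optional[int] = None, steps: Optional[int] = None) -> dict:
    """Build an /api/generate body, leaving out optional fields that are unset."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    optional = {"width": width, "height": height, "steps": steps}
    for name, value in optional.items():
        if value is None:
            continue  # omitted fields fall back to the model default
        if value <= 0:
            # Assumption: sizes and step counts must be positive.
            raise ValueError(f"{name} must be a positive integer")
        payload[name] = value
    return payload


def generate(payload: dict, host: str = "http://localhost:11434") -> dict:
    """POST the payload to a local Ollama server and return the parsed JSON."""
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

For example, `build_payload('x/z-image-turbo', 'a sunset over mountains', steps=8)` sets only `steps` and leaves `width` and `height` to the model.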
## OpenAI compatibility
Image generation is also available through the OpenAI-compatible `/v1/images/generations` endpoint:
<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "x/z-image-turbo",
    "prompt": "a sunset over mountains",
    "size": "1024x1024",
    "response_format": "b64_json"
  }'
```
</Tab>
<Tab title="Python">
```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

response = client.images.generate(
    model='x/z-image-turbo',
    prompt='a sunset over mountains',
    size='1024x1024',
    response_format='b64_json',
)
print(response.data[0].b64_json[:50] + '...')
```
</Tab>
<Tab title="JavaScript">
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',
  apiKey: 'ollama', // required but ignored
})

const response = await openai.images.generate({
  model: 'x/z-image-turbo',
  prompt: 'a sunset over mountains',
  size: '1024x1024',
  response_format: 'b64_json',
})
console.log(response.data[0].b64_json.slice(0, 50) + '...')
```
</Tab>
</Tabs>
See [OpenAI compatibility](/api/openai-compatibility#v1imagesgenerations-experimental) for more details.
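To save the result of an OpenAI-compatible request, a sketch like the following could be used. It works on the parsed JSON body, assumed to contain a `data` list whose entries carry `b64_json`, matching the `response.data[0].b64_json` access in the examples above. The helper name `save_b64_images`, the `image-<n>.png` file naming, and the `.png` extension are illustrative assumptions.

```python
import base64
from pathlib import Path


def save_b64_images(body: dict, directory: str = ".") -> list:
    """Write every `b64_json` entry in `data` to image-<n>.png and return the paths."""
    paths = []
    for index, item in enumerate(body.get("data", [])):
        encoded = item.get("b64_json")
        if encoded is None:
            continue  # entry was not returned as base64 data
        path = Path(directory) / f"image-{index}.png"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

Requesting `response_format` set to `b64_json`, as in the examples above, is what makes the `b64_json` field available to decode.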


@@ -93,6 +93,7 @@
"/capabilities/thinking",
"/capabilities/structured-outputs",
"/capabilities/vision",
"/capabilities/image-generation",
"/capabilities/embeddings",
"/capabilities/tool-calling",
"/capabilities/web-search"


@@ -117,6 +117,15 @@ components:
        top_logprobs:
          type: integer
          description: Number of most likely tokens to return at each token position when logprobs are enabled
        width:
          type: integer
          description: (Experimental) Width of the generated image in pixels. For image generation models only.
        height:
          type: integer
          description: (Experimental) Height of the generated image in pixels. For image generation models only.
        steps:
          type: integer
          description: (Experimental) Number of diffusion steps. For image generation models only.
    GenerateResponse:
      type: object
      properties:
@@ -161,6 +170,15 @@ components:
          items:
            $ref: "#/components/schemas/Logprob"
          description: Log probability information for the generated tokens when logprobs are enabled
        image:
          type: string
          description: (Experimental) Base64-encoded generated image data. For image generation models only.
        completed:
          type: integer
          description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
        total:
          type: integer
          description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
    GenerateStreamEvent:
      type: object
      properties:
@@ -200,6 +218,15 @@ components:
        eval_duration:
          type: integer
          description: Time spent generating tokens in nanoseconds
        image:
          type: string
          description: (Experimental) Base64-encoded generated image data. For image generation models only.
        completed:
          type: integer
          description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
        total:
          type: integer
          description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
    ChatMessage:
      type: object
      required: [role, content]