mirror of
https://github.com/ollama/ollama.git
synced 2026-01-22 22:40:07 -05:00
Compare commits
1 Commits
brucemacd/
...
ollama-ima
| Author | SHA1 | Date | |
|---|---|---|---|
| | 8b4410633d | | |
205
docs/capabilities/image-generation.mdx
Normal file
@@ -0,0 +1,205 @@
---
title: Image Generation
---

<Warning>
Image generation is experimental and currently only available on macOS. This feature may change in future versions.
</Warning>

Image generation models create images from text prompts. Ollama supports diffusion-based image generation models through both Ollama's API and OpenAI-compatible endpoints.

## Usage

<Tabs>
<Tab title="CLI">
```shell
ollama run x/z-image-turbo "a sunset over mountains"
```
The generated image will be saved to the current directory.
</Tab>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "a sunset over mountains",
  "stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama
import base64

response = ollama.generate(
  model='x/z-image-turbo',
  prompt='a sunset over mountains',
)

# Save the generated image
with open('output.png', 'wb') as f:
  f.write(base64.b64decode(response['image']))

print('Image saved to output.png')
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'
import { writeFileSync } from 'fs'

const response = await ollama.generate({
  model: 'x/z-image-turbo',
  prompt: 'a sunset over mountains',
})

// Save the generated image
const imageBuffer = Buffer.from(response.image, 'base64')
writeFileSync('output.png', imageBuffer)

console.log('Image saved to output.png')
```
</Tab>
</Tabs>

### Response

The response includes an `image` field containing the base64-encoded image data:

```json
{
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:15.000000Z",
  "image": "iVBORw0KGgoAAAANSUhEUg...",
  "done": true,
  "done_reason": "stop",
  "total_duration": 15000000000,
  "load_duration": 2000000000
}
```

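Because the `image` field is base64 text rather than raw bytes, a client has to decode it before writing a file. The sketch below is one way to do that defensively; `save_generated_image` is a hypothetical helper name, and it assumes the response has already been parsed into a plain `dict` shaped like the JSON above.

```python
import base64
from pathlib import Path


def save_generated_image(response: dict, path: str = "output.png") -> int:
    """Decode the base64 `image` field and write it to `path`.

    Returns the number of bytes written. `response` is assumed to be a
    parsed /api/generate response like the JSON example above.
    """
    encoded = response.get("image")
    if not encoded:
        # Progress updates and non-image responses carry no `image` field.
        raise ValueError("response has no 'image' field")
    data = base64.b64decode(encoded)
    Path(path).write_bytes(data)
    return len(data)
```

Raising on a missing field keeps the helper from silently writing an empty file when it is handed a response that does not contain image data.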
## Image dimensions

Customize the output image size using the `width` and `height` parameters:

<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "a portrait of a robot artist",
  "width": 768,
  "height": 1024,
  "stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama

response = ollama.generate(
  model='x/z-image-turbo',
  prompt='a portrait of a robot artist',
  width=768,
  height=1024,
)
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'

const response = await ollama.generate({
  model: 'x/z-image-turbo',
  prompt: 'a portrait of a robot artist',
  width: 768,
  height: 1024,
})
```
</Tab>
</Tabs>

## Streaming progress

When streaming is enabled (the default), progress updates are sent during image generation:

```json
{
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:00.000000Z",
  "completed": 5,
  "total": 20,
  "done": false
}
```

The `completed` and `total` fields indicate the current progress through the diffusion steps.

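As a rough illustration of consuming these updates, the sketch below turns a sequence of streamed lines into progress fractions. It assumes each update arrives as one JSON object per line and that the last object has `done` set to `true` and carries the `image` field; `iter_progress` is a hypothetical helper, not part of any Ollama client library.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_progress(lines: Iterable[str]) -> Iterator[Tuple[float, Optional[str]]]:
    """Yield (fraction_complete, image_or_None) for each streamed JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        total = chunk.get("total") or 0
        completed = chunk.get("completed") or 0
        # Guard against a missing or zero `total` to avoid dividing by zero.
        fraction = completed / total if total else 0.0
        if chunk.get("done"):
            fraction = 1.0
        yield fraction, chunk.get("image")
```

With the progress update shown above (`completed` 5 of `total` 20), the helper yields a fraction of 0.25 and no image.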
## Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `prompt` | Text description of the image to generate | Required |
| `width` | Width of the generated image in pixels | Model default |
| `height` | Height of the generated image in pixels | Model default |
| `steps` | Number of diffusion steps | Model default |

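One way to honor the "Model default" column is to leave optional fields out of the request body entirely instead of sending placeholder values. The sketch below builds such a body; `build_generate_payload` is a hypothetical helper, and the positive-integer check is an assumption of this example, not a documented server rule.

```python
from typing import Optional


def build_generate_payload(
    model: str,
    prompt: str,
    width: Optional[int] = None,
    height: Optional[int] = None,
    steps: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Build a JSON-serializable request body for /api/generate."""
    if not prompt:
        raise ValueError("prompt is required")
    payload = {"model": model, "prompt": prompt, "stream": stream}
    for name, value in (("width", width), ("height", height), ("steps", steps)):
        if value is None:
            continue  # leave unset so the model default applies
        if value <= 0:
            # Assumption of this sketch: sizes and step counts must be positive.
            raise ValueError(f"{name} must be a positive integer")
        payload[name] = value
    return payload
```

Passing the result through `json.dumps` gives a string that can be used as the `-d` body in the cURL examples.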
## OpenAI compatibility

Image generation is also available through the OpenAI-compatible `/v1/images/generations` endpoint:

<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "x/z-image-turbo",
    "prompt": "a sunset over mountains",
    "size": "1024x1024",
    "response_format": "b64_json"
  }'
```
</Tab>
<Tab title="Python">
```python
from openai import OpenAI

client = OpenAI(
  base_url='http://localhost:11434/v1/',
  api_key='ollama',  # required but ignored
)

response = client.images.generate(
  model='x/z-image-turbo',
  prompt='a sunset over mountains',
  size='1024x1024',
  response_format='b64_json',
)

print(response.data[0].b64_json[:50] + '...')
```
</Tab>
<Tab title="JavaScript">
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',
  apiKey: 'ollama', // required but ignored
})

const response = await openai.images.generate({
  model: 'x/z-image-turbo',
  prompt: 'a sunset over mountains',
  size: '1024x1024',
  response_format: 'b64_json',
})

console.log(response.data[0].b64_json.slice(0, 50) + '...')
```
</Tab>
</Tabs>

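The OpenAI-style `size` value packs both dimensions into a single `WIDTHxHEIGHT` string, whereas Ollama's own API takes separate `width` and `height` integers. The sketch below converts between the two forms; `to_size` and `parse_size` are hypothetical helper names, and which sizes a given model actually accepts is not something this example checks.

```python
from typing import Tuple


def to_size(width: int, height: int) -> str:
    """Format separate dimensions as an OpenAI-style size string."""
    return f"{width}x{height}"


def parse_size(size: str) -> Tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string into integer (width, height)."""
    width_text, sep, height_text = size.lower().partition("x")
    if not sep or not width_text.isdigit() or not height_text.isdigit():
        raise ValueError(f"expected WIDTHxHEIGHT, got {size!r}")
    return int(width_text), int(height_text)
```

For example, `to_size(768, 1024)` produces the string `768x1024`, and `parse_size('1024x1024')` returns the pair `(1024, 1024)`.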
See [OpenAI compatibility](/api/openai-compatibility#v1imagesgenerations-experimental) for more details.
@@ -93,6 +93,7 @@
"/capabilities/thinking",
"/capabilities/structured-outputs",
"/capabilities/vision",
"/capabilities/image-generation",
"/capabilities/embeddings",
"/capabilities/tool-calling",
"/capabilities/web-search"
@@ -117,6 +117,15 @@ components:
        top_logprobs:
          type: integer
          description: Number of most likely tokens to return at each token position when logprobs are enabled
        width:
          type: integer
          description: (Experimental) Width of the generated image in pixels. For image generation models only.
        height:
          type: integer
          description: (Experimental) Height of the generated image in pixels. For image generation models only.
        steps:
          type: integer
          description: (Experimental) Number of diffusion steps. For image generation models only.
    GenerateResponse:
      type: object
      properties:
@@ -161,6 +170,15 @@ components:
          items:
            $ref: "#/components/schemas/Logprob"
          description: Log probability information for the generated tokens when logprobs are enabled
        image:
          type: string
          description: (Experimental) Base64-encoded generated image data. For image generation models only.
        completed:
          type: integer
          description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
        total:
          type: integer
          description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
    GenerateStreamEvent:
      type: object
      properties:
@@ -200,6 +218,15 @@ components:
        eval_duration:
          type: integer
          description: Time spent generating tokens in nanoseconds
        image:
          type: string
          description: (Experimental) Base64-encoded generated image data. For image generation models only.
        completed:
          type: integer
          description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
        total:
          type: integer
          description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
    ChatMessage:
      type: object
      required: [role, content]