Add image generation documentation

- Add image generation capability page with API usage examples
- Add image-generation to docs.json navigation
- Update openapi.yaml with image generation request/response fields
  - Request: width, height, steps
  - Response: image, completed, total
2026-01-22 22:40:07 -05:00 · 2026-01-22 14:09:58 -08:00
3 changed files with 233 additions and 0 deletions
--- a/docs/capabilities/image-generation.mdx
+++ b/docs/capabilities/image-generation.mdx
@@ -0,0 +1,205 @@
+---
+title: Image Generation
+---
+
+<Warning>
+Image generation is experimental and currently only available on macOS. This feature may change in future versions.
+</Warning>
+
+Image generation models create images from text prompts. Ollama supports diffusion-based image generation models through both its native API and the OpenAI-compatible `/v1/images/generations` endpoint.
+
+## Usage
+
+<Tabs>
+  <Tab title="CLI">
+    ```shell
+    ollama run x/z-image-turbo "a sunset over mountains"
+    ```
+    The generated image will be saved to the current directory.
+  </Tab>
+  <Tab title="cURL">
+    ```shell
+    curl http://localhost:11434/api/generate -d '{
+      "model": "x/z-image-turbo",
+      "prompt": "a sunset over mountains",
+      "stream": false
+    }'
+    ```
+  </Tab>
+  <Tab title="Python">
+    ```python
+    import ollama
+    import base64
+
+    response = ollama.generate(
+        model='x/z-image-turbo',
+        prompt='a sunset over mountains',
+    )
+
+    # Save the generated image
+    with open('output.png', 'wb') as f:
+        f.write(base64.b64decode(response['image']))
+
+    print('Image saved to output.png')
+    ```
+  </Tab>
+  <Tab title="JavaScript">
+    ```javascript
+    import ollama from 'ollama'
+    import { writeFileSync } from 'fs'
+
+    const response = await ollama.generate({
+      model: 'x/z-image-turbo',
+      prompt: 'a sunset over mountains',
+    })
+
+    // Save the generated image
+    const imageBuffer = Buffer.from(response.image, 'base64')
+    writeFileSync('output.png', imageBuffer)
+
+    console.log('Image saved to output.png')
+    ```
+  </Tab>
+</Tabs>
+
+### Response
+
+The response includes an `image` field containing the base64-encoded image data:
+
+```json
+{
+  "model": "x/z-image-turbo",
+  "created_at": "2024-01-15T10:30:15.000000Z",
+  "image": "iVBORw0KGgoAAAANSUhEUg...",
+  "done": true,
+  "done_reason": "stop",
+  "total_duration": 15000000000,
+  "load_duration": 2000000000
+}
+```
+
+## Image dimensions
+
+Set the output image size in pixels with the `width` and `height` parameters:
+
+<Tabs>
+  <Tab title="cURL">
+    ```shell
+    curl http://localhost:11434/api/generate -d '{
+      "model": "x/z-image-turbo",
+      "prompt": "a portrait of a robot artist",
+      "width": 768,
+      "height": 1024,
+      "stream": false
+    }'
+    ```
+  </Tab>
+  <Tab title="Python">
+    ```python
+    import ollama
+
+    response = ollama.generate(
+        model='x/z-image-turbo',
+        prompt='a portrait of a robot artist',
+        width=768,
+        height=1024,
+    )
+    ```
+  </Tab>
+  <Tab title="JavaScript">
+    ```javascript
+    import ollama from 'ollama'
+
+    const response = await ollama.generate({
+      model: 'x/z-image-turbo',
+      prompt: 'a portrait of a robot artist',
+      width: 768,
+      height: 1024,
+    })
+    ```
+  </Tab>
+</Tabs>
+
+## Streaming progress
+
+When streaming is enabled (the default for the REST API), Ollama sends progress updates while the image is being generated:
+
+```json
+{
+  "model": "x/z-image-turbo",
+  "created_at": "2024-01-15T10:30:00.000000Z",
+  "completed": 5,
+  "total": 20,
+  "done": false
+}
+```
+
+The `completed` and `total` fields indicate the current progress through the diffusion steps.
+
+## Parameters
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `prompt` | Text description of the image to generate | Required |
+| `width` | Width of the generated image in pixels | Model default |
+| `height` | Height of the generated image in pixels | Model default |
+| `steps` | Number of diffusion steps | Model default |
+
+## OpenAI compatibility
+
+Image generation is also available through the OpenAI-compatible `/v1/images/generations` endpoint:
+
+<Tabs>
+  <Tab title="cURL">
+    ```shell
+    curl http://localhost:11434/v1/images/generations \
+      -H "Content-Type: application/json" \
+      -d '{
+        "model": "x/z-image-turbo",
+        "prompt": "a sunset over mountains",
+        "size": "1024x1024",
+        "response_format": "b64_json"
+      }'
+    ```
+  </Tab>
+  <Tab title="Python">
+    ```python
+    from openai import OpenAI
+
+    client = OpenAI(
+        base_url='http://localhost:11434/v1/',
+        api_key='ollama',  # required but ignored
+    )
+
+    response = client.images.generate(
+        model='x/z-image-turbo',
+        prompt='a sunset over mountains',
+        size='1024x1024',
+        response_format='b64_json',
+    )
+
+    print(response.data[0].b64_json[:50] + '...')
+    ```
+  </Tab>
+  <Tab title="JavaScript">
+    ```javascript
+    import OpenAI from 'openai'
+
+    const openai = new OpenAI({
+      baseURL: 'http://localhost:11434/v1/',
+      apiKey: 'ollama', // required but ignored
+    })
+
+    const response = await openai.images.generate({
+      model: 'x/z-image-turbo',
+      prompt: 'a sunset over mountains',
+      size: '1024x1024',
+      response_format: 'b64_json',
+    })
+
+    console.log(response.data[0].b64_json.slice(0, 50) + '...')
+    ```
+  </Tab>
+</Tabs>
+
+See [OpenAI compatibility](/api/openai-compatibility#v1imagesgenerations-experimental) for more details.
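
Note on the "Streaming progress" section of `image-generation.mdx`: the page shows the shape of a progress event but not how a client might consume the stream. The following is a minimal Python sketch, not part of this change. It assumes that `/api/generate` streams one JSON object per line and that the last object (`done: true`) carries the `image` field. The helper name `consume_image_stream` is hypothetical, and the sample lines stand in for a real HTTP response body.

```python
import base64
import json
from typing import Iterable, Optional


def consume_image_stream(lines: Iterable[str]) -> Optional[bytes]:
    """Print diffusion progress and return the decoded image bytes, if any.

    Assumption: each element of `lines` is one JSON object that may carry
    the `completed`, `total`, `done`, and `image` fields from this change.
    """
    image_bytes = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        if "completed" in event and "total" in event:
            print(f"step {event['completed']}/{event['total']}")
        if event.get("image"):
            # The image arrives base64-encoded; decode it to raw bytes.
            image_bytes = base64.b64decode(event["image"])
        if event.get("done"):
            break
    return image_bytes


# Simulated stream; a real client would iterate over the HTTP response lines.
sample = [
    json.dumps({"completed": 5, "total": 20, "done": False}),
    json.dumps({"completed": 20, "total": 20, "done": False}),
    json.dumps({"image": base64.b64encode(b"fake-png-bytes").decode(), "done": True}),
]
result = consume_image_stream(sample)
```

The returned bytes can be written to a file opened in `'wb'` mode, as in the non-streaming Python example on the page.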
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -93,6 +93,7 @@
              "/capabilities/thinking",
              "/capabilities/structured-outputs",
              "/capabilities/vision",
+              "/capabilities/image-generation",
              "/capabilities/embeddings",
              "/capabilities/tool-calling",
              "/capabilities/web-search"
--- a/docs/openapi.yaml
+++ b/docs/openapi.yaml
@@ -117,6 +117,15 @@ components:
        top_logprobs:
          type: integer
          description: Number of most likely tokens to return at each token position when logprobs are enabled
+        width:
+          type: integer
+          description: (Experimental) Width of the generated image in pixels. For image generation models only.
+        height:
+          type: integer
+          description: (Experimental) Height of the generated image in pixels. For image generation models only.
+        steps:
+          type: integer
+          description: (Experimental) Number of diffusion steps. For image generation models only.
    GenerateResponse:
      type: object
      properties:
@@ -161,6 +170,15 @@ components:
          items:
            $ref: "#/components/schemas/Logprob"
          description: Log probability information for the generated tokens when logprobs are enabled
+        image:
+          type: string
+          description: (Experimental) Base64-encoded generated image data. For image generation models only.
+        completed:
+          type: integer
+          description: (Experimental) Number of completed diffusion steps. Reported in streaming progress updates for image generation models.
+        total:
+          type: integer
+          description: (Experimental) Total number of diffusion steps. Reported in streaming progress updates for image generation models.
    GenerateStreamEvent:
      type: object
      properties:
@@ -200,6 +218,15 @@ components:
        eval_duration:
          type: integer
          description: Time spent generating tokens in nanoseconds
+        image:
+          type: string
+          description: (Experimental) Base64-encoded generated image data. For image generation models only.
+        completed:
+          type: integer
+          description: (Experimental) Number of completed diffusion steps. Reported in streaming progress updates for image generation models.
+        total:
+          type: integer
+          description: (Experimental) Total number of diffusion steps. Reported in streaming progress updates for image generation models.
    ChatMessage:
      type: object
      required: [role, content]
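
Note on the request fields added to `openapi.yaml`: `width`, `height`, and `steps` are all optional integers, and the capability page lists their default as "Model default". A small sketch of how a client could build the request body so that unset fields are simply omitted and the model defaults apply. The function name `build_image_request` and the positive-integer check are assumptions for illustration, not behavior defined by this change.

```python
import json
from typing import Any, Dict, Optional


def build_image_request(
    model: str,
    prompt: str,
    width: Optional[int] = None,
    height: Optional[int] = None,
    steps: Optional[int] = None,
    stream: bool = False,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for POST /api/generate.

    Optional fields left as None are omitted so the model defaults are used.
    """
    body: Dict[str, Any] = {"model": model, "prompt": prompt, "stream": stream}
    for name, value in (("width", width), ("height", height), ("steps", steps)):
        if value is None:
            continue
        # Assumed client-side check: the schema only says `type: integer`.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
        body[name] = value
    return body


payload = build_image_request(
    "x/z-image-turbo",
    "a portrait of a robot artist",
    width=768,
    height=1024,
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary matches the cURL body shown in the "Image dimensions" section and can be sent with any HTTP client.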