Compare commits


1 Commit

jmorganca · 8b4410633d · 2026-01-22 14:09:58 -08:00

Add image generation documentation

- Add image generation capability page with API usage examples
- Add image-generation to docs.json navigation
- Update openapi.yaml with image generation request/response fields
  - Request: width, height, steps
  - Response: image, completed, total
7 changed files with 264 additions and 132 deletions

View File

@@ -0,0 +1,205 @@
---
title: Image Generation
---
<Warning>
Image generation is experimental and currently only available on macOS. This feature may change in future versions.
</Warning>
Image generation models create images from text prompts. Ollama supports diffusion-based image generation models through both its native API and its OpenAI-compatible endpoints.
## Usage
<Tabs>
<Tab title="CLI">
```shell
ollama run x/z-image-turbo "a sunset over mountains"
```
The generated image will be saved to the current directory.
</Tab>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
"model": "x/z-image-turbo",
"prompt": "a sunset over mountains",
"stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama
import base64
response = ollama.generate(
model='x/z-image-turbo',
prompt='a sunset over mountains',
)
# Save the generated image
with open('output.png', 'wb') as f:
f.write(base64.b64decode(response['image']))
print('Image saved to output.png')
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'
import { writeFileSync } from 'fs'
const response = await ollama.generate({
model: 'x/z-image-turbo',
prompt: 'a sunset over mountains',
})
// Save the generated image
const imageBuffer = Buffer.from(response.image, 'base64')
writeFileSync('output.png', imageBuffer)
console.log('Image saved to output.png')
```
</Tab>
</Tabs>
### Response
The response includes an `image` field containing the base64-encoded image data:
```json
{
"model": "x/z-image-turbo",
"created_at": "2024-01-15T10:30:15.000000Z",
"image": "iVBORw0KGgoAAAANSUhEUg...",
"done": true,
"done_reason": "stop",
"total_duration": 15000000000,
"load_duration": 2000000000
}
```
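The `image` field can also be decoded and written to disk when calling the raw endpoint directly. A minimal sketch using Python's `requests`; the model name and output path are placeholders:
```python
import base64
import requests

# Request the image without streaming so the full result
# arrives in a single JSON response
resp = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'x/z-image-turbo',
        'prompt': 'a sunset over mountains',
        'stream': False,
    },
)
resp.raise_for_status()

# The `image` field holds the base64-encoded image data
with open('output.png', 'wb') as f:
    f.write(base64.b64decode(resp.json()['image']))
```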
## Image dimensions
Customize the output image size using the `width` and `height` parameters:
<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/api/generate -d '{
"model": "x/z-image-turbo",
"prompt": "a portrait of a robot artist",
"width": 768,
"height": 1024,
"stream": false
}'
```
</Tab>
<Tab title="Python">
```python
import ollama
response = ollama.generate(
model='x/z-image-turbo',
prompt='a portrait of a robot artist',
width=768,
height=1024,
)
```
</Tab>
<Tab title="JavaScript">
```javascript
import ollama from 'ollama'
const response = await ollama.generate({
model: 'x/z-image-turbo',
prompt: 'a portrait of a robot artist',
width: 768,
height: 1024,
})
```
</Tab>
</Tabs>
## Streaming progress
When streaming is enabled (the default), progress updates are sent during image generation:
```json
{
"model": "x/z-image-turbo",
"created_at": "2024-01-15T10:30:00.000000Z",
"completed": 5,
"total": 20,
"done": false
}
```
The `completed` and `total` fields indicate the current progress through the diffusion steps.
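A minimal sketch of consuming these progress events with Python's `requests`, assuming each streamed line is a standalone JSON object as in Ollama's other streaming responses:
```python
import json
import requests

# Stream the response; each non-empty line is one JSON event
with requests.post(
    'http://localhost:11434/api/generate',
    json={'model': 'x/z-image-turbo', 'prompt': 'a sunset over mountains'},
    stream=True,
) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        event = json.loads(line)
        if 'completed' in event and 'total' in event:
            print(f"step {event['completed']}/{event['total']}")
        if event.get('done'):
            # the final event carries the base64-encoded `image` field
            break
```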
## Parameters
| Parameter | Description | Default |
|-----------|-------------|---------|
| `prompt` | Text description of the image to generate | Required |
| `width` | Width of the generated image in pixels | Model default |
| `height` | Height of the generated image in pixels | Model default |
| `steps` | Number of diffusion steps | Model default |
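For example, lowering `steps` trades image fidelity for speed. A sketch passing it alongside the other parameters, assuming the Python client forwards `steps` the same way it forwards `width` and `height` in the examples above:
```python
import ollama

# Fewer diffusion steps: faster generation, lower fidelity
response = ollama.generate(
    model='x/z-image-turbo',
    prompt='a sunset over mountains',
    width=512,
    height=512,
    steps=8,
)
```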
## OpenAI compatibility
Image generation is also available through the OpenAI-compatible `/v1/images/generations` endpoint:
<Tabs>
<Tab title="cURL">
```shell
curl http://localhost:11434/v1/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "x/z-image-turbo",
"prompt": "a sunset over mountains",
"size": "1024x1024",
"response_format": "b64_json"
}'
```
</Tab>
<Tab title="Python">
```python
from openai import OpenAI
client = OpenAI(
base_url='http://localhost:11434/v1/',
api_key='ollama', # required but ignored
)
response = client.images.generate(
model='x/z-image-turbo',
prompt='a sunset over mountains',
size='1024x1024',
response_format='b64_json',
)
print(response.data[0].b64_json[:50] + '...')
```
</Tab>
<Tab title="JavaScript">
```javascript
import OpenAI from 'openai'
const openai = new OpenAI({
baseURL: 'http://localhost:11434/v1/',
apiKey: 'ollama', // required but ignored
})
const response = await openai.images.generate({
model: 'x/z-image-turbo',
prompt: 'a sunset over mountains',
size: '1024x1024',
response_format: 'b64_json',
})
console.log(response.data[0].b64_json.slice(0, 50) + '...')
```
</Tab>
</Tabs>
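As with the native API, the returned `b64_json` payload can be decoded and saved to disk; for instance, continuing the Python example above:
```python
import base64

# `response` is the ImagesResponse returned by client.images.generate(...)
with open('output.png', 'wb') as f:
    f.write(base64.b64decode(response.data[0].b64_json))
```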
See [OpenAI compatibility](/api/openai-compatibility#v1imagesgenerations-experimental) for more details.

View File

@@ -93,6 +93,7 @@
"/capabilities/thinking",
"/capabilities/structured-outputs",
"/capabilities/vision",
"/capabilities/image-generation",
"/capabilities/embeddings",
"/capabilities/tool-calling",
"/capabilities/web-search"

View File

@@ -2,7 +2,7 @@
title: Claude Code
---
Claude Code is Anthropic's agentic coding tool that can read, modify, and execute code in your working directory.
Open models can be used with Claude Code through Ollama's Anthropic-compatible API, enabling you to use models such as `qwen3-coder`, `gpt-oss:20b`, or other models.
@@ -26,16 +26,6 @@ irm https://claude.ai/install.ps1 | iex
## Usage with Ollama
Configure Claude Code to use Ollama:
```shell
ollama config claude
```
This will prompt you to select a model and automatically configure Claude Code to use Ollama.
<Accordion title="Manual Configuration">
Claude Code connects to Ollama using the Anthropic-compatible API.
1. Set the environment variables:
@@ -57,9 +47,7 @@ Or run with environment variables inline:
ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 claude --model gpt-oss:20b
```
</Accordion>
<Note>Claude Code requires a large context window. We recommend at least 32K tokens. See the [context length documentation](/context-length) for how to adjust context length in Ollama.</Note>
**Note:** Claude Code requires a large context window. We recommend at least 32K tokens. See the [context length documentation](/context-length) for how to adjust context length in Ollama.
## Connecting to ollama.com
@@ -87,4 +75,4 @@ claude --model glm-4.7:cloud
### Local models
- `qwen3-coder` - Excellent for coding tasks
- `gpt-oss:20b` - Strong general-purpose model
- `gpt-oss:120b` - Larger general-purpose model for more complex tasks

View File

@@ -2,31 +2,22 @@
title: Codex
---
Codex is OpenAI's agentic coding tool for the command line.
## Install
Install the [Codex CLI](https://developers.openai.com/codex/cli/):
```shell
```
npm install -g @openai/codex
```
## Usage with Ollama
Configure Codex to use Ollama:
```shell
ollama config codex
```
This will prompt you to select a model and automatically configure Codex to use Ollama.
<Accordion title="Manual Configuration">
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
To use `codex` with Ollama, use the `--oss` flag:
```shell
```
codex --oss
```
@@ -34,22 +25,20 @@ codex --oss
By default, codex will use the local `gpt-oss:20b` model. However, you can specify a different model with the `-m` flag:
```shell
```
codex --oss -m gpt-oss:120b
```
### Cloud Models
```shell
```
codex --oss -m gpt-oss:120b-cloud
```
</Accordion>
<Note>Codex requires a larger context window. It is recommended to use a context window of at least 32K tokens.</Note>
## Connecting to ollama.com
Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
To use ollama.com directly, edit your `~/.codex/config.toml` file to point to ollama.com.

View File

@@ -2,7 +2,6 @@
title: Droid
---
Droid is Factory's agentic coding tool for the command line.
## Install
@@ -12,80 +11,66 @@ Install the [Droid CLI](https://factory.ai/):
curl -fsSL https://app.factory.ai/cli | sh
```
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Usage with Ollama
Configure Droid to use Ollama:
```shell
ollama config droid
```
This will prompt you to select models and automatically configure Droid to use Ollama.
<Accordion title="Manual Configuration">
Add a local configuration block to `~/.factory/settings.json`:
Add a local configuration block to `~/.factory/config.json`:
```json
{
"customModels": [
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama]",
"model": "qwen3-coder",
"displayName": "qwen3-coder [Ollama]",
"baseUrl": "http://localhost:11434/v1",
"apiKey": "ollama",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"maxOutputTokens": 32000
"max_tokens": 32000
}
]
}
```
Adjust `maxOutputTokens` based on your model's context length (the automated setup detects this for you).
### Cloud Models
## Cloud Models
`qwen3-coder:480b-cloud` is the recommended model for use with Droid.
Add the cloud configuration block to `~/.factory/settings.json`:
Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"customModels": [
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b-cloud",
"displayName": "qwen3-coder:480b-cloud [Ollama]",
"baseUrl": "http://localhost:11434/v1",
"apiKey": "ollama",
"base_url": "http://localhost:11434/v1/",
"api_key": "not-needed",
"provider": "generic-chat-completion-api",
"maxOutputTokens": 128000
"max_tokens": 128000
}
]
}
```
</Accordion>
<Note>Droid requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Connecting to ollama.com
1. Create an [API key](https://ollama.com/settings/keys) from ollama.com and export it as `OLLAMA_API_KEY`.
2. Add the cloud configuration block to `~/.factory/settings.json`:
2. Add the cloud configuration block to `~/.factory/config.json`:
```json
{
"customModels": [
"custom_models": [
{
"model_display_name": "qwen3-coder [Ollama Cloud]",
"model": "qwen3-coder:480b",
"displayName": "qwen3-coder:480b [Ollama Cloud]",
"baseUrl": "https://ollama.com/v1",
"apiKey": "OLLAMA_API_KEY",
"base_url": "https://ollama.com/v1/",
"api_key": "OLLAMA_API_KEY",
"provider": "generic-chat-completion-api",
"maxOutputTokens": 128000
"max_tokens": 128000
}
]
}
```
Run `droid` in a new terminal to load the new settings.

View File

@@ -1,63 +0,0 @@
---
title: OpenCode
---
OpenCode is an agentic coding tool for the terminal.
## Install
Install [OpenCode](https://opencode.ai):
```shell
curl -fsSL https://opencode.ai/install | bash
```
## Usage with Ollama
Configure OpenCode to use Ollama:
```shell
ollama config opencode
```
This will prompt you to select models and automatically configure OpenCode to use Ollama.
<Accordion title="Manual Configuration">
Add the Ollama provider to `~/.config/opencode/opencode.json`:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Ollama (local)",
"options": {
"baseURL": "http://localhost:11434/v1"
},
"models": {
"qwen3-coder": {
"name": "qwen3-coder [Ollama]"
}
}
}
}
}
```
</Accordion>
<Note>OpenCode requires a larger context window. It is recommended to use a context window of at least 32K tokens. See [Context length](/context-length) for more information.</Note>
## Recommended Models
### Cloud models
- `qwen3-coder:480b` - Large coding model
- `glm-4.7:cloud` - High-performance cloud model
- `minimax-m2.1:cloud` - Fast cloud model
### Local models
- `qwen3-coder` - Excellent for coding tasks
- `gpt-oss:20b` - Strong general-purpose model
- `gpt-oss:120b` - Larger general-purpose model for more complex tasks

View File

@@ -117,6 +117,15 @@ components:
top_logprobs:
type: integer
description: Number of most likely tokens to return at each token position when logprobs are enabled
width:
type: integer
description: (Experimental) Width of the generated image in pixels. For image generation models only.
height:
type: integer
description: (Experimental) Height of the generated image in pixels. For image generation models only.
steps:
type: integer
description: (Experimental) Number of diffusion steps. For image generation models only.
GenerateResponse:
type: object
properties:
@@ -161,6 +170,15 @@ components:
items:
$ref: "#/components/schemas/Logprob"
description: Log probability information for the generated tokens when logprobs are enabled
image:
type: string
description: (Experimental) Base64-encoded generated image data. For image generation models only.
completed:
type: integer
description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
total:
type: integer
description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
GenerateStreamEvent:
type: object
properties:
@@ -200,6 +218,15 @@ components:
eval_duration:
type: integer
description: Time spent generating tokens in nanoseconds
image:
type: string
description: (Experimental) Base64-encoded generated image data. For image generation models only.
completed:
type: integer
description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
total:
type: integer
description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
ChatMessage:
type: object
required: [role, content]