+++
title = "API Discovery & Instructions"
weight = 27
toc = true
description = "Programmatic API discovery for agents, tools, and automation"
tags = ["API", "Agents", "Instructions", "Configuration", "Advanced"]
categories = ["Features"]
+++
LocalAI exposes a set of discovery endpoints that let external agents, coding assistants, and automation tools programmatically learn what the instance can do and how to control it — without reading documentation ahead of time.
## Quick start
```bash
# 1. Discover what's available
curl http://localhost:8080/.well-known/localai.json
# 2. Browse instruction areas
curl http://localhost:8080/api/instructions
# 3. Get an API guide for a specific instruction
curl http://localhost:8080/api/instructions/config-management
```
## Well-Known Discovery Endpoint
`GET /.well-known/localai.json`
Returns the instance version, all available endpoint URLs (flat and categorized), and runtime capabilities.
**Example response (abbreviated):**
```json
{
  "version": "v2.28.0",
  "endpoints": {
    "chat_completions": "/v1/chat/completions",
    "models": "/v1/models",
    "config_metadata": "/api/models/config-metadata",
    "instructions": "/api/instructions",
    "swagger": "/swagger/index.html"
  },
  "endpoint_groups": {
    "openai_compatible": { "chat_completions": "/v1/chat/completions", "...": "..." },
    "config_management": { "config_metadata": "/api/models/config-metadata", "...": "..." },
    "model_management": { "...": "..." },
    "monitoring": { "...": "..." }
  },
  "capabilities": {
    "config_metadata": true,
    "config_patch": true,
    "vram_estimate": true,
    "mcp": true,
    "agents": false,
    "p2p": false
  }
}
```
The `capabilities` object reflects the current runtime configuration — for example, `mcp` is only `true` if MCP is enabled, and `agents` is `true` only if the agent pool is running.
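Because capabilities vary per instance, a client should check them before calling optional endpoints. A minimal sketch of that gating, using the abbreviated response above as sample data:

```python
# Sample data mirroring the abbreviated /.well-known/localai.json response above.
discovery = {
    "version": "v2.28.0",
    "endpoints": {"chat_completions": "/v1/chat/completions"},
    "capabilities": {"mcp": True, "config_patch": True, "agents": False, "p2p": False},
}

def supports(discovery: dict, capability: str) -> bool:
    """Return True only if the instance explicitly advertises the capability."""
    return bool(discovery.get("capabilities", {}).get(capability, False))

if supports(discovery, "mcp"):
    print("MCP is enabled; tool-augmented chat is available")
if not supports(discovery, "agents"):
    print("Agent pool is not running; skipping agent features")
```

Treating a missing key as `False` keeps the client safe against older instances whose discovery document predates a given capability.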
## Instructions API
Instructions are curated groups of related API endpoints. Each instruction maps to one or more Swagger tags and provides a focused, LLM-readable guide.
### List all instructions
`GET /api/instructions`
```bash
curl http://localhost:8080/api/instructions
```
Returns a compact list of instruction areas:
```json
{
  "instructions": [
    {
      "name": "chat-inference",
      "description": "OpenAI-compatible chat completions, text completions, and embeddings",
      "tags": ["inference", "embeddings"],
      "url": "/api/instructions/chat-inference"
    },
    {
      "name": "config-management",
      "description": "Discover, read, and modify model configuration fields with VRAM estimation",
      "tags": ["config"],
      "url": "/api/instructions/config-management"
    }
  ],
  "hint": "Fetch GET {url} for a markdown API guide. Add ?format=json for a raw OpenAPI fragment."
}
```
**Available instructions:**
| Instruction | Description |
|-------------|-------------|
| `chat-inference` | Chat completions, text completions, embeddings (OpenAI-compatible) |
| `audio` | Text-to-speech, transcription, voice activity detection, sound generation |
| `images` | Image generation and inpainting |
| `model-management` | Browse gallery, install, delete, manage models and backends |
| `config-management` | Discover, read, and modify model config fields with VRAM estimation |
| `monitoring` | System metrics, backend status, system information |
| `mcp` | Model Context Protocol — tool-augmented chat with MCP servers |
| `agents` | Agent task and job management |
| `video` | Video generation from text prompts |
### Get an instruction guide
`GET /api/instructions/:name`
By default, returns a **markdown guide** suitable for LLMs and humans:
```bash
curl http://localhost:8080/api/instructions/config-management
```
Add `?format=json` to get a raw **OpenAPI fragment** (filtered Swagger spec with only the relevant paths and definitions):
```bash
curl "http://localhost:8080/api/instructions/config-management?format=json"
```
## Configuration Management APIs
These endpoints let agents discover model configuration fields, read current settings, modify them, and estimate VRAM usage.
### Config metadata
`GET /api/models/config-metadata`
Returns structured metadata for all model configuration fields, organized by section. Each field includes its YAML path, Go type, UI type, label, description, default value, validation constraints, and available options.
```bash
# All fields
curl http://localhost:8080/api/models/config-metadata
# Filter by section
curl "http://localhost:8080/api/models/config-metadata?section=parameters"
```
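An agent consuming this metadata will typically index fields by section before presenting or editing them. The sketch below uses hypothetical field entries modeled on the attributes listed above (YAML path, section, type, default); the real response shape may differ:

```python
# Hypothetical metadata entries; field names and structure are illustrative,
# not the actual /api/models/config-metadata schema.
fields = [
    {"path": "context_size", "section": "parameters", "type": "int", "default": 4096},
    {"path": "gpu_layers", "section": "parameters", "type": "int", "default": 0},
    {"path": "backend", "section": "general", "type": "string", "default": ""},
]

def by_section(fields: list) -> dict:
    """Group field YAML paths by their section name."""
    index = {}
    for field in fields:
        index.setdefault(field["section"], []).append(field["path"])
    return index
```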
### Autocomplete values
`GET /api/models/config-metadata/autocomplete/:provider`
Returns runtime values for dynamic fields. Providers include `backends`, `models`, `models:chat`, `models:tts`, `models:transcript`, `models:vad`.
```bash
# List available backends
curl http://localhost:8080/api/models/config-metadata/autocomplete/backends
# List chat-capable models
curl http://localhost:8080/api/models/config-metadata/autocomplete/models:chat
```
### Read model config
`GET /api/models/config-json/:name`
Returns the full model configuration as JSON:
```bash
curl http://localhost:8080/api/models/config-json/my-model
```
### Update model config
`PATCH /api/models/config-json/:name`
Deep-merges a JSON patch into the existing model configuration. Only include the fields you want to change:
```bash
curl -X PATCH http://localhost:8080/api/models/config-json/my-model \
  -H "Content-Type: application/json" \
  -d '{"context_size": 16384, "gpu_layers": 40}'
```
The endpoint validates the merged config and writes it to disk as YAML.
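To illustrate the deep-merge semantics, here is a sketch of how a patch combines with an existing config (an illustration of the merge behavior, not LocalAI's actual implementation): nested objects are merged key by key, while scalar values in the patch replace the existing ones.

```python
def deep_merge(base: dict, patch: dict) -> dict:
    """Recursively merge `patch` into a copy of `base`.
    Nested dicts merge key by key; any other patch value replaces
    the base value outright."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

config = {"context_size": 8192, "parameters": {"temperature": 0.7, "top_p": 0.9}}
patched = deep_merge(config, {"context_size": 16384, "parameters": {"temperature": 0.2}})
# "top_p" survives the patch because only "temperature" was overridden
```

This is why a PATCH body only needs the fields being changed: untouched keys, including siblings inside nested sections, are preserved.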
{{% notice context="warning" %}}
Config management endpoints require **admin authentication** when API keys are configured. The discovery and instructions endpoints are unauthenticated.
{{% /notice %}}
### VRAM estimation
`POST /api/models/vram-estimate`
Estimates VRAM usage for an installed model based on its weight files, context size, and GPU layer offloading:
```bash
curl -X POST http://localhost:8080/api/models/vram-estimate \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model", "context_size": 8192}'
```
```json
{
  "sizeBytes": 4368438272,
  "sizeDisplay": "4.4 GB",
  "vramBytes": 6123456789,
  "vramDisplay": "6.1 GB",
  "context_note": "Estimate used default context_size=8192. The model's trained maximum context is 131072; VRAM usage will be higher at larger context sizes.",
  "model_max_context": 131072
}
```
Optional parameters: `gpu_layers` (number of layers to offload, 0 = all), `kv_quant_bits` (KV cache quantization, 0 = fp16).
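The `sizeDisplay`/`vramDisplay` strings in the example correspond to the byte counts formatted with decimal (SI) units. A sketch of equivalent formatting (the rounding rule is an assumption inferred from the sample values, not LocalAI's code):

```python
def display_size(num_bytes: int) -> str:
    """Format a byte count as a short decimal-unit string, e.g. "4.4 GB".
    Assumes SI units (1 GB = 10^9 bytes), matching the sample response."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1000 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000.0
```

Agents can compare `vramBytes` directly against available GPU memory rather than parsing the display string.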
## Integration guide
A recommended workflow for agent/tool builders:
1. **Discover**: Fetch `/.well-known/localai.json` to learn available endpoints and capabilities
2. **Browse instructions**: Fetch `/api/instructions` for an overview of instruction areas
3. **Deep dive**: Fetch `/api/instructions/:name` for a markdown API guide on a specific area
4. **Explore config**: Use `/api/models/config-metadata` to understand configuration fields
5. **Interact**: Use the standard OpenAI-compatible endpoints for inference, and the config management endpoints for runtime tuning
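The first three steps above can be sketched as a small client. `fetch_json` is a hypothetical helper (and the base URL an assumption about where your instance runs); the transport is injectable so the mapping logic stays testable without a live server:

```python
import json
from urllib.request import urlopen

BASE = "http://localhost:8080"  # assumption: default LocalAI address

def fetch_json(path: str) -> dict:
    """Hypothetical helper: GET a path and decode the JSON body."""
    with urlopen(BASE + path) as resp:
        return json.load(resp)

def discover_instruction_urls(fetch=fetch_json) -> dict:
    """Map each instruction name to the URL of its markdown guide."""
    listing = fetch("/api/instructions")
    return {item["name"]: item["url"] for item in listing["instructions"]}
```

An agent would then fetch each guide URL on demand, only pulling detailed documentation for the areas it actually needs.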
## Swagger UI
The full interactive API documentation is available at `/swagger/index.html`. All annotated endpoints can be explored and tested directly from the browser.