+++
title = "API Discovery & Instructions"
weight = 27
toc = true
description = "Programmatic API discovery for agents, tools, and automation"
tags = ["API", "Agents", "Instructions", "Configuration", "Advanced"]
categories = ["Features"]
+++

LocalAI exposes a set of discovery endpoints that let external agents, coding assistants, and automation tools programmatically learn what the instance can do and how to control it — without reading documentation ahead of time.

## Quick start

```bash
# 1. Discover what's available
curl http://localhost:8080/.well-known/localai.json

# 2. Browse instruction areas
curl http://localhost:8080/api/instructions

# 3. Get an API guide for a specific instruction
curl http://localhost:8080/api/instructions/config-management
```

## Well-Known Discovery Endpoint

`GET /.well-known/localai.json`

Returns the instance version, all available endpoint URLs (flat and categorized), and runtime capabilities.

**Example response (abbreviated):**

```json
{
  "version": "v2.28.0",
  "endpoints": {
    "chat_completions": "/v1/chat/completions",
    "models": "/v1/models",
    "config_metadata": "/api/models/config-metadata",
    "instructions": "/api/instructions",
    "swagger": "/swagger/index.html"
  },
  "endpoint_groups": {
    "openai_compatible": { "chat_completions": "/v1/chat/completions", "..." : "..." },
    "config_management": { "config_metadata": "/api/models/config-metadata", "..." : "..." },
    "model_management": { "..." : "..." },
    "monitoring": { "..." : "..." }
  },
  "capabilities": {
    "config_metadata": true,
    "config_patch": true,
    "vram_estimate": true,
    "mcp": true,
    "agents": false,
    "p2p": false
  }
}
```

The `capabilities` object reflects the current runtime configuration — for example, `mcp` is only `true` if MCP is enabled, and `agents` is `true` only if the agent pool is running.
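The discovery document is enough for a client to gate optional features at runtime. A minimal Python sketch, assuming a response shaped like the abbreviated example above (the HTTP fetch is left out; the document is embedded here for illustration):

```python
import json

# Abbreviated discovery document, shaped like the example response above.
discovery = json.loads("""
{
  "version": "v2.28.0",
  "endpoints": {"chat_completions": "/v1/chat/completions", "instructions": "/api/instructions"},
  "capabilities": {"config_patch": true, "mcp": true, "agents": false}
}
""")

def endpoint_url(base, name):
    """Resolve a named endpoint against the instance base URL, or None if absent."""
    path = discovery["endpoints"].get(name)
    return base.rstrip("/") + path if path else None

def has_capability(name):
    """True only if the instance reports the capability as enabled."""
    return discovery["capabilities"].get(name, False)

print(endpoint_url("http://localhost:8080", "chat_completions"))
# Skip agent features entirely when the agent pool is not running.
if not has_capability("agents"):
    print("agents disabled; skipping agent workflow")
```

Because absent capabilities default to `false`, the same client works unchanged against older instances that do not report a given flag.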
## Instructions API

Instructions are curated groups of related API endpoints. Each instruction maps to one or more Swagger tags and provides a focused, LLM-readable guide.

### List all instructions

`GET /api/instructions`

```bash
curl http://localhost:8080/api/instructions
```

Returns a compact list of instruction areas:

```json
{
  "instructions": [
    {
      "name": "chat-inference",
      "description": "OpenAI-compatible chat completions, text completions, and embeddings",
      "tags": ["inference", "embeddings"],
      "url": "/api/instructions/chat-inference"
    },
    {
      "name": "config-management",
      "description": "Discover, read, and modify model configuration fields with VRAM estimation",
      "tags": ["config"],
      "url": "/api/instructions/config-management"
    }
  ],
  "hint": "Fetch GET {url} for a markdown API guide. Add ?format=json for a raw OpenAPI fragment."
}
```

**Available instructions:**

| Instruction | Description |
|-------------|-------------|
| `chat-inference` | Chat completions, text completions, embeddings (OpenAI-compatible) |
| `audio` | Text-to-speech, transcription, voice activity detection, sound generation |
| `images` | Image generation and inpainting |
| `model-management` | Browse gallery, install, delete, manage models and backends |
| `config-management` | Discover, read, and modify model config fields with VRAM estimation |
| `monitoring` | System metrics, backend status, system information |
| `mcp` | Model Context Protocol — tool-augmented chat with MCP servers |
| `agents` | Agent task and job management |
| `video` | Video generation from text prompts |

### Get an instruction guide

`GET /api/instructions/:name`

By default, returns a **markdown guide** suitable for LLMs and humans:

```bash
curl http://localhost:8080/api/instructions/config-management
```

Add `?format=json` to get a raw **OpenAPI fragment** (a filtered Swagger spec with only the relevant paths and definitions):

```bash
curl "http://localhost:8080/api/instructions/config-management?format=json"
```

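Composing these guide URLs from a listing entry can be sketched as follows. Only the `url` field and the `?format=json` query come from this document; the function names are illustrative:

```python
from urllib.parse import urlencode

def guide_url(base, entry, fmt=None):
    """Build the instruction-guide URL from one element of the
    "instructions" array; fmt="json" requests the raw OpenAPI
    fragment instead of the markdown guide."""
    url = base.rstrip("/") + entry["url"]
    if fmt:
        url += "?" + urlencode({"format": fmt})
    return url

entry = {"name": "config-management", "url": "/api/instructions/config-management"}
print(guide_url("http://localhost:8080", entry))
print(guide_url("http://localhost:8080", entry, fmt="json"))
```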
## Configuration Management APIs

These endpoints let agents discover model configuration fields, read current settings, modify them, and estimate VRAM usage.

### Config metadata

`GET /api/models/config-metadata`

Returns structured metadata for all model configuration fields, organized by section. Each field includes its YAML path, Go type, UI type, label, description, default value, validation constraints, and available options.

```bash
# All fields
curl http://localhost:8080/api/models/config-metadata

# Filter by section
curl "http://localhost:8080/api/models/config-metadata?section=parameters"
```

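An agent will typically index this metadata by YAML path so it can look up a field's type and default before patching it. A sketch under stated assumptions: the response groups fields by section as described above, but the exact key names (`path`, `type`, `default`) are hypothetical placeholders, not taken from the API:

```python
# Hypothetical abbreviated metadata: sections mapping to field lists.
# Key names here are illustrative stand-ins for the real response shape.
metadata = {
    "parameters": [
        {"path": "context_size", "type": "int", "default": 4096},
        {"path": "parameters.temperature", "type": "float", "default": 0.7},
    ],
    "gpu": [
        {"path": "gpu_layers", "type": "int", "default": 0},
    ],
}

def field_index(meta):
    """Flatten section -> fields into a path -> field lookup table."""
    return {f["path"]: f for fields in meta.values() for f in fields}

index = field_index(metadata)
print(index["gpu_layers"]["default"])
```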
### Autocomplete values

`GET /api/models/config-metadata/autocomplete/:provider`

Returns runtime values for dynamic fields. Providers include `backends`, `models`, `models:chat`, `models:tts`, `models:transcript`, `models:vad`.

```bash
# List available backends
curl http://localhost:8080/api/models/config-metadata/autocomplete/backends

# List chat-capable models
curl http://localhost:8080/api/models/config-metadata/autocomplete/models:chat
```

### Read model config

`GET /api/models/config-json/:name`

Returns the full model configuration as JSON:

```bash
curl http://localhost:8080/api/models/config-json/my-model
```

### Update model config

`PATCH /api/models/config-json/:name`

Deep-merges a JSON patch into the existing model configuration. Only include the fields you want to change:

```bash
curl -X PATCH http://localhost:8080/api/models/config-json/my-model \
  -H "Content-Type: application/json" \
  -d '{"context_size": 16384, "gpu_layers": 40}'
```

The endpoint validates the merged config and writes it to disk as YAML.

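The merge semantics can be pictured with a small sketch: nested objects merge key by key, and any scalar in the patch replaces the existing value, so untouched fields survive. This mirrors the deep-merge behaviour described above as an illustration, not the server's actual implementation:

```python
def deep_merge(base, patch):
    """Recursively merge patch into base without mutating either:
    dicts merge key by key, any other patch value wins outright."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

config = {"context_size": 8192, "parameters": {"temperature": 0.7, "top_p": 0.9}}
patch = {"context_size": 16384, "parameters": {"temperature": 0.2}}
print(deep_merge(config, patch))
# "top_p" is preserved; only the patched fields change.
```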
{{% notice context="warning" %}}
Config management endpoints require **admin authentication** when API keys are configured. The discovery and instructions endpoints are unauthenticated.
{{% /notice %}}

### VRAM estimation

`POST /api/models/vram-estimate`

Estimates VRAM usage for an installed model based on its weight files, context size, and GPU layer offloading:

```bash
curl -X POST http://localhost:8080/api/models/vram-estimate \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model", "context_size": 8192}'
```

```json
{
  "sizeBytes": 4368438272,
  "sizeDisplay": "4.4 GB",
  "vramBytes": 6123456789,
  "vramDisplay": "6.1 GB",
  "context_note": "Estimate used default context_size=8192. The model's trained maximum context is 131072; VRAM usage will be higher at larger context sizes.",
  "model_max_context": 131072
}
```

Optional parameters: `gpu_layers` (number of layers to offload, 0 = all), `kv_quant_bits` (KV cache quantization, 0 = fp16).

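A client can compare `vramBytes` against the free VRAM it measures locally to decide whether to load the model as-is or lower `context_size` or `gpu_layers` first. A minimal sketch; the free-VRAM figures and the 10% headroom are stand-in assumptions, and obtaining real numbers from your GPU tooling is out of scope here:

```python
# Response fields as in the example above.
estimate = {
    "vramBytes": 6123456789,
    "vramDisplay": "6.1 GB",
    "model_max_context": 131072,
}

def fits(estimate, free_vram_bytes, headroom=0.1):
    """True if the estimated VRAM plus a safety headroom fits in free VRAM."""
    return estimate["vramBytes"] * (1 + headroom) <= free_vram_bytes

print(fits(estimate, free_vram_bytes=8 * 1024**3))  # e.g. an 8 GiB card
print(fits(estimate, free_vram_bytes=4 * 1024**3))  # e.g. a 4 GiB card
```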
## Integration guide

A recommended workflow for agent and tool builders:

1. **Discover**: fetch `/.well-known/localai.json` to learn the available endpoints and capabilities.
2. **Browse instructions**: fetch `/api/instructions` for an overview of instruction areas.
3. **Deep dive**: fetch `/api/instructions/:name` for a markdown API guide on a specific area.
4. **Explore config**: use `/api/models/config-metadata` to understand configuration fields.
5. **Interact**: use the standard OpenAI-compatible endpoints for inference, and the config management endpoints for runtime tuning.

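The workflow above can be sketched as a tiny discovery-first client. Only the paths come from this document; the class name, the choice of `urllib`, and the absence of error handling and authentication are illustrative:

```python
import json
from urllib.request import urlopen

class LocalAIClient:
    """Minimal discovery-first client following the workflow above."""

    def __init__(self, base_url):
        self.base = base_url.rstrip("/")

    def url(self, path):
        return self.base + path

    def get_json(self, path):
        # Discovery and instructions endpoints are unauthenticated.
        with urlopen(self.url(path)) as resp:
            return json.load(resp)

    def discover(self):                # step 1
        return self.get_json("/.well-known/localai.json")

    def instructions(self):            # step 2
        return self.get_json("/api/instructions")

    def guide(self, name):             # step 3: markdown guide as text
        with urlopen(self.url(f"/api/instructions/{name}")) as resp:
            return resp.read().decode()

client = LocalAIClient("http://localhost:8080/")
print(client.url("/.well-known/localai.json"))
```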
## Swagger UI

The full interactive API documentation is available at `/swagger/index.html`. All annotated endpoints can be explored and tested directly from the browser.