feat: Add documentation for undocumented API endpoints (#8852)

* feat: add documentation for undocumented API endpoints

Creates comprehensive documentation for 8 previously undocumented endpoints:
- Voice Activity Detection (/v1/vad)
- Video Generation (/video)
- Sound Generation (/v1/sound-generation)
- Backend Monitor (/backend/monitor, /backend/shutdown)
- Token Metrics (/tokenMetrics)
- P2P endpoints (/api/p2p/* - 5 sub-endpoints)
- System Info (/system, /version)

Each documentation file includes HTTP method, request/response schemas,
curl examples, sample JSON responses, and error codes.

* docs: remove token-metrics endpoint documentation per review feedback

The token-metrics endpoint is not wired into the HTTP router and
should not be documented per reviewer request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: move system-info documentation to reference section

Per review feedback, system-info endpoint docs are better suited
for the reference section rather than features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Committed by LocalAI [bot] via GitHub on 2026-03-08 17:59:33 +01:00
parent ec8f2d7683, commit 9090bca920
7 changed files with 665 additions and 0 deletions


@@ -14,6 +14,10 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Text Generation](text-generation/)** - Generate text with GPT-compatible models using various backends
- **[Image Generation](image-generation/)** - Create images with Stable Diffusion and other diffusion models
- **[Audio Processing](audio-to-text/)** - Transcribe audio to text and generate speech from text
- **[Text to Audio](text-to-audio/)** - Generate speech from text with TTS models
- **[Sound Generation](sound-generation/)** - Generate music and sound effects from text descriptions
- **[Voice Activity Detection](voice-activity-detection/)** - Detect speech segments in audio data
- **[Video Generation](video-generation/)** - Generate videos from text prompts and reference images
- **[Embeddings](embeddings/)** - Generate vector embeddings for semantic search and RAG applications
- **[GPT Vision](gpt-vision/)** - Analyze and understand images with vision-language models
@@ -24,6 +28,7 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Constrained Grammars](constrained_grammars/)** - Control model output format with BNF grammars
- **[GPU Acceleration](GPU-acceleration/)** - Optimize performance with GPU support
- **[Distributed Inference](distributed_inferencing/)** - Scale inference across multiple nodes
- **[P2P API](p2p/)** - Monitor and manage P2P worker and federated nodes
- **[Model Context Protocol (MCP)](mcp/)** - Enable agentic capabilities with MCP integration
- **[Agents](agents/)** - Autonomous AI agents with tools, knowledge base, and skills
@@ -34,6 +39,7 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Stores](stores/)** - Vector similarity search for embeddings
- **[Model Gallery](model-gallery/)** - Browse and install pre-configured models
- **[Backends](backends/)** - Learn about available backends and how to manage them
- **[Backend Monitor](backend-monitor/)** - Monitor backend status and resource usage
- **[Runtime Settings](runtime-settings/)** - Configure application settings via web UI without restarting
## Getting Started


@@ -0,0 +1,93 @@
+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++
LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` allows stopping a model's backend process.
## Monitor API
- **Method:** `GET`
- **Endpoints:** `/backend/monitor`, `/v1/backend/monitor`
### Request
The request body is JSON:
| Parameter | Type | Required | Description |
|-----------|----------|----------|--------------------------------|
| `model` | `string` | Yes | Name of the model to monitor |
### Response
Returns a JSON object with the backend status:
| Field | Type | Description |
|----------------------|----------|-------------------------------------------------------|
| `state` | `int` | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory` | `object` | Memory usage information |
| `memory.total` | `uint64` | Total memory usage in bytes |
| `memory.breakdown` | `object` | Per-component memory breakdown (key-value pairs) |
If the gRPC status call fails, the endpoint falls back to local process metrics:
| Field | Type | Description |
|------------------|---------|--------------------------------|
| `memory_info` | `object`| Process memory info (RSS, VMS) |
| `memory_percent` | `float` | Memory usage percentage |
| `cpu_percent` | `float` | CPU usage percentage |
### Usage
```bash
curl http://localhost:8080/backend/monitor \
-H "Content-Type: application/json" \
-d '{"model": "my-model"}'
```
### Example response
```json
{
"state": 2,
"memory": {
"total": 1073741824,
"breakdown": {
"weights": 536870912,
"kv_cache": 268435456
}
}
}
```
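For reference, the numeric `state` codes above can be mapped to readable labels when processing the response programmatically. Below is a minimal Python sketch; the helper names (`describe_state`, `summarize_monitor`) are illustrative, not part of LocalAI:

```python
import json

# State codes as documented for /backend/monitor.
STATE_LABELS = {0: "uninitialized", 1: "busy", 2: "ready", -1: "error"}

def describe_state(state: int) -> str:
    """Map a numeric backend state to a human-readable label."""
    return STATE_LABELS.get(state, f"unknown ({state})")

def summarize_monitor(payload: str) -> str:
    """Render a one-line summary of a /backend/monitor JSON response."""
    status = json.loads(payload)
    total_mib = status.get("memory", {}).get("total", 0) / (1024 * 1024)
    return f"{describe_state(status['state'])}, {total_mib:.0f} MiB"

sample = '{"state": 2, "memory": {"total": 1073741824, "breakdown": {}}}'
print(summarize_monitor(sample))  # -> ready, 1024 MiB
```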
## Shutdown API
- **Method:** `POST`
- **Endpoints:** `/backend/shutdown`, `/v1/backend/shutdown`
### Request
| Parameter | Type | Required | Description |
|-----------|----------|----------|---------------------------------|
| `model` | `string` | Yes | Name of the model to shut down |
### Usage
```bash
curl -X POST http://localhost:8080/backend/shutdown \
-H "Content-Type: application/json" \
-d '{"model": "my-model"}'
```
### Response
Returns `200 OK` with a confirmation message on success.
## Error Responses
| Status Code | Description |
|-------------|------------------------------------------------|
| 400 | Invalid or missing model name |
| 500 | Backend error or model not loaded |


@@ -0,0 +1,175 @@
+++
disableToc = false
title = "P2P API"
weight = 22
url = "/features/p2p/"
+++
LocalAI supports peer-to-peer (P2P) networking for distributed inference. The P2P API endpoints allow you to monitor connected worker and federated nodes, retrieve the P2P network token, and get cluster statistics.
For an overview of distributed inference setup, see [Distributed Inference](/features/distributed_inferencing/).
## Endpoints
### List all P2P nodes
- **Method:** `GET`
- **Endpoint:** `/api/p2p`
Returns all worker and federated nodes in the P2P network.
#### Response
| Field | Type | Description |
|--------------------|---------|--------------------------------------|
| `nodes` | `array` | List of worker nodes |
| `federated_nodes` | `array` | List of federated nodes |
Each node object:
| Field | Type | Description |
|------------------|----------|------------------------------------------|
| `Name` | `string` | Node name |
| `ID` | `string` | Unique node identifier |
| `TunnelAddress` | `string` | Network tunnel address |
| `ServiceID` | `string` | Service identifier |
| `LastSeen` | `string` | ISO 8601 timestamp of last heartbeat |
#### Usage
```bash
curl http://localhost:8080/api/p2p
```
#### Example response
```json
{
"nodes": [
{
"Name": "worker-1",
"ID": "abc123",
"TunnelAddress": "192.168.1.10:9090",
"ServiceID": "worker",
"LastSeen": "2025-01-15T10:30:00Z"
}
],
"federated_nodes": [
{
"Name": "federation-1",
"ID": "def456",
"TunnelAddress": "192.168.1.20:9090",
"ServiceID": "federated",
"LastSeen": "2025-01-15T10:30:05Z"
}
]
}
```
---
### Get P2P token
- **Method:** `GET`
- **Endpoint:** `/api/p2p/token`
Returns the P2P network token used for node authentication.
#### Usage
```bash
curl http://localhost:8080/api/p2p/token
```
#### Response
Returns the token as a plain text string.
---
### List worker nodes
- **Method:** `GET`
- **Endpoint:** `/api/p2p/workers`
Returns worker nodes with online status.
#### Response
| Field | Type | Description |
|--------------------------|----------|--------------------------------------|
| `nodes` | `array` | List of worker nodes |
| `nodes[].name` | `string` | Node name |
| `nodes[].id` | `string` | Unique node identifier |
| `nodes[].tunnelAddress` | `string` | Network tunnel address |
| `nodes[].serviceID` | `string` | Service identifier |
| `nodes[].lastSeen` | `string` | Last heartbeat timestamp |
| `nodes[].isOnline` | `bool` | Whether the node is currently online |
A node is considered online if it was last seen within the past 40 seconds.
#### Usage
```bash
curl http://localhost:8080/api/p2p/workers
```
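The 40-second heartbeat window above can be reproduced client-side when working with the raw `lastSeen` timestamps from `/api/p2p`. A small sketch, assuming the timestamps are ISO 8601 in UTC as shown in the examples:

```python
from datetime import datetime, timedelta, timezone

ONLINE_WINDOW = timedelta(seconds=40)  # matches the documented heartbeat window

def is_online(last_seen_iso, now=None):
    """A node is online if its lastSeen timestamp is within the past 40 seconds."""
    # Python < 3.11 fromisoformat() does not accept a trailing 'Z', so normalize it.
    last_seen = datetime.fromisoformat(last_seen_iso.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now - last_seen <= ONLINE_WINDOW

now = datetime(2025, 1, 15, 10, 30, 30, tzinfo=timezone.utc)
print(is_online("2025-01-15T10:30:00Z", now))  # -> True  (seen 30s ago)
print(is_online("2025-01-15T10:29:00Z", now))  # -> False (seen 90s ago)
```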
---
### List federated nodes
- **Method:** `GET`
- **Endpoint:** `/api/p2p/federation`
Returns federated nodes with online status. Same response format as `/api/p2p/workers`.
#### Usage
```bash
curl http://localhost:8080/api/p2p/federation
```
---
### Get P2P statistics
- **Method:** `GET`
- **Endpoint:** `/api/p2p/stats`
Returns aggregate statistics about the P2P cluster.
#### Response
| Field | Type | Description |
|--------------------|----------|-----------------------------------|
| `workers.online` | `int` | Number of online worker nodes |
| `workers.total` | `int` | Total worker nodes |
| `federated.online` | `int` | Number of online federated nodes |
| `federated.total` | `int` | Total federated nodes |
#### Usage
```bash
curl http://localhost:8080/api/p2p/stats
```
#### Example response
```json
{
"workers": {
"online": 3,
"total": 5
},
"federated": {
"online": 2,
"total": 2
}
}
```
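The offline count per group is not returned directly, but it can be derived from the response above. A hedged one-liner sketch (the `offline_counts` helper is not part of LocalAI):

```python
def offline_counts(stats: dict) -> dict:
    """Derive offline node counts from a /api/p2p/stats response."""
    return {
        group: stats[group]["total"] - stats[group]["online"]
        for group in ("workers", "federated")
    }

stats = {"workers": {"online": 3, "total": 5}, "federated": {"online": 2, "total": 2}}
print(offline_counts(stats))  # -> {'workers': 2, 'federated': 0}
```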
## Error Responses
| Status Code | Description |
|-------------|---------------------------------------------|
| 500 | P2P subsystem not available or internal error |


@@ -0,0 +1,104 @@
+++
disableToc = false
title = "Sound Generation"
weight = 19
url = "/features/sound-generation/"
+++
LocalAI supports generating audio from text descriptions via the `/v1/sound-generation` endpoint. This endpoint is compatible with the [ElevenLabs sound generation API](https://elevenlabs.io/docs/api-reference/sound-generation) and can produce music, sound effects, and other audio content.
## API
- **Method:** `POST`
- **Endpoint:** `/v1/sound-generation`
### Request
The request body is JSON. There are two usage modes: simple and advanced.
#### Simple mode
| Parameter | Type | Required | Description |
|------------------|----------|----------|----------------------------------------------|
| `model_id` | `string` | Yes | Model identifier |
| `text` | `string` | Yes | Audio description or prompt |
| `instrumental` | `bool` | No | Generate instrumental audio (no vocals) |
| `vocal_language` | `string` | No | Language code for vocals (e.g. `bn`, `ja`) |
#### Advanced mode
| Parameter | Type | Required | Description |
|---------------------|----------|----------|-------------------------------------------------|
| `model_id` | `string` | Yes | Model identifier |
| `text` | `string` | Yes | Text prompt or description |
| `duration_seconds` | `float` | No | Target duration in seconds |
| `prompt_influence` | `float` | No | Temperature / prompt influence parameter |
| `do_sample` | `bool` | No | Enable sampling |
| `think` | `bool` | No | Enable extended thinking for generation |
| `caption` | `string` | No | Caption describing the audio |
| `lyrics` | `string` | No | Lyrics for the generated audio |
| `bpm` | `int` | No | Beats per minute |
| `keyscale` | `string` | No | Musical key/scale (e.g. `Ab major`) |
| `language` | `string` | No | Language code |
| `vocal_language` | `string` | No | Vocal language (fallback if `language` is empty) |
| `timesignature` | `string` | No | Time signature (e.g. `4`) |
| `instrumental` | `bool` | No | Generate instrumental audio (no vocals) |
### Response
Returns a binary audio file with the appropriate `Content-Type` header (e.g. `audio/wav`, `audio/mpeg`, `audio/flac`, `audio/ogg`).
## Usage
### Generate a sound effect
```bash
curl http://localhost:8080/v1/sound-generation \
-H "Content-Type: application/json" \
-d '{
"model_id": "sound-model",
"text": "rain falling on a tin roof"
}' \
--output rain.wav
```
### Generate a song with vocals
```bash
curl http://localhost:8080/v1/sound-generation \
-H "Content-Type: application/json" \
-d '{
"model_id": "sound-model",
"text": "a soft Bengali love song for a quiet evening",
"instrumental": false,
"vocal_language": "bn"
}' \
--output song.wav
```
### Generate music with advanced parameters
```bash
curl http://localhost:8080/v1/sound-generation \
-H "Content-Type: application/json" \
-d '{
"model_id": "sound-model",
"text": "upbeat pop",
"caption": "A funky Japanese disco track",
"lyrics": "[Verse 1]\nDancing in the neon lights",
"think": true,
"bpm": 120,
"duration_seconds": 225,
"keyscale": "Ab major",
"language": "ja",
"timesignature": "4"
}' \
--output disco.wav
```
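Since the endpoint returns binary audio whose format is signaled only by the `Content-Type` header, a client should pick the output file extension from that header rather than assume `.wav`. A minimal Python sketch covering the content types listed above (the `extension_for` helper is illustrative):

```python
# Mapping from the documented Content-Type values to file extensions.
AUDIO_EXTENSIONS = {
    "audio/wav": ".wav",
    "audio/mpeg": ".mp3",
    "audio/flac": ".flac",
    "audio/ogg": ".ogg",
}

def extension_for(content_type: str) -> str:
    """Pick a file extension for the returned audio; parameters after ';' are ignored."""
    media_type = content_type.split(";")[0].strip().lower()
    return AUDIO_EXTENSIONS.get(media_type, ".bin")

print(extension_for("audio/mpeg"))                 # -> .mp3
print(extension_for("audio/wav; charset=binary"))  # -> .wav
```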
## Error Responses
| Status Code | Description |
|-------------|--------------------------------------------------|
| 400 | Missing or invalid model or request parameters |
| 500 | Backend error during sound generation |


@@ -0,0 +1,115 @@
+++
disableToc = false
title = "Video Generation"
weight = 18
url = "/features/video-generation/"
+++
LocalAI can generate videos from text prompts and optional reference images via the `/video` endpoint. Supported backends include `diffusers`, `stablediffusion`, and `vllm-omni`.
## API
- **Method:** `POST`
- **Endpoint:** `/video`
### Request
The request body is JSON with the following fields:
| Parameter | Type | Required | Default | Description |
|-------------------|----------|----------|---------|----------------------------------------------------------|
| `model` | `string` | Yes | | Model name to use |
| `prompt` | `string` | Yes | | Text description of the video to generate |
| `negative_prompt` | `string` | No | | What to exclude from the generated video |
| `start_image` | `string` | No | | Starting image as base64 string or URL |
| `end_image` | `string` | No | | Ending image for guided generation |
| `width` | `int` | No | 512 | Video width in pixels |
| `height` | `int` | No | 512 | Video height in pixels |
| `num_frames` | `int` | No | | Number of frames |
| `fps` | `int` | No | | Frames per second |
| `seconds` | `string` | No | | Duration in seconds |
| `size` | `string` | No | | Size specification (alternative to width/height) |
| `input_reference` | `string` | No | | Input reference for the generation |
| `seed` | `int` | No | | Random seed for reproducibility |
| `cfg_scale` | `float` | No | | Classifier-free guidance scale |
| `step` | `int` | No | | Number of inference steps |
| `response_format` | `string` | No | `url` | `url` to return a file URL, `b64_json` for base64 output |
### Response
Returns an OpenAI-compatible JSON response:
| Field | Type | Description |
|-----------------|----------|------------------------------------------------|
| `created` | `int` | Unix timestamp of generation |
| `id` | `string` | Unique identifier (UUID) |
| `data` | `array` | Array of generated video items |
| `data[].url` | `string` | URL path to video file (if `response_format` is `url`) |
| `data[].b64_json` | `string` | Base64-encoded video (if `response_format` is `b64_json`) |
## Usage
### Generate a video from a text prompt
```bash
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "A cat playing in a garden on a sunny day",
"width": 512,
"height": 512,
"num_frames": 16,
"fps": 8
}'
```
### Example response
```json
{
"created": 1709900000,
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"data": [
{
"url": "/generated-videos/abc123.mp4"
}
]
}
```
### Generate with a starting image
```bash
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "A timelapse of flowers blooming",
"start_image": "https://example.com/flowers.jpg",
"num_frames": 24,
"fps": 12,
"seed": 42,
"cfg_scale": 7.5,
"step": 30
}'
```
### Get base64-encoded output
```bash
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "Ocean waves on a beach",
"response_format": "b64_json"
}'
```
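When `response_format` is `b64_json`, the client must decode the payload itself before writing a video file. A minimal sketch using only the standard library; the dummy bytes below stand in for real video data and are not a valid MP4:

```python
import base64

def save_video(item: dict, path: str) -> None:
    """Persist one data[] item from a /video response (b64_json case only).

    The `url` case would need an HTTP fetch against the server and is omitted here.
    """
    with open(path, "wb") as f:
        f.write(base64.b64decode(item["b64_json"]))

# Round-trip illustration with placeholder bytes:
fake = {"b64_json": base64.b64encode(b"\x00\x00\x00\x18ftypmp42").decode()}
save_video(fake, "clip.mp4")
```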
## Error Responses
| Status Code | Description |
|-------------|------------------------------------------------------|
| 400 | Missing or invalid model or request parameters |
| 500 | Backend error during video generation |


@@ -0,0 +1,87 @@
+++
disableToc = false
title = "Voice Activity Detection (VAD)"
weight = 17
url = "/features/voice-activity-detection/"
+++
Voice Activity Detection (VAD) identifies segments of speech in audio data. LocalAI provides a `/v1/vad` endpoint powered by the [Silero VAD](https://github.com/snakers4/silero-vad) backend.
## API
- **Method:** `POST`
- **Endpoints:** `/v1/vad`, `/vad`
### Request
The request body is JSON with the following fields:
| Parameter | Type | Required | Description |
|-----------|------------|----------|------------------------------------------|
| `model` | `string` | Yes | Model name (e.g. `silero-vad`) |
| `audio` | `float32[]`| Yes | Array of audio samples (16kHz PCM float) |
### Response
Returns a JSON object with detected speech segments:
| Field | Type | Description |
|--------------------|-----------|------------------------------------|
| `segments` | `array` | List of detected speech segments |
| `segments[].start` | `float` | Start time in seconds |
| `segments[].end` | `float` | End time in seconds |
## Usage
### Example request
```bash
curl http://localhost:8080/v1/vad \
-H "Content-Type: application/json" \
-d '{
"model": "silero-vad",
"audio": [0.0012, -0.0045, 0.0053, -0.0021, ...]
}'
```
### Example response
```json
{
"segments": [
{
"start": 0.5,
"end": 2.3
},
{
"start": 3.1,
"end": 5.8
}
]
}
```
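The `audio` field expects raw float samples, which most tooling will not produce directly. A hedged standard-library sketch for converting a 16 kHz mono 16-bit PCM WAV file into that array (the helper name and the demo file are illustrative):

```python
import struct
import wave

def wav_to_float_samples(path: str) -> list:
    """Read a 16 kHz mono 16-bit PCM WAV file and scale samples to [-1.0, 1.0]."""
    with wave.open(path, "rb") as wav:
        assert wav.getframerate() == 16000 and wav.getnchannels() == 1
        assert wav.getsampwidth() == 2  # 16-bit PCM
        raw = wav.readframes(wav.getnframes())
    ints = struct.unpack(f"<{len(raw) // 2}h", raw)
    return [s / 32768.0 for s in ints]

# Demo: synthesize a tiny 4-sample WAV, then convert it.
with wave.open("probe.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

samples = wav_to_float_samples("probe.wav")
print(samples)  # -> [0.0, 0.030517578125, -0.030517578125, 0.0]
```

The resulting list can be placed directly into the `audio` field of the JSON request shown above.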
## Model Configuration
Create a YAML configuration file for the VAD model:
```yaml
name: silero-vad
backend: silero-vad
```
## Detection Parameters
The Silero VAD backend uses the following internal defaults:
- **Sample rate:** 16kHz
- **Threshold:** 0.5
- **Min silence duration:** 100ms
- **Speech pad duration:** 30ms
## Error Responses
| Status Code | Description |
|-------------|---------------------------------------------------|
| 400 | Missing or invalid `model` or `audio` field |
| 500 | Backend error during VAD processing |


@@ -0,0 +1,85 @@
+++
disableToc = false
title = "System Info and Version"
weight = 23
url = "/reference/system-info/"
+++
LocalAI provides endpoints to inspect the running instance, including available backends, loaded models, and version information.
## System Information
- **Method:** `GET`
- **Endpoint:** `/system`
Returns available backends and currently loaded models.
### Response
| Field | Type | Description |
|-----------------|----------|-------------------------------------------|
| `backends` | `array` | List of available backend names (strings) |
| `loaded_models` | `array` | List of currently loaded models |
| `loaded_models[].id` | `string` | Model identifier |
### Usage
```bash
curl http://localhost:8080/system
```
### Example response
```json
{
"backends": [
"llama-cpp",
"huggingface",
"diffusers",
"whisper"
],
"loaded_models": [
{
"id": "my-llama-model"
},
{
"id": "whisper-1"
}
]
}
```
---
## Version
- **Method:** `GET`
- **Endpoint:** `/version`
Returns the LocalAI version and build commit.
### Response
| Field | Type | Description |
|-----------|----------|-------------------------------------------------|
| `version` | `string` | Version string in the format `version (commit)` |
### Usage
```bash
curl http://localhost:8080/version
```
### Example response
```json
{
"version": "2.26.0 (a1b2c3d4)"
}
```
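Since the version and commit are packed into a single `version (commit)` string, clients that need them separately must split the string themselves. A small sketch; the `parse_version` helper and its fallback behavior are assumptions, not LocalAI API surface:

```python
import re

def parse_version(payload: dict):
    """Split the documented `version (commit)` string into its two parts."""
    match = re.fullmatch(r"(\S+) \(([0-9a-f]+)\)", payload["version"])
    if not match:
        return payload["version"], None  # tolerate bare version strings
    return match.group(1), match.group(2)

print(parse_version({"version": "2.26.0 (a1b2c3d4)"}))  # -> ('2.26.0', 'a1b2c3d4')
```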
## Error Responses
| Status Code | Description |
|-------------|------------------------------|
| 500 | Internal server error |