feat: Add documentation for undocumented API endpoints (#8852)

* feat: add documentation for undocumented API endpoints

Creates comprehensive documentation for 8 previously undocumented endpoints:
- Voice Activity Detection (/v1/vad)
- Video Generation (/video)
- Sound Generation (/v1/sound-generation)
- Backend Monitor (/backend/monitor, /backend/shutdown)
- Token Metrics (/tokenMetrics)
- P2P endpoints (/api/p2p/* - 5 sub-endpoints)
- System Info (/system, /version)

Each documentation file includes HTTP method, request/response schemas,
curl examples, sample JSON responses, and error codes.

* docs: remove token-metrics endpoint documentation per review feedback

The token-metrics endpoint is not wired into the HTTP router and
should not be documented per reviewer request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: move system-info documentation to reference section

Per review feedback, system-info endpoint docs are better suited
for the reference section rather than features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Committed by LocalAI [bot] via GitHub on 2026-03-08 17:59:33 +01:00
parent ec8f2d7683, commit 9090bca920
7 changed files with 665 additions and 0 deletions


@@ -14,6 +14,10 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Text Generation](text-generation/)** - Generate text with GPT-compatible models using various backends
- **[Image Generation](image-generation/)** - Create images with Stable Diffusion and other diffusion models
- **[Audio Processing](audio-to-text/)** - Transcribe audio to text and generate speech from text
- **[Text to Audio](text-to-audio/)** - Generate speech from text with TTS models
- **[Sound Generation](sound-generation/)** - Generate music and sound effects from text descriptions
- **[Voice Activity Detection](voice-activity-detection/)** - Detect speech segments in audio data
- **[Video Generation](video-generation/)** - Generate videos from text prompts and reference images
- **[Embeddings](embeddings/)** - Generate vector embeddings for semantic search and RAG applications
- **[GPT Vision](gpt-vision/)** - Analyze and understand images with vision-language models
@@ -24,6 +28,7 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Constrained Grammars](constrained_grammars/)** - Control model output format with BNF grammars
- **[GPU Acceleration](GPU-acceleration/)** - Optimize performance with GPU support
- **[Distributed Inference](distributed_inferencing/)** - Scale inference across multiple nodes
- **[P2P API](p2p/)** - Monitor and manage P2P worker and federated nodes
- **[Model Context Protocol (MCP)](mcp/)** - Enable agentic capabilities with MCP integration
- **[Agents](agents/)** - Autonomous AI agents with tools, knowledge base, and skills
@@ -34,6 +39,7 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Stores](stores/)** - Vector similarity search for embeddings
- **[Model Gallery](model-gallery/)** - Browse and install pre-configured models
- **[Backends](backends/)** - Learn about available backends and how to manage them
- **[Backend Monitor](backend-monitor/)** - Monitor backend status and resource usage
- **[Runtime Settings](runtime-settings/)** - Configure application settings via web UI without restarting
## Getting Started


@@ -0,0 +1,93 @@
+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++
LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` allows stopping a model's backend process.
## Monitor API
- **Method:** `GET`
- **Endpoints:** `/backend/monitor`, `/v1/backend/monitor`
### Request
The request body is JSON:
| Parameter | Type | Required | Description |
|-----------|----------|----------|--------------------------------|
| `model` | `string` | Yes | Name of the model to monitor |
### Response
Returns a JSON object with the backend status:
| Field | Type | Description |
|----------------------|----------|-------------------------------------------------------|
| `state` | `int` | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory` | `object` | Memory usage information |
| `memory.total` | `uint64` | Total memory usage in bytes |
| `memory.breakdown` | `object` | Per-component memory breakdown (key-value pairs) |
If the gRPC status call fails, the endpoint falls back to local process metrics:
| Field | Type | Description |
|------------------|---------|--------------------------------|
| `memory_info` | `object`| Process memory info (RSS, VMS) |
| `memory_percent` | `float` | Memory usage percentage |
| `cpu_percent` | `float` | CPU usage percentage |
### Usage
```bash
curl http://localhost:8080/backend/monitor \
-H "Content-Type: application/json" \
-d '{"model": "my-model"}'
```
### Example response
```json
{
"state": 2,
"memory": {
"total": 1073741824,
"breakdown": {
"weights": 536870912,
"kv_cache": 268435456
}
}
}
```
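For reference, the numeric `state` codes above can be mapped to readable labels when processing the response programmatically. Below is a minimal Python sketch; the helper names (`describe_state`, `summarize_monitor`) are illustrative, not part of LocalAI:

```python
import json

# State codes as documented for /backend/monitor.
STATE_LABELS = {0: "uninitialized", 1: "busy", 2: "ready", -1: "error"}

def describe_state(state: int) -> str:
    """Map a numeric backend state to a human-readable label."""
    return STATE_LABELS.get(state, f"unknown ({state})")

def summarize_monitor(payload: str) -> str:
    """Render a one-line summary of a /backend/monitor JSON response."""
    status = json.loads(payload)
    total_mib = status.get("memory", {}).get("total", 0) / (1024 * 1024)
    return f"{describe_state(status['state'])}, {total_mib:.0f} MiB"

sample = '{"state": 2, "memory": {"total": 1073741824, "breakdown": {}}}'
print(summarize_monitor(sample))  # -> ready, 1024 MiB
```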
## Shutdown API
- **Method:** `POST`
- **Endpoints:** `/backend/shutdown`, `/v1/backend/shutdown`
### Request
| Parameter | Type | Required | Description |
|-----------|----------|----------|---------------------------------|
| `model` | `string` | Yes | Name of the model to shut down |
### Usage
```bash
curl -X POST http://localhost:8080/backend/shutdown \
-H "Content-Type: application/json" \
-d '{"model": "my-model"}'
```
### Response
Returns `200 OK` with a confirmation message on success.
## Error Responses
| Status Code | Description |
|-------------|------------------------------------------------|
| 400 | Invalid or missing model name |
| 500 | Backend error or model not loaded |


@@ -0,0 +1,175 @@
+++
disableToc = false
title = "P2P API"
weight = 22
url = "/features/p2p/"
+++
LocalAI supports peer-to-peer (P2P) networking for distributed inference. The P2P API endpoints allow you to monitor connected worker and federated nodes, retrieve the P2P network token, and get cluster statistics.
For an overview of distributed inference setup, see [Distributed Inference](/features/distributed_inferencing/).
## Endpoints
### List all P2P nodes
- **Method:** `GET`
- **Endpoint:** `/api/p2p`
Returns all worker and federated nodes in the P2P network.
#### Response
| Field | Type | Description |
|--------------------|---------|--------------------------------------|
| `nodes` | `array` | List of worker nodes |
| `federated_nodes` | `array` | List of federated nodes |
Each node object:
| Field | Type | Description |
|------------------|----------|------------------------------------------|
| `Name` | `string` | Node name |
| `ID` | `string` | Unique node identifier |
| `TunnelAddress` | `string` | Network tunnel address |
| `ServiceID` | `string` | Service identifier |
| `LastSeen` | `string` | ISO 8601 timestamp of last heartbeat |
#### Usage
```bash
curl http://localhost:8080/api/p2p
```
#### Example response
```json
{
"nodes": [
{
"Name": "worker-1",
"ID": "abc123",
"TunnelAddress": "192.168.1.10:9090",
"ServiceID": "worker",
"LastSeen": "2025-01-15T10:30:00Z"
}
],
"federated_nodes": [
{
"Name": "federation-1",
"ID": "def456",
"TunnelAddress": "192.168.1.20:9090",
"ServiceID": "federated",
"LastSeen": "2025-01-15T10:30:05Z"
}
]
}
```
---
### Get P2P token
- **Method:** `GET`
- **Endpoint:** `/api/p2p/token`
Returns the P2P network token used for node authentication.
#### Usage
```bash
curl http://localhost:8080/api/p2p/token
```
#### Response
Returns the token as a plain text string.
---
### List worker nodes
- **Method:** `GET`
- **Endpoint:** `/api/p2p/workers`
Returns worker nodes with online status.
#### Response
| Field | Type | Description |
|--------------------------|----------|--------------------------------------|
| `nodes` | `array` | List of worker nodes |
| `nodes[].name` | `string` | Node name |
| `nodes[].id` | `string` | Unique node identifier |
| `nodes[].tunnelAddress` | `string` | Network tunnel address |
| `nodes[].serviceID` | `string` | Service identifier |
| `nodes[].lastSeen` | `string` | Last heartbeat timestamp |
| `nodes[].isOnline` | `bool` | Whether the node is currently online |
A node is considered online if it was last seen within the past 40 seconds.
#### Usage
```bash
curl http://localhost:8080/api/p2p/workers
```
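The 40-second heartbeat window above can be reproduced client-side when working with the raw `lastSeen` timestamps from `/api/p2p`. A small sketch, assuming the timestamps are ISO 8601 in UTC as shown in the examples:

```python
from datetime import datetime, timedelta, timezone

ONLINE_WINDOW = timedelta(seconds=40)  # matches the documented heartbeat window

def is_online(last_seen_iso, now=None):
    """A node is online if its lastSeen timestamp is within the past 40 seconds."""
    # Python < 3.11 fromisoformat() does not accept a trailing 'Z', so normalize it.
    last_seen = datetime.fromisoformat(last_seen_iso.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now - last_seen <= ONLINE_WINDOW

now = datetime(2025, 1, 15, 10, 30, 30, tzinfo=timezone.utc)
print(is_online("2025-01-15T10:30:00Z", now))  # -> True  (seen 30s ago)
print(is_online("2025-01-15T10:29:00Z", now))  # -> False (seen 90s ago)
```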
---
### List federated nodes
- **Method:** `GET`
- **Endpoint:** `/api/p2p/federation`
Returns federated nodes with online status. Same response format as `/api/p2p/workers`.
#### Usage
```bash
curl http://localhost:8080/api/p2p/federation
```
---
### Get P2P statistics
- **Method:** `GET`
- **Endpoint:** `/api/p2p/stats`
Returns aggregate statistics about the P2P cluster.
#### Response
| Field | Type | Description |
|--------------------|----------|-----------------------------------|
| `workers.online` | `int` | Number of online worker nodes |
| `workers.total` | `int` | Total worker nodes |
| `federated.online` | `int` | Number of online federated nodes |
| `federated.total` | `int` | Total federated nodes |
#### Usage
```bash
curl http://localhost:8080/api/p2p/stats
```
#### Example response
```json
{
"workers": {
"online": 3,
"total": 5
},
"federated": {
"online": 2,
"total": 2
}
}
```
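The offline count per group is not returned directly, but it can be derived from the response above. A hedged one-liner sketch (the `offline_counts` helper is not part of LocalAI):

```python
def offline_counts(stats: dict) -> dict:
    """Derive offline node counts from a /api/p2p/stats response."""
    return {
        group: stats[group]["total"] - stats[group]["online"]
        for group in ("workers", "federated")
    }

stats = {"workers": {"online": 3, "total": 5}, "federated": {"online": 2, "total": 2}}
print(offline_counts(stats))  # -> {'workers': 2, 'federated': 0}
```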
## Error Responses
| Status Code | Description |
|-------------|---------------------------------------------|
| 500 | P2P subsystem not available or internal error |


@@ -0,0 +1,104 @@
+++
disableToc = false
title = "Sound Generation"
weight = 19
url = "/features/sound-generation/"
+++
LocalAI supports generating audio from text descriptions via the `/v1/sound-generation` endpoint. This endpoint is compatible with the [ElevenLabs sound generation API](https://elevenlabs.io/docs/api-reference/sound-generation) and can produce music, sound effects, and other audio content.
## API
- **Method:** `POST`
- **Endpoint:** `/v1/sound-generation`
### Request
The request body is JSON. There are two usage modes: simple and advanced.
#### Simple mode
| Parameter | Type | Required | Description |
|------------------|----------|----------|----------------------------------------------|
| `model_id` | `string` | Yes | Model identifier |
| `text` | `string` | Yes | Audio description or prompt |
| `instrumental` | `bool` | No | Generate instrumental audio (no vocals) |
| `vocal_language` | `string` | No | Language code for vocals (e.g. `bn`, `ja`) |
#### Advanced mode
| Parameter | Type | Required | Description |
|---------------------|----------|----------|-------------------------------------------------|
| `model_id` | `string` | Yes | Model identifier |
| `text` | `string` | Yes | Text prompt or description |
| `duration_seconds` | `float` | No | Target duration in seconds |
| `prompt_influence` | `float` | No | Temperature / prompt influence parameter |
| `do_sample` | `bool` | No | Enable sampling |
| `think` | `bool` | No | Enable extended thinking for generation |
| `caption` | `string` | No | Caption describing the audio |
| `lyrics` | `string` | No | Lyrics for the generated audio |
| `bpm` | `int` | No | Beats per minute |
| `keyscale` | `string` | No | Musical key/scale (e.g. `Ab major`) |
| `language` | `string` | No | Language code |
| `vocal_language` | `string` | No | Vocal language (fallback if `language` is empty) |
| `timesignature` | `string` | No | Time signature (e.g. `4`) |
| `instrumental` | `bool` | No | Generate instrumental audio (no vocals) |
### Response
Returns a binary audio file with the appropriate `Content-Type` header (e.g. `audio/wav`, `audio/mpeg`, `audio/flac`, `audio/ogg`).
## Usage
### Generate a sound effect
```bash
curl http://localhost:8080/v1/sound-generation \
-H "Content-Type: application/json" \
-d '{
"model_id": "sound-model",
"text": "rain falling on a tin roof"
}' \
--output rain.wav
```
### Generate a song with vocals
```bash
curl http://localhost:8080/v1/sound-generation \
-H "Content-Type: application/json" \
-d '{
"model_id": "sound-model",
"text": "a soft Bengali love song for a quiet evening",
"instrumental": false,
"vocal_language": "bn"
}' \
--output song.wav
```
### Generate music with advanced parameters
```bash
curl http://localhost:8080/v1/sound-generation \
-H "Content-Type: application/json" \
-d '{
"model_id": "sound-model",
"text": "upbeat pop",
"caption": "A funky Japanese disco track",
"lyrics": "[Verse 1]\nDancing in the neon lights",
"think": true,
"bpm": 120,
"duration_seconds": 225,
"keyscale": "Ab major",
"language": "ja",
"timesignature": "4"
}' \
--output disco.wav
```
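Since the endpoint returns binary audio whose format is signaled only by the `Content-Type` header, a client should pick the output file extension from that header rather than assume `.wav`. A minimal Python sketch covering the content types listed above (the `extension_for` helper is illustrative):

```python
# Mapping from the documented Content-Type values to file extensions.
AUDIO_EXTENSIONS = {
    "audio/wav": ".wav",
    "audio/mpeg": ".mp3",
    "audio/flac": ".flac",
    "audio/ogg": ".ogg",
}

def extension_for(content_type: str) -> str:
    """Pick a file extension for the returned audio; parameters after ';' are ignored."""
    media_type = content_type.split(";")[0].strip().lower()
    return AUDIO_EXTENSIONS.get(media_type, ".bin")

print(extension_for("audio/mpeg"))                 # -> .mp3
print(extension_for("audio/wav; charset=binary"))  # -> .wav
```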
## Error Responses
| Status Code | Description |
|-------------|--------------------------------------------------|
| 400 | Missing or invalid model or request parameters |
| 500 | Backend error during sound generation |


@@ -0,0 +1,115 @@
+++
disableToc = false
title = "Video Generation"
weight = 18
url = "/features/video-generation/"
+++
LocalAI can generate videos from text prompts and optional reference images via the `/video` endpoint. Supported backends include `diffusers`, `stablediffusion`, and `vllm-omni`.
## API
- **Method:** `POST`
- **Endpoint:** `/video`
### Request
The request body is JSON with the following fields:
| Parameter | Type | Required | Default | Description |
|-------------------|----------|----------|---------|----------------------------------------------------------|
| `model` | `string` | Yes | | Model name to use |
| `prompt` | `string` | Yes | | Text description of the video to generate |
| `negative_prompt` | `string` | No | | What to exclude from the generated video |
| `start_image` | `string` | No | | Starting image as base64 string or URL |
| `end_image` | `string` | No | | Ending image for guided generation |
| `width` | `int` | No | 512 | Video width in pixels |
| `height` | `int` | No | 512 | Video height in pixels |
| `num_frames` | `int` | No | | Number of frames |
| `fps` | `int` | No | | Frames per second |
| `seconds` | `string` | No | | Duration in seconds |
| `size` | `string` | No | | Size specification (alternative to width/height) |
| `input_reference` | `string` | No | | Input reference for the generation |
| `seed` | `int` | No | | Random seed for reproducibility |
| `cfg_scale` | `float` | No | | Classifier-free guidance scale |
| `step` | `int` | No | | Number of inference steps |
| `response_format` | `string` | No | `url` | `url` to return a file URL, `b64_json` for base64 output |
### Response
Returns an OpenAI-compatible JSON response:
| Field | Type | Description |
|-----------------|----------|------------------------------------------------|
| `created` | `int` | Unix timestamp of generation |
| `id` | `string` | Unique identifier (UUID) |
| `data` | `array` | Array of generated video items |
| `data[].url` | `string` | URL path to video file (if `response_format` is `url`) |
| `data[].b64_json` | `string` | Base64-encoded video (if `response_format` is `b64_json`) |
## Usage
### Generate a video from a text prompt
```bash
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "A cat playing in a garden on a sunny day",
"width": 512,
"height": 512,
"num_frames": 16,
"fps": 8
}'
```
### Example response
```json
{
"created": 1709900000,
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"data": [
{
"url": "/generated-videos/abc123.mp4"
}
]
}
```
### Generate with a starting image
```bash
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "A timelapse of flowers blooming",
"start_image": "https://example.com/flowers.jpg",
"num_frames": 24,
"fps": 12,
"seed": 42,
"cfg_scale": 7.5,
"step": 30
}'
```
### Get base64-encoded output
```bash
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "Ocean waves on a beach",
"response_format": "b64_json"
}'
```
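When `response_format` is `b64_json`, the client must decode the payload itself before writing a video file. A minimal sketch using only the standard library; the dummy bytes below stand in for real video data and are not a valid MP4:

```python
import base64

def save_video(item: dict, path: str) -> None:
    """Persist one data[] item from a /video response (b64_json case only).

    The `url` case would need an HTTP fetch against the server and is omitted here.
    """
    with open(path, "wb") as f:
        f.write(base64.b64decode(item["b64_json"]))

# Round-trip illustration with placeholder bytes:
fake = {"b64_json": base64.b64encode(b"\x00\x00\x00\x18ftypmp42").decode()}
save_video(fake, "clip.mp4")
```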
## Error Responses
| Status Code | Description |
|-------------|------------------------------------------------------|
| 400 | Missing or invalid model or request parameters |
| 500 | Backend error during video generation |


@@ -0,0 +1,87 @@
+++
disableToc = false
title = "Voice Activity Detection (VAD)"
weight = 17
url = "/features/voice-activity-detection/"
+++
Voice Activity Detection (VAD) identifies segments of speech in audio data. LocalAI provides a `/v1/vad` endpoint powered by the [Silero VAD](https://github.com/snakers4/silero-vad) backend.
## API
- **Method:** `POST`
- **Endpoints:** `/v1/vad`, `/vad`
### Request
The request body is JSON with the following fields:
| Parameter | Type | Required | Description |
|-----------|------------|----------|------------------------------------------|
| `model` | `string` | Yes | Model name (e.g. `silero-vad`) |
| `audio` | `float32[]`| Yes | Array of audio samples (16kHz PCM float) |
### Response
Returns a JSON object with detected speech segments:
| Field | Type | Description |
|--------------------|-----------|------------------------------------|
| `segments` | `array` | List of detected speech segments |
| `segments[].start` | `float` | Start time in seconds |
| `segments[].end` | `float` | End time in seconds |
## Usage
### Example request
```bash
curl http://localhost:8080/v1/vad \
-H "Content-Type: application/json" \
-d '{
"model": "silero-vad",
"audio": [0.0012, -0.0045, 0.0053, -0.0021, ...]
}'
```
### Example response
```json
{
"segments": [
{
"start": 0.5,
"end": 2.3
},
{
"start": 3.1,
"end": 5.8
}
]
}
```
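The `audio` field expects raw float samples, which most tooling will not produce directly. A hedged standard-library sketch for converting a 16 kHz mono 16-bit PCM WAV file into that array (the helper name and the demo file are illustrative):

```python
import struct
import wave

def wav_to_float_samples(path: str) -> list:
    """Read a 16 kHz mono 16-bit PCM WAV file and scale samples to [-1.0, 1.0]."""
    with wave.open(path, "rb") as wav:
        assert wav.getframerate() == 16000 and wav.getnchannels() == 1
        assert wav.getsampwidth() == 2  # 16-bit PCM
        raw = wav.readframes(wav.getnframes())
    ints = struct.unpack(f"<{len(raw) // 2}h", raw)
    return [s / 32768.0 for s in ints]

# Demo: synthesize a tiny 4-sample WAV, then convert it.
with wave.open("probe.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

samples = wav_to_float_samples("probe.wav")
print(samples)  # -> [0.0, 0.030517578125, -0.030517578125, 0.0]
```

The resulting list can be placed directly into the `audio` field of the JSON request shown above.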
## Model Configuration
Create a YAML configuration file for the VAD model:
```yaml
name: silero-vad
backend: silero-vad
```
## Detection Parameters
The Silero VAD backend uses the following internal defaults:
- **Sample rate:** 16kHz
- **Threshold:** 0.5
- **Min silence duration:** 100ms
- **Speech pad duration:** 30ms
## Error Responses
| Status Code | Description |
|-------------|---------------------------------------------------|
| 400 | Missing or invalid `model` or `audio` field |
| 500 | Backend error during VAD processing |


@@ -0,0 +1,85 @@
+++
disableToc = false
title = "System Info and Version"
weight = 23
url = "/reference/system-info/"
+++
LocalAI provides endpoints to inspect the running instance, including available backends, loaded models, and version information.
## System Information
- **Method:** `GET`
- **Endpoint:** `/system`
Returns available backends and currently loaded models.
### Response
| Field | Type | Description |
|-----------------|----------|-------------------------------------------|
| `backends` | `array` | List of available backend names (strings) |
| `loaded_models` | `array` | List of currently loaded models |
| `loaded_models[].id` | `string` | Model identifier |
### Usage
```bash
curl http://localhost:8080/system
```
### Example response
```json
{
"backends": [
"llama-cpp",
"huggingface",
"diffusers",
"whisper"
],
"loaded_models": [
{
"id": "my-llama-model"
},
{
"id": "whisper-1"
}
]
}
```
---
## Version
- **Method:** `GET`
- **Endpoint:** `/version`
Returns the LocalAI version and build commit.
### Response
| Field | Type | Description |
|-----------|----------|-------------------------------------------------|
| `version` | `string` | Version string in the format `version (commit)` |
### Usage
```bash
curl http://localhost:8080/version
```
### Example response
```json
{
"version": "2.26.0 (a1b2c3d4)"
}
```
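Since the version and commit are packed into a single `version (commit)` string, clients that need them separately must split the string themselves. A small sketch; the `parse_version` helper and its fallback behavior are assumptions, not LocalAI API surface:

```python
import re

def parse_version(payload: dict):
    """Split the documented `version (commit)` string into its two parts."""
    match = re.fullmatch(r"(\S+) \(([0-9a-f]+)\)", payload["version"])
    if not match:
        return payload["version"], None  # tolerate bare version strings
    return match.group(1), match.group(2)

print(parse_version({"version": "2.26.0 (a1b2c3d4)"}))  # -> ('2.26.0', 'a1b2c3d4')
```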
## Error Responses
| Status Code | Description |
|-------------|------------------------------|
| 500 | Internal server error |