From 9090bca9203bde85041c76c92fcfd4eef1f5c2c9 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 8 Mar 2026 17:59:33 +0100 Subject: [PATCH] feat: Add documentation for undocumented API endpoints (#8852) * feat: add documentation for undocumented API endpoints Creates comprehensive documentation for 8 previously undocumented endpoints: - Voice Activity Detection (/v1/vad) - Video Generation (/video) - Sound Generation (/v1/sound-generation) - Backend Monitor (/backend/monitor, /backend/shutdown) - Token Metrics (/tokenMetrics) - P2P endpoints (/api/p2p/* - 5 sub-endpoints) - System Info (/system, /version) Each documentation file includes HTTP method, request/response schemas, curl examples, sample JSON responses, and error codes. * docs: remove token-metrics endpoint documentation per review feedback The token-metrics endpoint is not wired into the HTTP router and should not be documented per reviewer request. Co-Authored-By: Claude Opus 4.6 * docs: move system-info documentation to reference section Per review feedback, system-info endpoint docs are better suited for the reference section rather than features. 
Co-Authored-By: Claude Opus 4.6 --------- Co-authored-by: localai-bot Co-authored-by: Claude Opus 4.6 --- docs/content/features/_index.en.md | 6 + docs/content/features/backend-monitor.md | 93 ++++++++++ docs/content/features/p2p.md | 175 ++++++++++++++++++ docs/content/features/sound-generation.md | 104 +++++++++++ docs/content/features/video-generation.md | 115 ++++++++++++ .../features/voice-activity-detection.md | 87 +++++++++ docs/content/reference/system-info.md | 85 +++++++++ 7 files changed, 665 insertions(+) create mode 100644 docs/content/features/backend-monitor.md create mode 100644 docs/content/features/p2p.md create mode 100644 docs/content/features/sound-generation.md create mode 100644 docs/content/features/video-generation.md create mode 100644 docs/content/features/voice-activity-detection.md create mode 100644 docs/content/reference/system-info.md diff --git a/docs/content/features/_index.en.md b/docs/content/features/_index.en.md index 7fe23127d..22adca087 100644 --- a/docs/content/features/_index.en.md +++ b/docs/content/features/_index.en.md @@ -14,6 +14,10 @@ LocalAI provides a comprehensive set of features for running AI models locally. 
- **[Text Generation](text-generation/)** - Generate text with GPT-compatible models using various backends - **[Image Generation](image-generation/)** - Create images with Stable Diffusion and other diffusion models - **[Audio Processing](audio-to-text/)** - Transcribe audio to text and generate speech from text +- **[Text to Audio](text-to-audio/)** - Generate speech from text with TTS models +- **[Sound Generation](sound-generation/)** - Generate music and sound effects from text descriptions +- **[Voice Activity Detection](voice-activity-detection/)** - Detect speech segments in audio data +- **[Video Generation](video-generation/)** - Generate videos from text prompts and reference images - **[Embeddings](embeddings/)** - Generate vector embeddings for semantic search and RAG applications - **[GPT Vision](gpt-vision/)** - Analyze and understand images with vision-language models @@ -24,6 +28,7 @@ LocalAI provides a comprehensive set of features for running AI models locally. - **[Constrained Grammars](constrained_grammars/)** - Control model output format with BNF grammars - **[GPU Acceleration](GPU-acceleration/)** - Optimize performance with GPU support - **[Distributed Inference](distributed_inferencing/)** - Scale inference across multiple nodes +- **[P2P API](p2p/)** - Monitor and manage P2P worker and federated nodes - **[Model Context Protocol (MCP)](mcp/)** - Enable agentic capabilities with MCP integration - **[Agents](agents/)** - Autonomous AI agents with tools, knowledge base, and skills @@ -34,6 +39,7 @@ LocalAI provides a comprehensive set of features for running AI models locally. 
- **[Stores](stores/)** - Vector similarity search for embeddings - **[Model Gallery](model-gallery/)** - Browse and install pre-configured models - **[Backends](backends/)** - Learn about available backends and how to manage them +- **[Backend Monitor](backend-monitor/)** - Monitor backend status and resource usage - **[Runtime Settings](runtime-settings/)** - Configure application settings via web UI without restarting ## Getting Started diff --git a/docs/content/features/backend-monitor.md b/docs/content/features/backend-monitor.md new file mode 100644 index 000000000..b1d16eab6 --- /dev/null +++ b/docs/content/features/backend-monitor.md @@ -0,0 +1,93 @@ ++++ +disableToc = false +title = "Backend Monitor" +weight = 20 +url = "/features/backend-monitor/" ++++ + +LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` allows stopping a model's backend process. + +## Monitor API + +- **Method:** `GET` +- **Endpoints:** `/backend/monitor`, `/v1/backend/monitor` + +### Request + +The request body is JSON: + +| Parameter | Type | Required | Description | +|-----------|----------|----------|--------------------------------| +| `model` | `string` | Yes | Name of the model to monitor | + +### Response + +Returns a JSON object with the backend status: + +| Field | Type | Description | +|----------------------|----------|-------------------------------------------------------| +| `state` | `int` | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error | +| `memory` | `object` | Memory usage information | +| `memory.total` | `uint64` | Total memory usage in bytes | +| `memory.breakdown` | `object` | Per-component memory breakdown (key-value pairs) | + +If the gRPC status call fails, the endpoint falls back to local process metrics: + +| Field | Type | Description | +|------------------|---------|--------------------------------| +| 
`memory_info` | `object`| Process memory info (RSS, VMS) | +| `memory_percent` | `float` | Memory usage percentage | +| `cpu_percent` | `float` | CPU usage percentage | + +### Usage + +```bash +curl -X GET http://localhost:8080/backend/monitor \ + -H "Content-Type: application/json" \ + -d '{"model": "my-model"}' +``` + +### Example response + +```json +{ + "state": 2, + "memory": { + "total": 1073741824, + "breakdown": { + "weights": 536870912, + "kv_cache": 268435456 + } + } +} +``` + +## Shutdown API + +- **Method:** `POST` +- **Endpoints:** `/backend/shutdown`, `/v1/backend/shutdown` + +### Request + +| Parameter | Type | Required | Description | +|-----------|----------|----------|---------------------------------| +| `model` | `string` | Yes | Name of the model to shut down | + +### Usage + +```bash +curl -X POST http://localhost:8080/backend/shutdown \ + -H "Content-Type: application/json" \ + -d '{"model": "my-model"}' +``` + +### Response + +Returns `200 OK` with the shutdown confirmation message on success. + +## Error Responses + +| Status Code | Description | +|-------------|------------------------------------------------| +| 400 | Invalid or missing model name | +| 500 | Backend error or model not loaded | diff --git a/docs/content/features/p2p.md b/docs/content/features/p2p.md new file mode 100644 index 000000000..5016c4efe --- /dev/null +++ b/docs/content/features/p2p.md @@ -0,0 +1,175 @@ ++++ +disableToc = false +title = "P2P API" +weight = 22 +url = "/features/p2p/" ++++ + +LocalAI supports peer-to-peer (P2P) networking for distributed inference. The P2P API endpoints allow you to monitor connected worker and federated nodes, retrieve the P2P network token, and get cluster statistics. + +For an overview of distributed inference setup, see [Distributed Inference](/features/distributed_inferencing/). + +## Endpoints + +### List all P2P nodes + +- **Method:** `GET` +- **Endpoint:** `/api/p2p` + +Returns all worker and federated nodes in the P2P network. 
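Client-side, the `LastSeen` heartbeat timestamps can be used to gauge how fresh each node is. A minimal Python sketch, using an illustrative payload shaped like this endpoint's response (all values hypothetical):

```python
from datetime import datetime, timezone

def last_seen_age(node: dict, now: datetime) -> float:
    """Seconds elapsed since the node's last heartbeat (`LastSeen`, ISO 8601)."""
    seen = datetime.fromisoformat(node["LastSeen"].replace("Z", "+00:00"))
    return (now - seen).total_seconds()

# Illustrative payload shaped like the /api/p2p response
payload = {
    "nodes": [
        {"Name": "worker-1", "ID": "abc123", "LastSeen": "2025-01-15T10:30:00Z"}
    ],
    "federated_nodes": [],
}

now = datetime(2025, 1, 15, 10, 30, 30, tzinfo=timezone.utc)
for node in payload["nodes"]:
    print(node["Name"], last_seen_age(node, now))  # worker-1 30.0
```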
+ +#### Response + +| Field | Type | Description | +|--------------------|---------|--------------------------------------| +| `nodes` | `array` | List of worker nodes | +| `federated_nodes` | `array` | List of federated nodes | + +Each node object: + +| Field | Type | Description | +|------------------|----------|------------------------------------------| +| `Name` | `string` | Node name | +| `ID` | `string` | Unique node identifier | +| `TunnelAddress` | `string` | Network tunnel address | +| `ServiceID` | `string` | Service identifier | +| `LastSeen` | `string` | ISO 8601 timestamp of last heartbeat | + +#### Usage + +```bash +curl http://localhost:8080/api/p2p +``` + +#### Example response + +```json +{ + "nodes": [ + { + "Name": "worker-1", + "ID": "abc123", + "TunnelAddress": "192.168.1.10:9090", + "ServiceID": "worker", + "LastSeen": "2025-01-15T10:30:00Z" + } + ], + "federated_nodes": [ + { + "Name": "federation-1", + "ID": "def456", + "TunnelAddress": "192.168.1.20:9090", + "ServiceID": "federated", + "LastSeen": "2025-01-15T10:30:05Z" + } + ] +} +``` + +--- + +### Get P2P token + +- **Method:** `GET` +- **Endpoint:** `/api/p2p/token` + +Returns the P2P network token used for node authentication. + +#### Usage + +```bash +curl http://localhost:8080/api/p2p/token +``` + +#### Response + +Returns the token as a plain text string. + +--- + +### List worker nodes + +- **Method:** `GET` +- **Endpoint:** `/api/p2p/workers` + +Returns worker nodes with online status. 
+ +#### Response + +| Field | Type | Description | +|--------------------------|----------|--------------------------------------| +| `nodes` | `array` | List of worker nodes | +| `nodes[].name` | `string` | Node name | +| `nodes[].id` | `string` | Unique node identifier | +| `nodes[].tunnelAddress` | `string` | Network tunnel address | +| `nodes[].serviceID` | `string` | Service identifier | +| `nodes[].lastSeen` | `string` | Last heartbeat timestamp | +| `nodes[].isOnline` | `bool` | Whether the node is currently online | + +A node is considered online if it was last seen within the past 40 seconds. + +#### Usage + +```bash +curl http://localhost:8080/api/p2p/workers +``` + +--- + +### List federated nodes + +- **Method:** `GET` +- **Endpoint:** `/api/p2p/federation` + +Returns federated nodes with online status. Same response format as `/api/p2p/workers`. + +#### Usage + +```bash +curl http://localhost:8080/api/p2p/federation +``` + +--- + +### Get P2P statistics + +- **Method:** `GET` +- **Endpoint:** `/api/p2p/stats` + +Returns aggregate statistics about the P2P cluster. 
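These aggregate counts mirror what a client could compute itself from `/api/p2p/workers` and `/api/p2p/federation`, applying the same 40-second online window the server uses. A minimal Python sketch (node objects illustrative):

```python
from datetime import datetime, timedelta, timezone

ONLINE_WINDOW = timedelta(seconds=40)  # matches the server's last-seen threshold

def count_online(nodes: list, now: datetime) -> dict:
    """Count nodes whose last heartbeat falls within the online window."""
    online = sum(
        1
        for node in nodes
        if now - datetime.fromisoformat(node["lastSeen"].replace("Z", "+00:00"))
        <= ONLINE_WINDOW
    )
    return {"online": online, "total": len(nodes)}

now = datetime(2025, 1, 15, 10, 31, 0, tzinfo=timezone.utc)
workers = [
    {"name": "worker-1", "lastSeen": "2025-01-15T10:30:50Z"},  # 10s ago: online
    {"name": "worker-2", "lastSeen": "2025-01-15T10:29:00Z"},  # 120s ago: offline
]
print(count_online(workers, now))  # {'online': 1, 'total': 2}
```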
+ +#### Response + +| Field | Type | Description | +|--------------------|----------|-----------------------------------| +| `workers.online` | `int` | Number of online worker nodes | +| `workers.total` | `int` | Total worker nodes | +| `federated.online` | `int` | Number of online federated nodes | +| `federated.total` | `int` | Total federated nodes | + +#### Usage + +```bash +curl http://localhost:8080/api/p2p/stats +``` + +#### Example response + +```json +{ + "workers": { + "online": 3, + "total": 5 + }, + "federated": { + "online": 2, + "total": 2 + } +} +``` + +## Error Responses + +| Status Code | Description | +|-------------|---------------------------------------------| +| 500 | P2P subsystem not available or internal error | diff --git a/docs/content/features/sound-generation.md b/docs/content/features/sound-generation.md new file mode 100644 index 000000000..a7478d3a1 --- /dev/null +++ b/docs/content/features/sound-generation.md @@ -0,0 +1,104 @@ ++++ +disableToc = false +title = "Sound Generation" +weight = 19 +url = "/features/sound-generation/" ++++ + +LocalAI supports generating audio from text descriptions via the `/v1/sound-generation` endpoint. This endpoint is compatible with the [ElevenLabs sound generation API](https://elevenlabs.io/docs/api-reference/sound-generation) and can produce music, sound effects, and other audio content. + +## API + +- **Method:** `POST` +- **Endpoint:** `/v1/sound-generation` + +### Request + +The request body is JSON. There are two usage modes: simple and advanced. + +#### Simple mode + +| Parameter | Type | Required | Description | +|------------------|----------|----------|----------------------------------------------| +| `model_id` | `string` | Yes | Model identifier | +| `text` | `string` | Yes | Audio description or prompt | +| `instrumental` | `bool` | No | Generate instrumental audio (no vocals) | +| `vocal_language` | `string` | No | Language code for vocals (e.g. 
`bn`, `ja`) | + +#### Advanced mode + +| Parameter | Type | Required | Description | +|---------------------|----------|----------|-------------------------------------------------| +| `model_id` | `string` | Yes | Model identifier | +| `text` | `string` | Yes | Text prompt or description | +| `duration_seconds` | `float` | No | Target duration in seconds | +| `prompt_influence` | `float` | No | Temperature / prompt influence parameter | +| `do_sample` | `bool` | No | Enable sampling | +| `think` | `bool` | No | Enable extended thinking for generation | +| `caption` | `string` | No | Caption describing the audio | +| `lyrics` | `string` | No | Lyrics for the generated audio | +| `bpm` | `int` | No | Beats per minute | +| `keyscale` | `string` | No | Musical key/scale (e.g. `Ab major`) | +| `language` | `string` | No | Language code | +| `vocal_language` | `string` | No | Vocal language (fallback if `language` is empty) | +| `timesignature` | `string` | No | Time signature (e.g. `4`) | +| `instrumental` | `bool` | No | Generate instrumental audio (no vocals) | + +### Response + +Returns a binary audio file with the appropriate `Content-Type` header (e.g. `audio/wav`, `audio/mpeg`, `audio/flac`, `audio/ogg`). 
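The curl calls below can equally be issued from Python using only the standard library; a minimal sketch (the base URL and `model_id` are placeholders, and the commented-out call requires a running server):

```python
import json
import urllib.request

def build_sound_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for /v1/sound-generation; the response body is raw audio."""
    return urllib.request.Request(
        f"{base_url}/v1/sound-generation",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_sound_request(
    "http://localhost:8080",
    {"model_id": "sound-model", "text": "rain falling on a tin roof"},
)
print(req.full_url, req.get_method())  # http://localhost:8080/v1/sound-generation POST
# with urllib.request.urlopen(req) as resp:
#     open("rain.wav", "wb").write(resp.read())
```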
+ +## Usage + +### Generate a sound effect + +```bash +curl http://localhost:8080/v1/sound-generation \ + -H "Content-Type: application/json" \ + -d '{ + "model_id": "sound-model", + "text": "rain falling on a tin roof" + }' \ + --output rain.wav +``` + +### Generate a song with vocals + +```bash +curl http://localhost:8080/v1/sound-generation \ + -H "Content-Type: application/json" \ + -d '{ + "model_id": "sound-model", + "text": "a soft Bengali love song for a quiet evening", + "instrumental": false, + "vocal_language": "bn" + }' \ + --output song.wav +``` + +### Generate music with advanced parameters + +```bash +curl http://localhost:8080/v1/sound-generation \ + -H "Content-Type: application/json" \ + -d '{ + "model_id": "sound-model", + "text": "upbeat pop", + "caption": "A funky Japanese disco track", + "lyrics": "[Verse 1]\nDancing in the neon lights", + "think": true, + "bpm": 120, + "duration_seconds": 225, + "keyscale": "Ab major", + "language": "ja", + "timesignature": "4" + }' \ + --output disco.wav +``` + +## Error Responses + +| Status Code | Description | +|-------------|--------------------------------------------------| +| 400 | Missing or invalid model or request parameters | +| 500 | Backend error during sound generation | diff --git a/docs/content/features/video-generation.md b/docs/content/features/video-generation.md new file mode 100644 index 000000000..5bde79445 --- /dev/null +++ b/docs/content/features/video-generation.md @@ -0,0 +1,115 @@ ++++ +disableToc = false +title = "Video Generation" +weight = 18 +url = "/features/video-generation/" ++++ + +LocalAI can generate videos from text prompts and optional reference images via the `/video` endpoint. Supported backends include `diffusers`, `stablediffusion`, and `vllm-omni`. 
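Depending on `response_format` (documented below), the `data` items carry either a URL path or base64-encoded bytes. A minimal Python sketch for persisting either form (item contents and file name are illustrative, not real video data):

```python
import base64

def save_video(item: dict, base_url: str, out_path: str) -> None:
    """Persist one entry of the response's `data` array."""
    if "b64_json" in item:
        # response_format=b64_json: the payload is base64-encoded video bytes
        with open(out_path, "wb") as f:
            f.write(base64.b64decode(item["b64_json"]))
    else:
        # response_format=url: the server returns a path to fetch separately
        print("download:", base_url + item["url"])

# Illustrative item (placeholder bytes, not a playable video)
fake = base64.b64encode(b"\x00\x00\x00\x18ftyp").decode()
save_video({"b64_json": fake}, "http://localhost:8080", "clip.mp4")
```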
+ +## API + +- **Method:** `POST` +- **Endpoint:** `/video` + +### Request + +The request body is JSON with the following fields: + +| Parameter | Type | Required | Default | Description | +|-------------------|----------|----------|---------|----------------------------------------------------------| +| `model` | `string` | Yes | | Model name to use | +| `prompt` | `string` | Yes | | Text description of the video to generate | +| `negative_prompt` | `string` | No | | What to exclude from the generated video | +| `start_image` | `string` | No | | Starting image as base64 string or URL | +| `end_image` | `string` | No | | Ending image for guided generation | +| `width` | `int` | No | 512 | Video width in pixels | +| `height` | `int` | No | 512 | Video height in pixels | +| `num_frames` | `int` | No | | Number of frames | +| `fps` | `int` | No | | Frames per second | +| `seconds` | `string` | No | | Duration in seconds | +| `size` | `string` | No | | Size specification (alternative to width/height) | +| `input_reference` | `string` | No | | Input reference for the generation | +| `seed` | `int` | No | | Random seed for reproducibility | +| `cfg_scale` | `float` | No | | Classifier-free guidance scale | +| `step` | `int` | No | | Number of inference steps | +| `response_format` | `string` | No | `url` | `url` to return a file URL, `b64_json` for base64 output | + +### Response + +Returns an OpenAI-compatible JSON response: + +| Field | Type | Description | +|-----------------|----------|------------------------------------------------| +| `created` | `int` | Unix timestamp of generation | +| `id` | `string` | Unique identifier (UUID) | +| `data` | `array` | Array of generated video items | +| `data[].url` | `string` | URL path to video file (if `response_format` is `url`) | +| `data[].b64_json` | `string` | Base64-encoded video (if `response_format` is `b64_json`) | + +## Usage + +### Generate a video from a text prompt + +```bash +curl http://localhost:8080/video \ + 
-H "Content-Type: application/json" \ + -d '{ + "model": "video-model", + "prompt": "A cat playing in a garden on a sunny day", + "width": 512, + "height": 512, + "num_frames": 16, + "fps": 8 + }' +``` + +### Example response + +```json +{ + "created": 1709900000, + "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", + "data": [ + { + "url": "/generated-videos/abc123.mp4" + } + ] +} +``` + +### Generate with a starting image + +```bash +curl http://localhost:8080/video \ + -H "Content-Type: application/json" \ + -d '{ + "model": "video-model", + "prompt": "A timelapse of flowers blooming", + "start_image": "https://example.com/flowers.jpg", + "num_frames": 24, + "fps": 12, + "seed": 42, + "cfg_scale": 7.5, + "step": 30 + }' +``` + +### Get base64-encoded output + +```bash +curl http://localhost:8080/video \ + -H "Content-Type: application/json" \ + -d '{ + "model": "video-model", + "prompt": "Ocean waves on a beach", + "response_format": "b64_json" + }' +``` + +## Error Responses + +| Status Code | Description | +|-------------|------------------------------------------------------| +| 400 | Missing or invalid model or request parameters | +| 500 | Backend error during video generation | diff --git a/docs/content/features/voice-activity-detection.md b/docs/content/features/voice-activity-detection.md new file mode 100644 index 000000000..a7ccc9600 --- /dev/null +++ b/docs/content/features/voice-activity-detection.md @@ -0,0 +1,87 @@ ++++ +disableToc = false +title = "Voice Activity Detection (VAD)" +weight = 17 +url = "/features/voice-activity-detection/" ++++ + +Voice Activity Detection (VAD) identifies segments of speech in audio data. LocalAI provides a `/v1/vad` endpoint powered by the [Silero VAD](https://github.com/snakers4/silero-vad) backend. 
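Because the endpoint takes raw PCM floats in a JSON array, a test signal can be synthesized in a few lines of Python at the expected 16 kHz sample rate (the tone parameters here are arbitrary):

```python
import json
import math

SAMPLE_RATE = 16000  # the VAD backend expects 16 kHz PCM float samples

def tone(freq_hz: float, seconds: float, amplitude: float = 0.1) -> list:
    """Generate a sine tone as a list of float samples in [-1, 1]."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

body = json.dumps({"model": "silero-vad", "audio": tone(440.0, 1.0)})
print(len(json.loads(body)["audio"]))  # 16000 samples for one second
```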
+ +## API + +- **Method:** `POST` +- **Endpoints:** `/v1/vad`, `/vad` + +### Request + +The request body is JSON with the following fields: + +| Parameter | Type | Required | Description | +|-----------|------------|----------|------------------------------------------| +| `model` | `string` | Yes | Model name (e.g. `silero-vad`) | +| `audio` | `float32[]`| Yes | Array of audio samples (16kHz PCM float) | + +### Response + +Returns a JSON object with detected speech segments: + +| Field | Type | Description | +|--------------------|-----------|------------------------------------| +| `segments` | `array` | List of detected speech segments | +| `segments[].start` | `float` | Start time in seconds | +| `segments[].end` | `float` | End time in seconds | + +## Usage + +### Example request + +```bash +curl http://localhost:8080/v1/vad \ + -H "Content-Type: application/json" \ + -d '{ + "model": "silero-vad", + "audio": [0.0012, -0.0045, 0.0053, -0.0021, ...] + }' +``` + +### Example response + +```json +{ + "segments": [ + { + "start": 0.5, + "end": 2.3 + }, + { + "start": 3.1, + "end": 5.8 + } + ] +} +``` + +## Model Configuration + +Create a YAML configuration file for the VAD model: + +```yaml +name: silero-vad +backend: silero-vad +``` + +## Detection Parameters + +The Silero VAD backend uses the following internal defaults: + +- **Sample rate:** 16kHz +- **Threshold:** 0.5 +- **Min silence duration:** 100ms +- **Speech pad duration:** 30ms + +## Error Responses + +| Status Code | Description | +|-------------|---------------------------------------------------| +| 400 | Missing or invalid `model` or `audio` field | +| 500 | Backend error during VAD processing | diff --git a/docs/content/reference/system-info.md b/docs/content/reference/system-info.md new file mode 100644 index 000000000..60a337e48 --- /dev/null +++ b/docs/content/reference/system-info.md @@ -0,0 +1,85 @@ ++++ +disableToc = false +title = "System Info and Version" +weight = 23 +url = 
"/reference/system-info/" ++++ + +LocalAI provides endpoints to inspect the running instance, including available backends, loaded models, and version information. + +## System Information + +- **Method:** `GET` +- **Endpoint:** `/system` + +Returns available backends and currently loaded models. + +### Response + +| Field | Type | Description | +|-----------------|----------|-------------------------------------------| +| `backends` | `array` | List of available backend names (strings) | +| `loaded_models` | `array` | List of currently loaded models | +| `loaded_models[].id` | `string` | Model identifier | + +### Usage + +```bash +curl http://localhost:8080/system +``` + +### Example response + +```json +{ + "backends": [ + "llama-cpp", + "huggingface", + "diffusers", + "whisper" + ], + "loaded_models": [ + { + "id": "my-llama-model" + }, + { + "id": "whisper-1" + } + ] +} +``` + +--- + +## Version + +- **Method:** `GET` +- **Endpoint:** `/version` + +Returns the LocalAI version and build commit. + +### Response + +| Field | Type | Description | +|-----------|----------|-------------------------------------------------| +| `version` | `string` | Version string in the format `version (commit)` | + +### Usage + +```bash +curl http://localhost:8080/version +``` + +### Example response + +```json +{ + "version": "2.26.0 (a1b2c3d4)" +} +``` + +## Error Responses + +| Status Code | Description | +|-------------|------------------------------| +| 500 | Internal server error |
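If a client needs the version and commit separately, the combined `version (commit)` string splits cleanly; a minimal Python sketch (format per the example above):

```python
def split_version(v: str) -> tuple:
    """Split a 'version (commit)' string into its two components."""
    if v.endswith(")") and " (" in v:
        version, commit = v.rsplit(" (", 1)
        return version, commit[:-1]  # drop the trailing ')'
    return v, ""  # no commit suffix present

print(split_version("2.26.0 (a1b2c3d4)"))  # ('2.26.0', 'a1b2c3d4')
```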