mirror of
https://github.com/mudler/LocalAI.git
synced 2026-03-31 21:25:59 -04:00
* feat: add documentation for undocumented API endpoints Creates comprehensive documentation for 8 previously undocumented endpoints: - Voice Activity Detection (/v1/vad) - Video Generation (/video) - Sound Generation (/v1/sound-generation) - Backend Monitor (/backend/monitor, /backend/shutdown) - Token Metrics (/tokenMetrics) - P2P endpoints (/api/p2p/* - 5 sub-endpoints) - System Info (/system, /version) Each documentation file includes HTTP method, request/response schemas, curl examples, sample JSON responses, and error codes. * docs: remove token-metrics endpoint documentation per review feedback The token-metrics endpoint is not wired into the HTTP router and should not be documented per reviewer request. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: move system-info documentation to reference section Per review feedback, system-info endpoint docs are better suited for the reference section rather than features. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: localai-bot <localai-bot@noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
4.4 KiB
4.4 KiB
+++ disableToc = false title = "Video Generation" weight = 18 url = "/features/video-generation/" +++
LocalAI can generate videos from text prompts and optional reference images via the /video endpoint. Supported backends include diffusers, stablediffusion, and vllm-omni.
API
- Method:
POST - Endpoint:
/video
Request
The request body is JSON with the following fields:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
model |
string |
Yes | Model name to use | |
prompt |
string |
Yes | Text description of the video to generate | |
negative_prompt |
string |
No | What to exclude from the generated video | |
start_image |
string |
No | Starting image as base64 string or URL | |
end_image |
string |
No | Ending image for guided generation | |
width |
int |
No | 512 | Video width in pixels |
height |
int |
No | 512 | Video height in pixels |
num_frames |
int |
No | Number of frames | |
fps |
int |
No | Frames per second | |
seconds |
string |
No | Duration in seconds | |
size |
string |
No | Size specification (alternative to width/height) | |
input_reference |
string |
No | Input reference for the generation | |
seed |
int |
No | Random seed for reproducibility | |
cfg_scale |
float |
No | Classifier-free guidance scale | |
step |
int |
No | Number of inference steps | |
response_format |
string |
No | url |
url to return a file URL, b64_json for base64 output |
Response
Returns an OpenAI-compatible JSON response:
| Field | Type | Description |
|---|---|---|
created |
int |
Unix timestamp of generation |
id |
string |
Unique identifier (UUID) |
data |
array |
Array of generated video items |
data[].url |
string |
URL path to video file (if response_format is url) |
data[].b64_json |
string |
Base64-encoded video (if response_format is b64_json) |
Usage
Generate a video from a text prompt
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "A cat playing in a garden on a sunny day",
"width": 512,
"height": 512,
"num_frames": 16,
"fps": 8
}'
Example response
{
"created": 1709900000,
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"data": [
{
"url": "/generated-videos/abc123.mp4"
}
]
}
Generate with a starting image
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "A timelapse of flowers blooming",
"start_image": "https://example.com/flowers.jpg",
"num_frames": 24,
"fps": 12,
"seed": 42,
"cfg_scale": 7.5,
"step": 30
}'
Get base64-encoded output
curl http://localhost:8080/video \
-H "Content-Type: application/json" \
-d '{
"model": "video-model",
"prompt": "Ocean waves on a beach",
"response_format": "b64_json"
}'
Error Responses
| Status Code | Description |
|---|---|
| 400 | Missing or invalid model or request parameters |
| 500 | Backend error during video generation |