LocalAI/swagger/docs.go
Ettore Di Giacinto e86ade54a6 feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp (#9654)
* feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp

Closes #1648.

OpenAI-style multipart endpoint that returns "who spoke when". Single
endpoint instead of the issue's three-endpoint sketch (refactor /vad,
/vad/embedding, /diarization) — the typical client wants one call, and
embeddings can land later as a sibling without breaking this surface.

Response shape borrows from Pyannote/Deepgram: segments carry a
normalised SPEAKER_NN id (zero-padded, stable across the response) plus
the raw backend label, optional per-segment text when the backend bundles
ASR, and a speakers summary in verbose_json. response_format also accepts
rttm so consumers can pipe straight into pyannote.metrics / dscore.
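
As an illustration of the first-seen remapping described above, here is a minimal sketch. The function name `normaliseSpeakers` is hypothetical, and starting the numbering at `SPEAKER_00` with two-digit padding is an assumption based on the "zero-padded SPEAKER_NN" wording; it is not the code from this commit.

```go
package main

import "fmt"

// normaliseSpeakers maps raw backend labels to SPEAKER_NN ids in
// first-seen order, so a given raw label always gets the same id
// within one response. (Hypothetical helper; numbering from 00 is
// an assumption.)
func normaliseSpeakers(raw []string) []string {
	ids := make(map[string]string, len(raw))
	out := make([]string, 0, len(raw))
	for _, label := range raw {
		id, seen := ids[label]
		if !seen {
			// The next free index is simply the number of labels seen so far.
			id = fmt.Sprintf("SPEAKER_%02d", len(ids))
			ids[label] = id
		}
		out = append(out, id)
	}
	return out
}

func main() {
	// Raw labels as emitted by the mock backend in the e2e suite.
	fmt.Println(normaliseSpeakers([]string{"5", "2", "5"}))
	// prints: [SPEAKER_00 SPEAKER_01 SPEAKER_00]
}
```

Keeping the raw label alongside the normalised id, as the response shape does, lets clients correlate segments with backend logs without depending on backend-specific naming.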

Backends:

* vibevoice-cpp — Diarize() reuses the existing vv_capi_asr pass.
  vibevoice's ASR prompt asks the model to emit
  [{Start,End,Speaker,Content}] natively, so diarization is a by-product
  of the same pass; include_text=true preserves the transcript per
  segment, otherwise we drop it.

* sherpa-onnx — wraps the upstream SherpaOnnxOfflineSpeakerDiarization
  C API (pyannote segmentation + speaker-embedding extractor + fast
  clustering). libsherpa-shim grew config builders, a SetClustering
  wrapper for per-call num_clusters/threshold overrides, and a
  segment_at accessor (purego can't read field arrays out of
  SherpaOnnxOfflineSpeakerDiarizationSegment[] directly).

Plumbing: new Diarize gRPC RPC + DiarizeRequest / DiarizeSegment /
DiarizeResponse messages, threaded through interface.go, base, server,
client, embed. Default Base impl returns unimplemented.

Capability surfaces all updated: FLAG_DIARIZATION usecase,
FeatureAudioDiarization permission (default-on), RouteFeatureRegistry
entries for /v1/audio/diarization and /audio/diarization, audio
instruction-def description widened, CAP_DIARIZATION JS symbol,
swagger regenerated, /api/instructions discovery map updated.

Tests:

* core/backend: speaker-label normalisation (first-seen → SPEAKER_NN,
  per-speaker totals, nil-safety, fallback to backend NumSpeakers when
  no segments).

* core/http/endpoints/openai: RTTM rendering (file-id basename, negative
  duration clamping, fallback id).

* tests/e2e: mock-backend grew a deterministic Diarize that emits
  raw labels "5","2","5" so the e2e suite verifies SPEAKER_NN
  remapping, verbose_json speakers summary + transcript pass-through
  (gated by include_text), RTTM bytes content-type, and rejection of
  unknown response_format. mock-diarize model config registered with
  known_usecases=[FLAG_DIARIZATION] to bypass the backend-name guard.
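
The RTTM behaviours listed above (file-id basename, negative duration clamping, fallback id) can be sketched roughly as follows. The `segment` type, the function names, the fallback id `audio`, and the three-decimal precision are all assumptions made for illustration; only the ten-field `SPEAKER` line layout comes from the RTTM format itself.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// segment is a hypothetical stand-in for one diarization segment.
type segment struct {
	Start, End float64
	Speaker    string
}

// fileID derives the RTTM file id from an uploaded filename: the
// basename without extension, or a fallback when nothing usable remains.
func fileID(name string) string {
	base := filepath.Base(name)
	base = strings.TrimSuffix(base, filepath.Ext(base))
	if base == "" || base == string(filepath.Separator) {
		return "audio" // assumed fallback id
	}
	return base
}

// rttmLine renders one SPEAKER record; a negative duration is clamped to 0.
func rttmLine(id string, s segment) string {
	dur := s.End - s.Start
	if dur < 0 {
		dur = 0
	}
	return fmt.Sprintf("SPEAKER %s 1 %.3f %.3f <NA> <NA> %s <NA> <NA>",
		id, s.Start, dur, s.Speaker)
}

func main() {
	id := fileID("/tmp/uploads/meeting.wav")
	fmt.Println(rttmLine(id, segment{Start: 1.5, End: 4.0, Speaker: "SPEAKER_00"}))
	// prints: SPEAKER meeting 1 1.500 2.500 <NA> <NA> SPEAKER_00 <NA> <NA>
}
```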

Docs: new features/audio-diarization.md (request/response, RTTM example,
sherpa-onnx + vibevoice setup), cross-link from audio-to-text.md, entry
in whats-new.md.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]

* fix(diarization): correct sherpa-onnx symbol name + lint cleanup

CI failures on #9654:

* sherpa-onnx-grpc-{tts,transcription} and sherpa-onnx-realtime panicked
  at backend startup with `undefined symbol: SherpaOnnxDestroyOfflineSpeakerDiarizationResult`.
  Upstream's actual symbol is SherpaOnnxOfflineSpeakerDiarizationDestroyResult
  (Destroy in the middle, not the prefix); the rest of the diarization
  surface follows the same naming pattern. The mismatched name made
  purego.RegisterLibFunc fail at dlopen time and crashed the gRPC server
  before the BeforeAll could probe Health, taking down every sherpa-onnx
  test job — not just the diarization-related ones.

* golangci-lint flagged 5 errcheck violations on new defer cleanups
  (os.RemoveAll / Close / conn.Close); wrap each in a `defer func() { _ = X() }()`
  closure (matches the pattern other LocalAI files use for new code, since
  pre-existing bare defers are grandfathered in via new-from-merge-base).

* golangci-lint also flagged forbidigo violations: the new
  diarization_test.go files used testing.T-style `t.Errorf` / `t.Fatalf`,
  which are forbidden by the project's coding-style policy
  (.agents/coding-style.md). Convert both files to Ginkgo/Gomega
  Describe/It with Expect(...) — they get picked up by the existing
  TestBackend / TestOpenAI suites, no new suite plumbing needed.

* modernize linter: tightened the diarization segment loop to
  `for i := range int(numSegments)` (Go 1.22+ idiom).

Verified locally: golangci-lint with new-from-merge-base=origin/master
reports 0 issues across all touched packages, and the four mocked
diarization e2e specs in tests/e2e/mock_backend_test.go still pass.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]

* fix(vibevoice-cpp): convert non-WAV input via ffmpeg + raise ASR token budget

Confirmed end-to-end against a real LocalAI instance with vibevoice-asr-q4_k
loaded and the multi-speaker MP3 sample at vibevoice.cpp/samples/2p_argument.mp3:
both /v1/audio/transcriptions and /v1/audio/diarization now succeed and
return correctly attributed speaker turns for the full clip.

Two latent issues surfaced once the diarization endpoint actually exercised
the backend with a non-trivial input (items 1 and 2); item 3 is a defensive
change made alongside them:

1. vv_capi_asr only accepts WAV via load_wav_24k_mono. The previous code
   passed the uploaded path straight through, so anything that wasn't
   already a 24 kHz mono s16le WAV failed at the C side with rc=-8 and
   the very unhelpful "vv_capi_asr failed". prepareWavInput shells out
   to ffmpeg ("-ar 24000 -ac 1 -acodec pcm_s16le") in a per-call temp
   dir, matching the rate the model was trained on; both AudioTranscription
   and Diarize now route through it. This is the same shape sherpa-onnx
   uses (utils.AudioToWav), but vibevoice needs 24 kHz rather than 16 kHz
   so we don't reuse that helper.

2. The C ABI's max_new_tokens defaults to 256 when 0 is passed. That's
   fine for a five-second clip but not for anything past ~10 s — vibevoice
   stops mid-JSON, the parse fails, and the caller sees a hard error.
   Pass a much larger budget (16 384 tokens ≈ 9 minutes of speech at the
   model's ~30 tok/s rate); generation stops at EOS so this is a cap
   rather than a target.

3. As a belt-and-braces defence, mirror AudioTranscription's existing
   "fall back to a single segment if the model emits non-JSON text"
   pattern in Diarize, so partial / unusual model output never produces
   a 500. This kept the endpoint usable while diagnosing (1) and (2),
   and is the right behaviour to keep.
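
The conversion in item 1 can be sketched along these lines. The helper names `ffmpegArgs` and `convertTo24kMonoWav` are hypothetical (the commit's own helper is `prepareWavInput`, whose body is not shown here), and the `-y` overwrite flag is an addition for the sketch; the `-ar 24000 -ac 1 -acodec pcm_s16le` flags are the ones quoted above.

```go
package main

import (
	"fmt"
	"os/exec"
)

// ffmpegArgs builds the argument list for a 24 kHz mono s16le WAV.
// "-y" (overwrite the output file) is an assumption of this sketch.
func ffmpegArgs(src, dst string) []string {
	return []string{
		"-y", "-i", src,
		"-ar", "24000", // sample rate the model expects
		"-ac", "1", // downmix to mono
		"-acodec", "pcm_s16le", // 16-bit little-endian PCM
		dst,
	}
}

// convertTo24kMonoWav shells out to ffmpeg and wraps any failure with
// the captured output so the caller sees why the conversion failed.
func convertTo24kMonoWav(src, dst string) error {
	out, err := exec.Command("ffmpeg", ffmpegArgs(src, dst)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("ffmpeg convert to 24k mono wav: %w: %s", err, out)
	}
	return nil
}

func main() {
	// Only print the command line; running it needs ffmpeg on PATH.
	fmt.Println(ffmpegArgs("in.mp3", "out.wav"))
}
```

On the token budget in item 2, the arithmetic is 16 384 tokens / 30 tok/s ≈ 546 s, or a little over 9 minutes.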

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]

* fix(vibevoice-cpp): pass valid WAVs through directly so ffmpeg is not required at runtime

Spotted by tests-e2e-backend (1.25.x): the previous fix forced every
incoming audio file through `ffmpeg -ar 24000 ...`, which meant the
backend container — which does not ship ffmpeg — failed even for the
existing happy path where the caller already uploads a WAV. The
container-side error was:

    rpc error: code = Unknown desc = vibevoice-cpp: ffmpeg convert to
    24k mono wav: exec: "ffmpeg": executable file not found in $PATH

Reading vibevoice.cpp's audio_io.cpp, `load_wav_24k_mono` uses drwav and
already accepts any PCM/IEEE-float WAV at any sample rate, downmixes
multi-channel input to mono, and resamples to 24 kHz internally. So the
only inputs that genuinely need an external converter are non-WAV
formats (MP3, OGG, FLAC, ...).

Detect WAVs by RIFF/WAVE magic at bytes 0..3 / 8..11 and pass them
straight through with a no-op cleanup; everything else still goes
through ffmpeg with the same 24 kHz mono s16le target. The result:

* Container builds without ffmpeg keep working for WAV uploads
  (the e2e-backends fixture is jfk.wav at 16 kHz mono s16le).
* MP3 and other non-WAV inputs still get the new ffmpeg conversion
  path so the diarization endpoint stays useful.
* If the caller uploads a non-WAV but ffmpeg isn't on PATH, the
  surfaced error is still descriptive enough to act on.
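
The RIFF/WAVE check described above amounts to comparing two four-byte tags in the first twelve bytes of the file. A minimal sketch, with the hypothetical name `isWAV` and operating on an already-read header rather than a file path:

```go
package main

import (
	"bytes"
	"fmt"
)

// isWAV reports whether header starts with a RIFF container tagged WAVE:
// "RIFF" at bytes 0..3 and "WAVE" at bytes 8..11. Bytes 4..7 hold the
// chunk size and are ignored.
func isWAV(header []byte) bool {
	return len(header) >= 12 &&
		bytes.Equal(header[0:4], []byte("RIFF")) &&
		bytes.Equal(header[8:12], []byte("WAVE"))
}

func main() {
	wav := []byte("RIFF\x24\x00\x00\x00WAVEfmt ")
	mp3 := []byte("ID3\x04\x00\x00\x00\x00\x00\x00\x00\x00")
	fmt.Println(isWAV(wav), isWAV(mp3))
	// prints: true false
}
```

A caller would read the first twelve bytes of the upload, pass WAVs straight through, and send everything else down the ffmpeg path.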

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]

* fix(ci): make gcc-14 install in Dockerfile.golang best-effort for jammy bases

The LocalVQE PR (bb033b16) made `gcc-14 g++-14` an unconditional apt
install in backend/Dockerfile.golang and pointed update-alternatives at
them. That works on the default `BASE_IMAGE=ubuntu:24.04` (noble has
gcc-14 in main), but every Go backend that builds on
`nvcr.io/nvidia/l4t-jetpack:r36.4.0` — jammy under the hood — now fails
at the apt step:

    E: Unable to locate package gcc-14

This blocked unrelated jobs:
backend-jobs(*-nvidia-l4t-arm64-{stablediffusion-ggml, sam3-cpp, whisper,
acestep-cpp, qwen3-tts-cpp, vibevoice-cpp}). LocalVQE itself is only
matrix-built on ubuntu:24.04 (CPU + Vulkan), so it doesn't actually
need gcc-14 anywhere else.

Make the gcc-14 install conditional on the package being available in
the configured apt repos. On noble: identical behaviour to today (gcc-14
installed, update-alternatives points at it). On jammy: skip the
gcc-14 stanza entirely and let build-essential's default gcc take over,
which is what the other Go backends compile with anyway.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-05 15:10:13 +02:00


// Package swagger Code generated by swaggo/swag. DO NOT EDIT
package swagger
import "github.com/swaggo/swag"
const docTemplate = `{
"schemes": {{ marshal .Schemes }},
"swagger": "2.0",
"info": {
"description": "{{escape .Description}}",
"title": "{{.Title}}",
"contact": {
"name": "LocalAI",
"url": "https://localai.io"
},
"license": {
"name": "MIT",
"url": "https://raw.githubusercontent.com/mudler/LocalAI/master/LICENSE"
},
"version": "{{.Version}}"
},
"host": "{{.Host}}",
"basePath": "{{.BasePath}}",
"paths": {
"/api/agent/jobs": {
"get": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "List agent jobs",
"parameters": [
{
"type": "string",
"description": "Filter by task ID",
"name": "task_id",
"in": "query"
},
{
"type": "string",
"description": "Filter by status (pending, running, completed, failed, cancelled)",
"name": "status",
"in": "query"
},
{
"type": "integer",
"description": "Max number of jobs to return",
"name": "limit",
"in": "query"
},
{
"type": "string",
"description": "Set to 'true' for admin cross-user listing",
"name": "all_users",
"in": "query"
}
],
"responses": {
"200": {
"description": "jobs",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Job"
}
}
}
}
}
},
"/api/agent/jobs/execute": {
"post": {
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Execute an agent job",
"parameters": [
{
"description": "Job execution request",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.JobExecutionRequest"
}
}
],
"responses": {
"201": {
"description": "job created",
"schema": {
"$ref": "#/definitions/schema.JobExecutionResponse"
}
},
"400": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/agent/jobs/{id}": {
"get": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Get an agent job",
"parameters": [
{
"type": "string",
"description": "Job ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "job",
"schema": {
"$ref": "#/definitions/schema.Job"
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
},
"delete": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Delete an agent job",
"parameters": [
{
"type": "string",
"description": "Job ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "message",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/agent/jobs/{id}/cancel": {
"post": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Cancel an agent job",
"parameters": [
{
"type": "string",
"description": "Job ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "message",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"400": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/agent/tasks": {
"get": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "List agent tasks",
"parameters": [
{
"type": "string",
"description": "Set to 'true' for admin cross-user listing",
"name": "all_users",
"in": "query"
}
],
"responses": {
"200": {
"description": "tasks",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Task"
}
}
}
}
},
"post": {
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Create a new agent task",
"parameters": [
{
"description": "Task definition",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.Task"
}
}
],
"responses": {
"201": {
"description": "id",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"400": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/agent/tasks/{id}": {
"get": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Get an agent task",
"parameters": [
{
"type": "string",
"description": "Task ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "task",
"schema": {
"$ref": "#/definitions/schema.Task"
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
},
"put": {
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Update an agent task",
"parameters": [
{
"type": "string",
"description": "Task ID",
"name": "id",
"in": "path",
"required": true
},
{
"description": "Updated task definition",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.Task"
}
}
],
"responses": {
"200": {
"description": "message",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"400": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
},
"delete": {
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Delete an agent task",
"parameters": [
{
"type": "string",
"description": "Task ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "message",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/agent/tasks/{name}/execute": {
"post": {
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"agent-jobs"
],
"summary": "Execute an agent task by name",
"parameters": [
{
"type": "string",
"description": "Task name",
"name": "name",
"in": "path",
"required": true
},
{
"description": "Optional template parameters",
"name": "parameters",
"in": "body",
"schema": {
"type": "object"
}
}
],
"responses": {
"201": {
"description": "job created",
"schema": {
"$ref": "#/definitions/schema.JobExecutionResponse"
}
},
"400": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"404": {
"description": "error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/backend-logs": {
"get": {
"description": "Returns a sorted list of model IDs that have captured backend process output",
"produces": [
"application/json"
],
"tags": [
"monitoring"
],
"summary": "List models with backend logs",
"responses": {
"200": {
"description": "Model IDs with logs",
"schema": {
"type": "array",
"items": {
"type": "string"
}
}
}
}
}
},
"/api/backend-logs/{modelId}": {
"get": {
"description": "Returns all captured log lines (stdout/stderr) for the specified model's backend process",
"produces": [
"application/json"
],
"tags": [
"monitoring"
],
"summary": "Get backend logs for a model",
"parameters": [
{
"type": "string",
"description": "Model ID",
"name": "modelId",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Log lines",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/model.BackendLogLine"
}
}
}
}
}
},
"/api/backend-logs/{modelId}/clear": {
"post": {
"description": "Removes all captured log lines for the specified model's backend process",
"tags": [
"monitoring"
],
"summary": "Clear backend logs for a model",
"parameters": [
{
"type": "string",
"description": "Model ID",
"name": "modelId",
"in": "path",
"required": true
}
],
"responses": {
"204": {
"description": "Logs cleared"
}
}
}
},
"/api/backend-traces": {
"get": {
"description": "Returns captured backend traces (LLM calls, embeddings, TTS, etc.) in reverse chronological order",
"produces": [
"application/json"
],
"tags": [
"monitoring"
],
"summary": "List backend operation traces",
"responses": {
"200": {
"description": "Backend operation traces",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/backend-traces/clear": {
"post": {
"description": "Removes all captured backend operation traces from the buffer",
"tags": [
"monitoring"
],
"summary": "Clear backend traces",
"responses": {
"204": {
"description": "Traces cleared"
}
}
}
},
"/api/branding": {
"get": {
"description": "Returns the configured instance name, tagline, and asset URLs. Public — no authentication required.",
"produces": [
"application/json"
],
"tags": [
"branding"
],
"summary": "Get instance branding",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/localai.BrandingResponse"
}
}
}
}
},
"/api/branding/asset/{kind}": {
"post": {
"description": "Upload a custom logo, horizontal logo, or favicon. The file replaces any previous override for that kind.",
"consumes": [
"multipart/form-data"
],
"produces": [
"application/json"
],
"tags": [
"branding"
],
"summary": "Upload a branding asset",
"parameters": [
{
"type": "string",
"description": "Asset kind: logo, logo_horizontal, or favicon",
"name": "kind",
"in": "path",
"required": true
},
{
"type": "file",
"description": "Image file (png, jpeg, svg, webp, ico — up to 5MiB)",
"name": "file",
"in": "formData",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/localai.BrandingResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
},
"delete": {
"description": "Remove a custom branding asset; the UI falls back to the bundled LocalAI default.",
"produces": [
"application/json"
],
"tags": [
"branding"
],
"summary": "Reset a branding asset to default",
"parameters": [
{
"type": "string",
"description": "Asset kind: logo, logo_horizontal, or favicon",
"name": "kind",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/localai.BrandingResponse"
}
}
}
}
},
"/api/instructions": {
"get": {
"description": "Returns a compact list of instruction areas with descriptions and URLs for detailed guides",
"produces": [
"application/json"
],
"tags": [
"instructions"
],
"summary": "List available API instruction areas",
"responses": {
"200": {
"description": "instructions list with hint",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/instructions/{name}": {
"get": {
"description": "Returns a markdown guide (default) or filtered OpenAPI fragment (format=json) for a named instruction",
"produces": [
"application/json",
"text/markdown"
],
"tags": [
"instructions"
],
"summary": "Get an instruction's API guide or OpenAPI fragment",
"parameters": [
{
"type": "string",
"description": "Instruction name (e.g. chat-inference, config-management)",
"name": "name",
"in": "path",
"required": true
},
{
"type": "string",
"description": "Response format: json for OpenAPI fragment, omit for markdown",
"name": "format",
"in": "query"
}
],
"responses": {
"200": {
"description": "instruction documentation",
"schema": {
"$ref": "#/definitions/localai.APIInstructionResponse"
}
},
"404": {
"description": "instruction not found",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/api/models/config-json/{name}": {
"patch": {
"description": "Deep-merges the JSON patch body into the existing model config",
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"config"
],
"summary": "Partially update a model configuration",
"parameters": [
{
"type": "string",
"description": "Model name",
"name": "name",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "success message",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/models/config-metadata": {
"get": {
"description": "Returns config field metadata. Use ?section=\u003cid\u003e to filter by section, or omit for a section index.",
"produces": [
"application/json"
],
"tags": [
"config"
],
"summary": "List model configuration field metadata",
"parameters": [
{
"type": "string",
"description": "Section ID to filter (e.g. 'general', 'llm', 'parameters') or 'all' for everything",
"name": "section",
"in": "query"
}
],
"responses": {
"200": {
"description": "Section index or filtered field metadata",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/models/config-metadata/autocomplete/{provider}": {
"get": {
"description": "Returns runtime-resolved values for dynamic providers (backends, models)",
"produces": [
"application/json"
],
"tags": [
"config"
],
"summary": "Get dynamic autocomplete values for a config field",
"parameters": [
{
"type": "string",
"description": "Provider name (backends, models, models:chat, models:tts, models:transcript, models:vad)",
"name": "provider",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "values array",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/models/toggle-pinned/{name}/{action}": {
"put": {
"description": "Pin or unpin a model. Pinned models stay loaded and are excluded from automatic eviction.",
"tags": [
"config"
],
"summary": "Toggle model pinned status",
"parameters": [
{
"type": "string",
"description": "Model name",
"name": "name",
"in": "path",
"required": true
},
{
"type": "string",
"description": "Action: 'pin' or 'unpin'",
"name": "action",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
},
"404": {
"description": "Not Found",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
}
}
}
},
"/api/models/vram-estimate": {
"post": {
"description": "Estimates VRAM based on model weight files, context size, and GPU layers",
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"config"
],
"summary": "Estimate VRAM usage for a model",
"parameters": [
{
"description": "VRAM estimation parameters",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/modeladmin.VRAMRequest"
}
}
],
"responses": {
"200": {
"description": "VRAM estimate",
"schema": {
"$ref": "#/definitions/modeladmin.VRAMResponse"
}
}
}
}
},
"/api/models/{name}/{action}": {
"put": {
"description": "Enable or disable a model from being loaded on demand. Disabled models remain installed but cannot be loaded.",
"tags": [
"config"
],
"summary": "Toggle model enabled/disabled status",
"parameters": [
{
"type": "string",
"description": "Model name",
"name": "name",
"in": "path",
"required": true
},
{
"type": "string",
"description": "Action: 'enable' or 'disable'",
"name": "action",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
},
"404": {
"description": "Not Found",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/localai.ModelResponse"
}
}
}
}
},
"/api/nodes/{id}/max-replicas-per-model": {
"put": {
"tags": [
"Nodes"
],
"summary": "Update a node's max replicas per model",
"parameters": [
{
"type": "string",
"description": "Node ID",
"name": "id",
"in": "path",
"required": true
},
{
"description": "New value",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/localai.UpdateMaxReplicasPerModelRequest"
}
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "object",
"additionalProperties": {
"type": "integer"
}
}
},
"400": {
"description": "value must be \u003e= 1",
"schema": {
"type": "object",
"additionalProperties": true
}
},
"404": {
"description": "node not found",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
},
"delete": {
"tags": [
"Nodes"
],
"summary": "Reset a node's max replicas per model to the worker default",
"parameters": [
{
"type": "string",
"description": "Node ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "object",
"additionalProperties": {
"type": "boolean"
}
}
},
"404": {
"description": "node not found",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/p2p": {
"get": {
"tags": [
"p2p"
],
"summary": "Returns available P2P nodes",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.P2PNodesResponse"
}
}
}
}
}
},
"/api/p2p/token": {
"get": {
"tags": [
"p2p"
],
"summary": "Show the P2P token",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "string"
}
}
}
}
},
"/api/traces": {
"get": {
"description": "Returns captured API exchange traces (request/response pairs) in reverse chronological order",
"produces": [
"application/json"
],
"tags": [
"monitoring"
],
"summary": "List API request/response traces",
"responses": {
"200": {
"description": "Traced API exchanges",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/api/traces/clear": {
"post": {
"description": "Removes all captured API request/response traces from the buffer",
"tags": [
"monitoring"
],
"summary": "Clear API traces",
"responses": {
"204": {
"description": "Traces cleared"
}
}
}
},
"/audio/transform": {
"post": {
"description": "Runs an audio-in / audio-out transform conditioned on an optional auxiliary reference signal. Concrete transforms include AEC + noise suppression + dereverberation (LocalVQE), voice conversion (reference = target speaker), and pitch shifting. The backend determines the operation; pass model-specific tuning via repeated ` + "`" + `params[\u003ckey\u003e]=\u003cvalue\u003e` + "`" + ` form fields.",
"consumes": [
"multipart/form-data"
],
"produces": [
"audio/x-wav"
],
"tags": [
"audio"
],
"summary": "Transform audio (echo cancellation, noise suppression, voice conversion, etc.)",
"parameters": [
{
"type": "string",
"description": "model",
"name": "model",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "primary input audio file",
"name": "audio",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "auxiliary reference audio (loopback for AEC, target voice for conversion, etc.)",
"name": "reference",
"in": "formData"
},
{
"type": "string",
"description": "wav | mp3 | ogg | flac",
"name": "response_format",
"in": "formData"
},
{
"type": "integer",
"description": "desired output sample rate",
"name": "sample_rate",
"in": "formData"
}
],
"responses": {
"200": {
"description": "transformed audio file",
"schema": {
"type": "string"
}
}
}
}
},
"/audio/transformations": {
"post": {
"description": "Runs an audio-in / audio-out transform conditioned on an optional auxiliary reference signal. Concrete transforms include AEC + noise suppression + dereverberation (LocalVQE), voice conversion (reference = target speaker), and pitch shifting. The backend determines the operation; pass model-specific tuning via repeated ` + "`" + `params[\u003ckey\u003e]=\u003cvalue\u003e` + "`" + ` form fields.",
"consumes": [
"multipart/form-data"
],
"produces": [
"audio/x-wav"
],
"tags": [
"audio"
],
"summary": "Transform audio (echo cancellation, noise suppression, voice conversion, etc.)",
"parameters": [
{
"type": "string",
"description": "model",
"name": "model",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "primary input audio file",
"name": "audio",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "auxiliary reference audio (loopback for AEC, target voice for conversion, etc.)",
"name": "reference",
"in": "formData"
},
{
"type": "string",
"description": "wav | mp3 | ogg | flac",
"name": "response_format",
"in": "formData"
},
{
"type": "integer",
"description": "desired output sample rate",
"name": "sample_rate",
"in": "formData"
}
],
"responses": {
"200": {
"description": "transformed audio file",
"schema": {
"type": "string"
}
}
}
}
},
"/audio/transformations/stream": {
"get": {
"description": "Streams binary PCM frames in (interleaved stereo: ch0=audio, ch1=reference) and out (mono). The first message must be a JSON ` + "`" + `session.update` + "`" + ` envelope describing model + sample format + frame size + backend params. Server emits binary PCM on the same cadence.",
"tags": [
"audio"
],
"summary": "Bidirectional realtime audio transform over WebSocket.",
"responses": {}
}
},
"/backend/monitor": {
"get": {
"tags": [
"monitoring"
],
"summary": "Backend monitor endpoint",
"parameters": [
{
"type": "string",
"description": "Name of the model to monitor",
"name": "model",
"in": "query",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/proto.StatusResponse"
}
}
}
}
},
"/backend/shutdown": {
"post": {
"tags": [
"monitoring"
],
"summary": "Backend shutdown endpoint",
"parameters": [
{
"description": "Backend statistics request",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.BackendMonitorRequest"
}
}
],
"responses": {}
}
},
"/backends": {
"get": {
"tags": [
"backends"
],
"summary": "List all Backends",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.GalleryBackend"
}
}
}
}
}
},
"/backends/apply": {
"post": {
"tags": [
"backends"
],
"summary": "Install backends to LocalAI.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/localai.GalleryBackend"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/available": {
"get": {
"tags": [
"backends"
],
"summary": "List all available Backends",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.GalleryBackend"
}
}
}
}
}
},
"/backends/delete/{name}": {
"post": {
"tags": [
"backends"
],
"summary": "Delete backends from LocalAI.",
"parameters": [
{
"type": "string",
"description": "Backend name",
"name": "name",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/galleries": {
"get": {
"tags": [
"backends"
],
"summary": "List all Galleries",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
}
},
"/backends/jobs": {
"get": {
"tags": [
"backends"
],
"summary": "Returns the status and progress of all jobs",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"$ref": "#/definitions/galleryop.OpStatus"
}
}
}
}
}
},
"/backends/jobs/{uuid}": {
"get": {
"tags": [
"backends"
],
"summary": "Returns the job status",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/galleryop.OpStatus"
}
}
}
}
},
"/backends/known": {
"get": {
"tags": [
"backends"
],
"summary": "List all known Backends (importer registry + curated pref-only + installed-on-disk)",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.KnownBackend"
}
}
}
}
}
},
"/backends/upgrade/{name}": {
"post": {
"tags": [
"backends"
],
"summary": "Upgrade a backend",
"parameters": [
{
"type": "string",
"description": "Backend name",
"name": "name",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/upgrades": {
"get": {
"tags": [
"backends"
],
"summary": "Get available backend upgrades",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"$ref": "#/definitions/gallery.UpgradeInfo"
}
}
}
}
}
},
"/backends/upgrades/check": {
"post": {
"tags": [
"backends"
],
"summary": "Force backend upgrade check",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"$ref": "#/definitions/gallery.UpgradeInfo"
}
}
}
}
}
},
"/branding/asset/{kind}": {
"get": {
"description": "Serves the admin-uploaded logo, horizontal logo, or favicon. 404 when no override is set.",
"produces": [
"image/*"
],
"tags": [
"branding"
],
"summary": "Serve a custom branding asset",
"parameters": [
{
"type": "string",
"description": "Asset kind: logo, logo_horizontal, or favicon",
"name": "kind",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK"
},
"404": {
"description": "Not Found"
}
}
}
},
"/metrics": {
"get": {
"produces": [
"text/plain"
],
"tags": [
"monitoring"
],
"summary": "Prometheus metrics endpoint",
"responses": {
"200": {
"description": "Prometheus metrics",
"schema": {
"type": "string"
}
}
}
}
},
"/models/apply": {
"post": {
"tags": [
"models"
],
"summary": "Install models to LocalAI.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/localai.GalleryModel"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.GalleryResponse"
}
}
}
}
},
"/models/available": {
"get": {
"tags": [
"models"
],
"summary": "List installable models.",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.Metadata"
}
}
}
}
}
},
"/models/delete/{name}": {
"post": {
"tags": [
"models"
],
"summary": "Delete models from LocalAI.",
"parameters": [
{
"type": "string",
"description": "Model name",
"name": "name",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.GalleryResponse"
}
}
}
}
},
"/models/galleries": {
"get": {
"tags": [
"models"
],
"summary": "List all Galleries",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
}
},
"/models/jobs": {
"get": {
"tags": [
"models"
],
"summary": "Returns the status and progress of all jobs",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"$ref": "#/definitions/galleryop.OpStatus"
}
}
}
}
}
},
"/models/jobs/{uuid}": {
"get": {
"tags": [
"models"
],
"summary": "Returns the job status",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/galleryop.OpStatus"
}
}
}
}
},
"/system": {
"get": {
"tags": [
"monitoring"
],
"summary": "Show the LocalAI instance information",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.SystemInformationResponse"
}
}
}
}
},
"/tokenMetrics": {
"get": {
"consumes": [
"application/json"
],
"produces": [
"audio/x-wav"
],
"tags": [
"tokenize"
],
"summary": "Get TokenMetrics for Active Slot.",
"responses": {
"200": {
"description": "token metrics for the active slot",
"schema": {
"type": "string"
}
}
}
}
},
"/tts": {
"post": {
"consumes": [
"application/json"
],
"produces": [
"audio/x-wav"
],
"tags": [
"audio"
],
"summary": "Generates audio from the input text.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.TTSRequest"
}
}
],
"responses": {
"200": {
"description": "generated audio/wav file",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/audio/diarization": {
"post": {
"consumes": [
"multipart/form-data"
],
"tags": [
"audio"
],
"summary": "Identify speakers in audio (who spoke when).",
"parameters": [
{
"type": "string",
"description": "model",
"name": "model",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "audio file",
"name": "file",
"in": "formData",
"required": true
},
{
"type": "integer",
"description": "exact speaker count (\u003e0 forces; 0 = auto)",
"name": "num_speakers",
"in": "formData"
},
{
"type": "integer",
"description": "lower bound when auto-detecting",
"name": "min_speakers",
"in": "formData"
},
{
"type": "integer",
"description": "upper bound when auto-detecting",
"name": "max_speakers",
"in": "formData"
},
{
"type": "number",
"description": "clustering distance threshold when num_speakers is unknown",
"name": "clustering_threshold",
"in": "formData"
},
{
"type": "number",
"description": "discard segments shorter than this (seconds)",
"name": "min_duration_on",
"in": "formData"
},
{
"type": "number",
"description": "merge gaps shorter than this (seconds)",
"name": "min_duration_off",
"in": "formData"
},
{
"type": "string",
"description": "audio language hint (only meaningful for backends that bundle ASR)",
"name": "language",
"in": "formData"
},
{
"type": "boolean",
"description": "include per-segment transcript when the backend supports it",
"name": "include_text",
"in": "formData"
},
{
"type": "string",
"description": "json (default), verbose_json, or rttm",
"name": "response_format",
"in": "formData"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/schema.DiarizationResult"
}
}
}
}
},
"/v1/audio/speech": {
"post": {
"consumes": [
"application/json"
],
"produces": [
"audio/x-wav"
],
"tags": [
"audio"
],
"summary": "Generates audio from the input text.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.TTSRequest"
}
}
],
"responses": {
"200": {
"description": "generated audio/wav file",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/audio/transcriptions": {
"post": {
"consumes": [
"multipart/form-data"
],
"tags": [
"audio"
],
"summary": "Transcribes audio into the input language.",
"parameters": [
{
"type": "string",
"description": "model",
"name": "model",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "file",
"name": "file",
"in": "formData",
"required": true
},
{
"type": "number",
"description": "sampling temperature",
"name": "temperature",
"in": "formData"
},
{
"type": "array",
"items": {
"type": "string"
},
"collectionFormat": "csv",
"description": "timestamp granularities (word, segment)",
"name": "timestamp_granularities",
"in": "formData"
},
{
"type": "boolean",
"description": "stream partial results as SSE",
"name": "stream",
"in": "formData"
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/v1/chat/completions": {
"post": {
"tags": [
"inference"
],
"summary": "Generate chat completions for a given prompt and model.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenAIRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/v1/completions": {
"post": {
"tags": [
"inference"
],
"summary": "Generate completions for a given prompt and model.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenAIRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/v1/detection": {
"post": {
"tags": [
"detection"
],
"summary": "Detects objects in the input image.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.DetectionRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.DetectionResponse"
}
}
}
}
},
"/v1/edits": {
"post": {
"tags": [
"inference"
],
"summary": "OpenAI edit endpoint",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenAIRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/v1/embeddings": {
"post": {
"tags": [
"embeddings"
],
"summary": "Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenAIRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/v1/face/analyze": {
"post": {
"tags": [
"face-recognition"
],
"summary": "Analyze demographic attributes (age, gender, ...) of faces.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.FaceAnalyzeRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.FaceAnalyzeResponse"
}
}
}
}
},
"/v1/face/embed": {
"post": {
"tags": [
"face-recognition"
],
"summary": "Extract a face embedding from an image.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.FaceEmbedRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.FaceEmbedResponse"
}
}
}
}
},
"/v1/face/forget": {
"post": {
"tags": [
"face-recognition"
],
"summary": "Remove a previously-registered face by ID.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.FaceForgetRequest"
}
}
],
"responses": {
"204": {
"description": "No Content"
}
}
}
},
"/v1/face/identify": {
"post": {
"tags": [
"face-recognition"
],
"summary": "Identify a face against the registered database (1:N recognition).",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.FaceIdentifyRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.FaceIdentifyResponse"
}
}
}
}
},
"/v1/face/register": {
"post": {
"tags": [
"face-recognition"
],
"summary": "Register a face for 1:N identification.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.FaceRegisterRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.FaceRegisterResponse"
}
}
}
}
},
"/v1/face/verify": {
"post": {
"tags": [
"face-recognition"
],
"summary": "Verify that two images depict the same person.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.FaceVerifyRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.FaceVerifyResponse"
}
}
}
}
},
"/v1/images/generations": {
"post": {
"tags": [
"images"
],
"summary": "Creates an image given a prompt.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenAIRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/v1/images/inpainting": {
"post": {
"description": "Perform image inpainting. Accepts multipart/form-data with ` + "`" + `image` + "`" + ` and ` + "`" + `mask` + "`" + ` files.",
"consumes": [
"multipart/form-data"
],
"produces": [
"application/json"
],
"tags": [
"images"
],
"summary": "Image inpainting",
"parameters": [
{
"type": "string",
"description": "Model identifier",
"name": "model",
"in": "formData",
"required": true
},
{
"type": "string",
"description": "Text prompt guiding the generation",
"name": "prompt",
"in": "formData",
"required": true
},
{
"type": "integer",
"description": "Number of inference steps (default 25)",
"name": "steps",
"in": "formData"
},
{
"type": "file",
"description": "Original image file",
"name": "image",
"in": "formData",
"required": true
},
{
"type": "file",
"description": "Mask image file (white = area to inpaint)",
"name": "mask",
"in": "formData",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
}
},
"/v1/mcp/chat/completions": {
"post": {
"tags": [
"mcp"
],
"summary": "MCP chat completions with automatic tool execution",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenAIRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/v1/messages": {
"post": {
"tags": [
"inference"
],
"summary": "Generate a message response for the given messages and model.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.AnthropicRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.AnthropicResponse"
}
}
}
}
},
"/v1/models": {
"get": {
"tags": [
"models"
],
"summary": "List and describe the various models available in the API.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.ModelsDataResponse"
}
}
}
}
},
"/v1/rerank": {
"post": {
"tags": [
"rerank"
],
"summary": "Reranks a list of phrases by relevance to a given text query.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.JINARerankRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.JINARerankResponse"
}
}
}
}
},
"/v1/responses": {
"post": {
"tags": [
"inference"
],
"summary": "Create a response using the Open Responses API",
"parameters": [
{
"description": "Request body",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.OpenResponsesRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.ORResponseResource"
}
}
}
}
},
"/v1/responses/{id}": {
"get": {
"description": "Retrieve a response by ID. Can be used for polling background responses or resuming streaming responses.",
"tags": [
"inference"
],
"summary": "Get a response by ID",
"parameters": [
{
"type": "string",
"description": "Response ID",
"name": "id",
"in": "path",
"required": true
},
{
"type": "string",
"description": "Set to 'true' to resume streaming",
"name": "stream",
"in": "query"
},
{
"type": "integer",
"description": "Sequence number to resume from (for streaming)",
"name": "starting_after",
"in": "query"
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.ORResponseResource"
}
},
"400": {
"description": "Bad Request",
"schema": {
"type": "object",
"additionalProperties": true
}
},
"404": {
"description": "Not Found",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/v1/responses/{id}/cancel": {
"post": {
"description": "Cancel a background response if it's still in progress",
"tags": [
"inference"
],
"summary": "Cancel a response",
"parameters": [
{
"type": "string",
"description": "Response ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.ORResponseResource"
}
},
"400": {
"description": "Bad Request",
"schema": {
"type": "object",
"additionalProperties": true
}
},
"404": {
"description": "Not Found",
"schema": {
"type": "object",
"additionalProperties": true
}
}
}
}
},
"/v1/sound-generation": {
"post": {
"tags": [
"audio"
],
"summary": "Generates audio from the input text.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.ElevenLabsSoundGenerationRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/text-to-speech/{voice-id}": {
"post": {
"tags": [
"audio"
],
"summary": "Generates audio from the input text.",
"parameters": [
{
"type": "string",
"description": "Voice ID",
"name": "voice-id",
"in": "path",
"required": true
},
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.TTSRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/tokenMetrics": {
"get": {
"consumes": [
"application/json"
],
"produces": [
"audio/x-wav"
],
"tags": [
"tokenize"
],
"summary": "Get TokenMetrics for Active Slot.",
"responses": {
"200": {
"description": "token metrics for the active slot",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/tokenize": {
"post": {
"tags": [
"tokenize"
],
"summary": "Tokenize the input.",
"parameters": [
{
"description": "Request",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.TokenizeRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.TokenizeResponse"
}
}
}
}
},
"/v1/voice/analyze": {
"post": {
"tags": [
"voice-recognition"
],
"summary": "Analyze demographic attributes (age, gender, emotion) from a voice clip.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VoiceAnalyzeRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.VoiceAnalyzeResponse"
}
}
}
}
},
"/v1/voice/embed": {
"post": {
"tags": [
"voice-recognition"
],
"summary": "Extract a speaker embedding from an audio clip.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VoiceEmbedRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.VoiceEmbedResponse"
}
}
}
}
},
"/v1/voice/forget": {
"post": {
"tags": [
"voice-recognition"
],
"summary": "Remove a previously-registered speaker by ID.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VoiceForgetRequest"
}
}
],
"responses": {
"204": {
"description": "No Content"
}
}
}
},
"/v1/voice/identify": {
"post": {
"tags": [
"voice-recognition"
],
"summary": "Identify a speaker against the registered database (1:N recognition).",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VoiceIdentifyRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.VoiceIdentifyResponse"
}
}
}
}
},
"/v1/voice/register": {
"post": {
"tags": [
"voice-recognition"
],
"summary": "Register a speaker for 1:N identification.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VoiceRegisterRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.VoiceRegisterResponse"
}
}
}
}
},
"/v1/voice/verify": {
"post": {
"tags": [
"voice-recognition"
],
"summary": "Verify that two audio clips were spoken by the same person.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VoiceVerifyRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.VoiceVerifyResponse"
}
}
}
}
},
"/vad": {
"post": {
"consumes": [
"application/json"
],
"tags": [
"audio"
],
"summary": "Detect voice fragments in an audio stream",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VADRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/proto.VADResponse"
}
}
}
}
},
"/video": {
"post": {
"tags": [
"video"
],
"summary": "Creates a video given a prompt.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.VideoRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.OpenAIResponse"
}
}
}
}
},
"/ws/backend-logs/{modelId}": {
"get": {
"description": "Opens a WebSocket connection for real-time backend log streaming. Sends an initial batch of existing lines (type \"initial\"), then streams new lines as they appear (type \"line\"). Supports ping/pong keepalive.",
"tags": [
"monitoring"
],
"summary": "Stream backend logs via WebSocket",
"parameters": [
{
"type": "string",
"description": "Model ID",
"name": "modelId",
"in": "path",
"required": true
}
],
"responses": {}
}
}
},
"definitions": {
"config.Gallery": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"url": {
"type": "string"
}
}
},
"functions.Function": {
"type": "object",
"properties": {
"description": {
"type": "string"
},
"name": {
"type": "string"
},
"parameters": {
"type": "object",
"additionalProperties": {}
},
"strict": {
"type": "boolean"
}
}
},
"functions.Item": {
"type": "object",
"properties": {
"properties": {
"type": "object",
"additionalProperties": {}
},
"type": {
"type": "string"
}
}
},
"functions.JSONFunctionStructure": {
"type": "object",
"properties": {
"$defs": {
"type": "object",
"additionalProperties": {}
},
"anyOf": {
"type": "array",
"items": {
"$ref": "#/definitions/functions.Item"
}
},
"oneOf": {
"type": "array",
"items": {
"$ref": "#/definitions/functions.Item"
}
}
}
},
"functions.Tool": {
"type": "object",
"properties": {
"function": {
"$ref": "#/definitions/functions.Function"
},
"type": {
"type": "string"
}
}
},
"gallery.File": {
"type": "object",
"properties": {
"filename": {
"type": "string"
},
"sha256": {
"type": "string"
},
"uri": {
"type": "string"
}
}
},
"gallery.GalleryBackend": {
"type": "object",
"properties": {
"alias": {
"type": "string"
},
"backend": {
"description": "Backend is the resolved backend engine for this model (e.g. \"llama-cpp\").\nPopulated at load time from overrides, inline config, or the URL-referenced config file.",
"type": "string"
},
"capabilities": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"description": {
"type": "string"
},
"files": {
"description": "AdditionalFiles are used to add additional files to the model",
"type": "array",
"items": {
"$ref": "#/definitions/gallery.File"
}
},
"gallery": {
"description": "Gallery is a reference to the gallery which contains the model",
"allOf": [
{
"$ref": "#/definitions/config.Gallery"
}
]
},
"icon": {
"type": "string"
},
"installed": {
"description": "Installed is used to indicate if the model is installed or not",
"type": "boolean"
},
"license": {
"type": "string"
},
"mirrors": {
"type": "array",
"items": {
"type": "string"
}
},
"name": {
"type": "string"
},
"size": {
"description": "Size is an optional hardcoded model size string (e.g. \"500MB\", \"14.5GB\").\nUsed when the size cannot be estimated automatically.",
"type": "string"
},
"tags": {
"type": "array",
"items": {
"type": "string"
}
},
"uri": {
"type": "string"
},
"url": {
"type": "string"
},
"urls": {
"type": "array",
"items": {
"type": "string"
}
},
"version": {
"type": "string"
}
}
},
"gallery.Metadata": {
"type": "object",
"properties": {
"backend": {
"description": "Backend is the resolved backend engine for this model (e.g. \"llama-cpp\").\nPopulated at load time from overrides, inline config, or the URL-referenced config file.",
"type": "string"
},
"description": {
"type": "string"
},
"files": {
"description": "AdditionalFiles are used to add additional files to the model",
"type": "array",
"items": {
"$ref": "#/definitions/gallery.File"
}
},
"gallery": {
"description": "Gallery is a reference to the gallery which contains the model",
"allOf": [
{
"$ref": "#/definitions/config.Gallery"
}
]
},
"icon": {
"type": "string"
},
"installed": {
"description": "Installed is used to indicate if the model is installed or not",
"type": "boolean"
},
"license": {
"type": "string"
},
"name": {
"type": "string"
},
"size": {
"description": "Size is an optional hardcoded model size string (e.g. \"500MB\", \"14.5GB\").\nUsed when the size cannot be estimated automatically.",
"type": "string"
},
"tags": {
"type": "array",
"items": {
"type": "string"
}
},
"url": {
"type": "string"
},
"urls": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"gallery.NodeDriftInfo": {
"type": "object",
"properties": {
"digest": {
"type": "string"
},
"node_id": {
"type": "string"
},
"node_name": {
"type": "string"
},
"version": {
"type": "string"
}
}
},
"gallery.UpgradeInfo": {
"type": "object",
"properties": {
"available_digest": {
"type": "string"
},
"available_version": {
"type": "string"
},
"backend_name": {
"type": "string"
},
"installed_digest": {
"type": "string"
},
"installed_version": {
"type": "string"
},
"node_drift": {
"description": "NodeDrift lists nodes whose installed version or digest differs from\nthe cluster majority. Non-empty means the cluster has diverged and an\nupgrade will realign it. Empty in single-node mode.",
"type": "array",
"items": {
"$ref": "#/definitions/gallery.NodeDriftInfo"
}
}
}
},
"galleryop.OpStatus": {
"type": "object",
"properties": {
"cancellable": {
"description": "Cancellable is true if the operation can be cancelled",
"type": "boolean"
},
"cancelled": {
"description": "Cancelled is true if the operation was cancelled",
"type": "boolean"
},
"deletion": {
"description": "Deletion is true if the operation is a deletion",
"type": "boolean"
},
"downloaded_size": {
"type": "string"
},
"error": {},
"file_name": {
"type": "string"
},
"file_size": {
"type": "string"
},
"gallery_element_name": {
"type": "string"
},
"message": {
"type": "string"
},
"processed": {
"type": "boolean"
},
"progress": {
"type": "number"
}
}
},
"localai.APIInstructionResponse": {
"type": "object",
"properties": {
"description": {
"type": "string"
},
"name": {
"type": "string"
},
"swagger_fragment": {
"type": "object",
"additionalProperties": {}
},
"tags": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"localai.BrandingResponse": {
"type": "object",
"properties": {
"favicon_url": {
"type": "string"
},
"instance_name": {
"type": "string"
},
"instance_tagline": {
"type": "string"
},
"logo_horizontal_url": {
"type": "string"
},
"logo_url": {
"type": "string"
}
}
},
"localai.GalleryBackend": {
"type": "object",
"properties": {
"id": {
"type": "string"
}
}
},
"localai.GalleryModel": {
"type": "object",
"properties": {
"backend": {
"description": "Backend is the resolved backend engine for this model (e.g. \"llama-cpp\").\nPopulated at load time from overrides, inline config, or the URL-referenced config file.",
"type": "string"
},
"config_file": {
"description": "config_file is used when URL is blank, in which case it serves as the base config.",
"type": "object",
"additionalProperties": {}
},
"description": {
"type": "string"
},
"files": {
"description": "AdditionalFiles are used to add additional files to the model",
"type": "array",
"items": {
"$ref": "#/definitions/gallery.File"
}
},
"gallery": {
"description": "Gallery is a reference to the gallery which contains the model",
"allOf": [
{
"$ref": "#/definitions/config.Gallery"
}
]
},
"icon": {
"type": "string"
},
"id": {
"type": "string"
},
"installed": {
"description": "Installed is used to indicate if the model is installed or not",
"type": "boolean"
},
"license": {
"type": "string"
},
"name": {
"type": "string"
},
"overrides": {
"description": "Overrides are used to override the configuration of the model located at URL",
"type": "object",
"additionalProperties": {}
},
"size": {
"description": "Size is an optional hardcoded model size string (e.g. \"500MB\", \"14.5GB\").\nUsed when the size cannot be estimated automatically.",
"type": "string"
},
"tags": {
"type": "array",
"items": {
"type": "string"
}
},
"url": {
"type": "string"
},
"urls": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"localai.ModelResponse": {
"type": "object",
"properties": {
"config": {},
"details": {
"type": "array",
"items": {
"type": "string"
}
},
"error": {
"type": "string"
},
"filename": {
"type": "string"
},
"message": {
"type": "string"
},
"success": {
"type": "boolean"
}
}
},
"localai.UpdateMaxReplicasPerModelRequest": {
"type": "object",
"properties": {
"value": {
"description": "Value is the new per-model replica cap on this node. Must be \u003e= 1.",
"type": "integer"
}
}
},
"model.BackendLogLine": {
"type": "object",
"properties": {
"stream": {
"description": "\"stdout\" or \"stderr\"",
"type": "string"
},
"text": {
"type": "string"
},
"timestamp": {
"type": "string"
}
}
},
"modeladmin.VRAMRequest": {
"type": "object",
"properties": {
"context_size": {
"type": "integer"
},
"gpu_layers": {
"type": "integer"
},
"kv_quant_bits": {
"type": "integer"
},
"model": {
"type": "string"
}
}
},
"modeladmin.VRAMResponse": {
"type": "object",
"properties": {
"context_note": {
"type": "string"
},
"model_max_context": {
"type": "integer"
},
"sizeBytes": {
"description": "total model weight size in bytes",
"type": "integer"
},
"sizeDisplay": {
"description": "human-readable size (e.g. \"4.2 GB\")",
"type": "string"
},
"vramBytes": {
"description": "estimated VRAM usage in bytes",
"type": "integer"
},
"vramDisplay": {
"description": "human-readable VRAM (e.g. \"6.1 GB\")",
"type": "string"
}
}
},
"proto.MemoryUsageData": {
"type": "object",
"properties": {
"breakdown": {
"type": "object",
"additionalProperties": {
"type": "integer",
"format": "int64"
}
},
"total": {
"type": "integer"
}
}
},
"proto.StatusResponse": {
"type": "object",
"properties": {
"memory": {
"$ref": "#/definitions/proto.MemoryUsageData"
},
"state": {
"$ref": "#/definitions/proto.StatusResponse_State"
}
}
},
"proto.StatusResponse_State": {
"type": "integer",
"format": "int32",
"enum": [
0,
1,
2,
-1
],
"x-enum-varnames": [
"StatusResponse_UNINITIALIZED",
"StatusResponse_BUSY",
"StatusResponse_READY",
"StatusResponse_ERROR"
]
},
"proto.VADResponse": {
"type": "object",
"properties": {
"segments": {
"type": "array",
"items": {
"$ref": "#/definitions/proto.VADSegment"
}
}
}
},
"proto.VADSegment": {
"type": "object",
"properties": {
"end": {
"type": "number"
},
"start": {
"type": "number"
}
}
},
"schema.AnthropicContentBlock": {
"type": "object",
"properties": {
"content": {},
"id": {
"type": "string"
},
"input": {
"type": "object",
"additionalProperties": {}
},
"is_error": {
"type": "boolean"
},
"name": {
"type": "string"
},
"source": {
"$ref": "#/definitions/schema.AnthropicImageSource"
},
"text": {
"type": "string"
},
"tool_use_id": {
"type": "string"
},
"type": {
"type": "string"
}
}
},
"schema.AnthropicImageSource": {
"type": "object",
"properties": {
"data": {
"type": "string"
},
"media_type": {
"type": "string"
},
"type": {
"type": "string"
}
}
},
"schema.AnthropicMessage": {
"type": "object",
"properties": {
"content": {},
"role": {
"type": "string"
}
}
},
"schema.AnthropicRequest": {
"type": "object",
"properties": {
"max_tokens": {
"type": "integer"
},
"messages": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.AnthropicMessage"
}
},
"metadata": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"stop_sequences": {
"type": "array",
"items": {
"type": "string"
}
},
"stream": {
"type": "boolean"
},
"system": {
"type": "string"
},
"temperature": {
"type": "number"
},
"tool_choice": {},
"tools": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.AnthropicTool"
}
},
"top_k": {
"type": "integer"
},
"top_p": {
"type": "number"
}
}
},
"schema.AnthropicResponse": {
"type": "object",
"properties": {
"content": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.AnthropicContentBlock"
}
},
"id": {
"type": "string"
},
"model": {
"type": "string"
},
"role": {
"type": "string"
},
"stop_reason": {
"type": "string"
},
"stop_sequence": {
"type": "string"
},
"type": {
"type": "string"
},
"usage": {
"$ref": "#/definitions/schema.AnthropicUsage"
}
}
},
"schema.AnthropicTool": {
"type": "object",
"properties": {
"description": {
"type": "string"
},
"input_schema": {
"type": "object",
"additionalProperties": {}
},
"name": {
"type": "string"
}
}
},
"schema.AnthropicUsage": {
"type": "object",
"properties": {
"input_tokens": {
"type": "integer"
},
"output_tokens": {
"type": "integer"
}
}
},
"schema.BackendMonitorRequest": {
"type": "object",
"properties": {
"model": {
"type": "string"
}
}
},
"schema.BackendResponse": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"status_url": {
"type": "string"
}
}
},
"schema.Choice": {
"type": "object",
"properties": {
"delta": {
"$ref": "#/definitions/schema.Message"
},
"finish_reason": {
"type": "string"
},
"index": {
"type": "integer"
},
"logprobs": {
"$ref": "#/definitions/schema.Logprobs"
},
"message": {
"$ref": "#/definitions/schema.Message"
},
"text": {
"type": "string"
}
}
},
"schema.Detection": {
"type": "object",
"properties": {
"class_name": {
"type": "string"
},
"confidence": {
"type": "number"
},
"height": {
"type": "number"
},
"mask": {
"description": "base64-encoded PNG segmentation mask",
"type": "string"
},
"width": {
"type": "number"
},
"x": {
"type": "number"
},
"y": {
"type": "number"
}
}
},
"schema.DetectionRequest": {
"type": "object",
"properties": {
"boxes": {
"description": "Box coordinates as [x1,y1,x2,y2,...] quads",
"type": "array",
"items": {
"type": "number"
}
},
"image": {
"description": "URL or base64-encoded image to analyze",
"type": "string"
},
"model": {
"type": "string"
},
"points": {
"description": "Point coordinates as [x,y,label,...] triples (label: 1=pos, 0=neg)",
"type": "array",
"items": {
"type": "number"
}
},
"prompt": {
"description": "Text prompt (for SAM 3 PCS mode)",
"type": "string"
},
"threshold": {
"description": "Detection confidence threshold",
"type": "number"
}
}
},
"schema.DetectionResponse": {
"type": "object",
"properties": {
"detections": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Detection"
}
}
}
},
"schema.DiarizationResult": {
"type": "object",
"properties": {
"duration": {
"type": "number"
},
"language": {
"type": "string"
},
"num_speakers": {
"type": "integer"
},
"segments": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.DiarizationSegment"
}
},
"speakers": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.DiarizationSpeaker"
}
},
"task": {
"type": "string"
}
}
},
"schema.DiarizationSegment": {
"type": "object",
"properties": {
"end": {
"type": "number"
},
"id": {
"type": "integer"
},
"label": {
"type": "string"
},
"speaker": {
"type": "string"
},
"start": {
"type": "number"
},
"text": {
"type": "string"
}
}
},
"schema.DiarizationSpeaker": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"label": {
"type": "string"
},
"segment_count": {
"type": "integer"
},
"total_speech_duration": {
"type": "number"
}
}
},
"schema.ElevenLabsSoundGenerationRequest": {
"type": "object",
"properties": {
"bpm": {
"type": "integer"
},
"caption": {
"type": "string"
},
"do_sample": {
"type": "boolean"
},
"duration_seconds": {
"type": "number"
},
"instrumental": {
"description": "Simple mode: use text as description; optional instrumental / vocal_language",
"type": "boolean"
},
"keyscale": {
"type": "string"
},
"language": {
"type": "string"
},
"lyrics": {
"type": "string"
},
"model_id": {
"type": "string"
},
"prompt_influence": {
"type": "number"
},
"text": {
"type": "string"
},
"think": {
"description": "Advanced mode",
"type": "boolean"
},
"timesignature": {
"type": "string"
},
"vocal_language": {
"type": "string"
}
}
},
"schema.FaceAnalysis": {
"type": "object",
"properties": {
"age": {
"type": "number"
},
"antispoof_score": {
"type": "number"
},
"dominant_emotion": {
"type": "string"
},
"dominant_gender": {
"type": "string"
},
"dominant_race": {
"type": "string"
},
"emotion": {
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float32"
}
},
"face_confidence": {
"type": "number"
},
"gender": {
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float32"
}
},
"is_real": {
"description": "Liveness fields — see FaceVerifyResponse for why these are pointers.",
"type": "boolean"
},
"race": {
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float32"
}
},
"region": {
"$ref": "#/definitions/schema.FacialArea"
}
}
},
"schema.FaceAnalyzeRequest": {
"type": "object",
"properties": {
"actions": {
"description": "subset of {\"age\",\"gender\",\"emotion\",\"race\"}",
"type": "array",
"items": {
"type": "string"
}
},
"anti_spoofing": {
"type": "boolean"
},
"img": {
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.FaceAnalyzeResponse": {
"type": "object",
"properties": {
"faces": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.FaceAnalysis"
}
}
}
},
"schema.FaceEmbedRequest": {
"type": "object",
"properties": {
"img": {
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.FaceEmbedResponse": {
"type": "object",
"properties": {
"dim": {
"type": "integer"
},
"embedding": {
"type": "array",
"items": {
"type": "number"
}
},
"model": {
"type": "string"
}
}
},
"schema.FaceForgetRequest": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"model": {
"type": "string"
},
"store": {
"type": "string"
}
}
},
"schema.FaceIdentifyMatch": {
"type": "object",
"properties": {
"confidence": {
"type": "number"
},
"distance": {
"type": "number"
},
"id": {
"type": "string"
},
"labels": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"match": {
"description": "true when distance \u003c= threshold",
"type": "boolean"
},
"name": {
"type": "string"
}
}
},
"schema.FaceIdentifyRequest": {
"type": "object",
"properties": {
"img": {
"type": "string"
},
"model": {
"type": "string"
},
"store": {
"type": "string"
},
"threshold": {
"description": "optional cutoff on distance",
"type": "number"
},
"top_k": {
"type": "integer"
}
}
},
"schema.FaceIdentifyResponse": {
"type": "object",
"properties": {
"matches": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.FaceIdentifyMatch"
}
}
}
},
"schema.FaceRegisterRequest": {
"type": "object",
"properties": {
"img": {
"type": "string"
},
"labels": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"name": {
"type": "string"
},
"store": {
"description": "vector store model; empty = local-store default",
"type": "string"
}
}
},
"schema.FaceRegisterResponse": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
},
"registered_at": {
"type": "string"
}
}
},
"schema.FaceVerifyRequest": {
"type": "object",
"properties": {
"anti_spoofing": {
"type": "boolean"
},
"img1": {
"type": "string"
},
"img2": {
"type": "string"
},
"model": {
"type": "string"
},
"threshold": {
"type": "number"
}
}
},
"schema.FaceVerifyResponse": {
"type": "object",
"properties": {
"confidence": {
"type": "number"
},
"distance": {
"type": "number"
},
"img1_antispoof_score": {
"type": "number"
},
"img1_area": {
"$ref": "#/definitions/schema.FacialArea"
},
"img1_is_real": {
"description": "Liveness fields are only populated when the request set\nanti_spoofing=true. Pointers keep them fully absent from the\nJSON response otherwise, so callers can tell \"not checked\"\napart from \"checked and fake\" (which would collapse to zero\nvalues with plain bool+omitempty).",
"type": "boolean"
},
"img2_antispoof_score": {
"type": "number"
},
"img2_area": {
"$ref": "#/definitions/schema.FacialArea"
},
"img2_is_real": {
"type": "boolean"
},
"model": {
"type": "string"
},
"processing_time_ms": {
"type": "number"
},
"threshold": {
"type": "number"
},
"verified": {
"type": "boolean"
}
}
},
"schema.FacialArea": {
"type": "object",
"properties": {
"h": {
"type": "number"
},
"w": {
"type": "number"
},
"x": {
"type": "number"
},
"y": {
"type": "number"
}
}
},
"schema.FunctionCall": {
"type": "object",
"properties": {
"arguments": {
"type": "string"
},
"name": {
"type": "string"
}
}
},
"schema.GalleryResponse": {
"type": "object",
"properties": {
"estimated_size_bytes": {
"type": "integer"
},
"estimated_size_display": {
"type": "string"
},
"estimated_vram_bytes": {
"type": "integer"
},
"estimated_vram_display": {
"type": "string"
},
"status": {
"type": "string"
},
"uuid": {
"type": "string"
}
}
},
"schema.InputTokensDetails": {
"type": "object",
"properties": {
"image_tokens": {
"type": "integer"
},
"text_tokens": {
"type": "integer"
}
}
},
"schema.Item": {
"type": "object",
"properties": {
"b64_json": {
"type": "string"
},
"index": {
"type": "integer"
},
"object": {
"type": "string"
},
"url": {
"description": "Images",
"type": "string"
}
}
},
"schema.JINADocumentResult": {
"type": "object",
"properties": {
"document": {
"$ref": "#/definitions/schema.JINAText"
},
"index": {
"type": "integer"
},
"relevance_score": {
"type": "number"
}
}
},
"schema.JINARerankRequest": {
"type": "object",
"properties": {
"backend": {
"type": "string"
},
"documents": {
"type": "array",
"items": {
"type": "string"
}
},
"model": {
"type": "string"
},
"query": {
"type": "string"
},
"top_n": {
"type": "integer"
}
}
},
"schema.JINARerankResponse": {
"type": "object",
"properties": {
"model": {
"type": "string"
},
"results": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.JINADocumentResult"
}
},
"usage": {
"$ref": "#/definitions/schema.JINAUsageInfo"
}
}
},
"schema.JINAText": {
"type": "object",
"properties": {
"text": {
"type": "string"
}
}
},
"schema.JINAUsageInfo": {
"type": "object",
"properties": {
"prompt_tokens": {
"type": "integer"
},
"total_tokens": {
"type": "integer"
}
}
},
"schema.Job": {
"type": "object",
"properties": {
"audios": {
"description": "List of audio URLs or base64 strings",
"type": "array",
"items": {
"type": "string"
}
},
"completed_at": {
"type": "string"
},
"created_at": {
"type": "string"
},
"error": {
"description": "Error message if failed",
"type": "string"
},
"files": {
"description": "List of file URLs or base64 strings",
"type": "array",
"items": {
"type": "string"
}
},
"id": {
"description": "UUID",
"type": "string"
},
"images": {
"description": "Multimedia content (for manual execution)\nCan contain URLs or base64-encoded data URIs",
"type": "array",
"items": {
"type": "string"
}
},
"parameters": {
"description": "Template parameters",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"result": {
"description": "Agent response",
"type": "string"
},
"started_at": {
"type": "string"
},
"status": {
"description": "pending, running, completed, failed, cancelled",
"allOf": [
{
"$ref": "#/definitions/schema.JobStatus"
}
]
},
"task_id": {
"description": "Reference to Task",
"type": "string"
},
"traces": {
"description": "Execution traces (reasoning, tool calls, tool results)",
"type": "array",
"items": {
"$ref": "#/definitions/schema.JobTrace"
}
},
"triggered_by": {
"description": "\"manual\", \"cron\", \"api\"",
"type": "string"
},
"videos": {
"description": "List of video URLs or base64 strings",
"type": "array",
"items": {
"type": "string"
}
},
"webhook_error": {
"description": "Error if webhook failed",
"type": "string"
},
"webhook_sent": {
"description": "Webhook delivery tracking",
"type": "boolean"
},
"webhook_sent_at": {
"type": "string"
}
}
},
"schema.JobExecutionRequest": {
"type": "object",
"properties": {
"audios": {
"description": "List of audio URLs or base64 strings",
"type": "array",
"items": {
"type": "string"
}
},
"files": {
"description": "List of file URLs or base64 strings",
"type": "array",
"items": {
"type": "string"
}
},
"images": {
"description": "Multimedia content (optional, for manual execution)\nCan contain URLs or base64-encoded data URIs",
"type": "array",
"items": {
"type": "string"
}
},
"parameters": {
"description": "Optional, for templating",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"task_id": {
"description": "Required",
"type": "string"
},
"videos": {
"description": "List of video URLs or base64 strings",
"type": "array",
"items": {
"type": "string"
}
}
}
},
"schema.JobExecutionResponse": {
"type": "object",
"properties": {
"job_id": {
"description": "unique job identifier",
"type": "string"
},
"status": {
"description": "initial status (pending)",
"type": "string"
},
"url": {
"description": "URL to poll for job status",
"type": "string"
}
}
},
"schema.JobStatus": {
"type": "string",
"enum": [
"pending",
"running",
"completed",
"failed",
"cancelled"
],
"x-enum-varnames": [
"JobStatusPending",
"JobStatusRunning",
"JobStatusCompleted",
"JobStatusFailed",
"JobStatusCancelled"
]
},
"schema.JobTrace": {
"type": "object",
"properties": {
"arguments": {
"description": "Tool arguments or result data",
"type": "object",
"additionalProperties": {}
},
"content": {
"description": "The actual trace content",
"type": "string"
},
"timestamp": {
"description": "When this trace occurred",
"type": "string"
},
"tool_name": {
"description": "Tool name (for tool_call/tool_result)",
"type": "string"
},
"type": {
"description": "\"reasoning\", \"tool_call\", \"tool_result\", \"status\"",
"type": "string"
}
}
},
"schema.KnownBackend": {
"type": "object",
"properties": {
"auto_detect": {
"type": "boolean"
},
"description": {
"type": "string"
},
"installed": {
"description": "Installed is true when the backend is currently present on disk — i.e. it\nappears in gallery.ListSystemBackends(systemState). Importer-registered or\ncurated pref-only backends default to false unless they also show up on\ndisk. The import form uses this to warn users that submitting an import\nmay trigger an automatic backend download.",
"type": "boolean"
},
"modality": {
"type": "string"
},
"name": {
"type": "string"
}
}
},
"schema.LogprobContent": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"id": {
"type": "integer"
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
},
"top_logprobs": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.LogprobContent"
}
}
}
},
"schema.Logprobs": {
"type": "object",
"properties": {
"content": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.LogprobContent"
}
}
}
},
"schema.LogprobsValue": {
"type": "object",
"properties": {
"enabled": {
"description": "true if logprobs should be returned",
"type": "boolean"
}
}
},
"schema.Message": {
"type": "object",
"properties": {
"content": {
"description": "The message content"
},
"function_call": {
"description": "A result of a function call"
},
"name": {
"description": "The message name (used for tool calls)",
"type": "string"
},
"reasoning": {
"description": "Reasoning content extracted from \u003cthinking\u003e...\u003c/thinking\u003e tags",
"type": "string"
},
"role": {
"description": "The message role",
"type": "string"
},
"string_audios": {
"type": "array",
"items": {
"type": "string"
}
},
"string_content": {
"type": "string"
},
"string_images": {
"type": "array",
"items": {
"type": "string"
}
},
"string_videos": {
"type": "array",
"items": {
"type": "string"
}
},
"tool_call_id": {
"type": "string"
},
"tool_calls": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.ToolCall"
}
}
}
},
"schema.ModelsDataResponse": {
"type": "object",
"properties": {
"data": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.OpenAIModel"
}
},
"object": {
"type": "string"
}
}
},
"schema.MultimediaSourceConfig": {
"type": "object",
"properties": {
"headers": {
"description": "Custom headers for HTTP request (e.g., Authorization)",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"type": {
"description": "\"image\", \"video\", \"audio\", \"file\"",
"type": "string"
},
"url": {
"description": "URL to fetch from",
"type": "string"
}
}
},
"schema.NodeData": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"lastSeen": {
"type": "string"
},
"name": {
"type": "string"
},
"serviceID": {
"type": "string"
},
"tunnelAddress": {
"type": "string"
}
}
},
"schema.ORAnnotation": {
"type": "object",
"properties": {
"end_index": {
"type": "integer"
},
"start_index": {
"type": "integer"
},
"title": {
"type": "string"
},
"type": {
"description": "url_citation",
"type": "string"
},
"url": {
"type": "string"
}
}
},
"schema.ORContentPart": {
"type": "object",
"properties": {
"annotations": {
"description": "REQUIRED for output_text - must always be present (use [])",
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORAnnotation"
}
},
"detail": {
"description": "low|high|auto for images",
"type": "string"
},
"file_data": {
"type": "string"
},
"file_url": {
"type": "string"
},
"filename": {
"type": "string"
},
"image_url": {
"type": "string"
},
"logprobs": {
"description": "REQUIRED for output_text - must always be present (use [])",
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORLogProb"
}
},
"refusal": {
"type": "string"
},
"text": {
"description": "REQUIRED for output_text - must always be present (even if empty)",
"type": "string"
},
"type": {
"description": "input_text|input_image|input_file|output_text|refusal",
"type": "string"
}
}
},
"schema.ORError": {
"type": "object",
"properties": {
"code": {
"type": "string"
},
"message": {
"type": "string"
},
"param": {
"type": "string"
},
"type": {
"description": "invalid_request|not_found|server_error|model_error|too_many_requests",
"type": "string"
}
}
},
"schema.ORFunctionTool": {
"type": "object",
"properties": {
"description": {
"type": "string"
},
"name": {
"type": "string"
},
"parameters": {
"type": "object",
"additionalProperties": {}
},
"strict": {
"description": "Always include in response",
"type": "boolean"
},
"type": {
"description": "always \"function\"",
"type": "string"
}
}
},
"schema.ORIncompleteDetails": {
"type": "object",
"properties": {
"reason": {
"type": "string"
}
}
},
"schema.ORInputTokensDetails": {
"type": "object",
"properties": {
"cached_tokens": {
"description": "Always include, even if 0",
"type": "integer"
}
}
},
"schema.ORItemField": {
"type": "object",
"properties": {
"arguments": {
"type": "string"
},
"call_id": {
"description": "Function call fields",
"type": "string"
},
"content": {
"description": "string or []ORContentPart for messages"
},
"encrypted_content": {
"description": "Provider-specific encrypted content",
"type": "string"
},
"id": {
"description": "Present for all output items",
"type": "string"
},
"name": {
"type": "string"
},
"output": {
"description": "Function call output fields"
},
"role": {
"description": "Message fields",
"type": "string"
},
"status": {
"description": "in_progress|completed|incomplete",
"type": "string"
},
"summary": {
"description": "Reasoning fields (for type == \"reasoning\")",
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORContentPart"
}
},
"type": {
"description": "message|function_call|function_call_output|reasoning|item_reference",
"type": "string"
}
}
},
"schema.ORLogProb": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
},
"top_logprobs": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORTopLogProb"
}
}
}
},
"schema.OROutputTokensDetails": {
"type": "object",
"properties": {
"reasoning_tokens": {
"description": "Always include, even if 0",
"type": "integer"
}
}
},
"schema.ORReasoning": {
"type": "object",
"properties": {
"effort": {
"type": "string"
},
"summary": {
"type": "string"
}
}
},
"schema.ORReasoningParam": {
"type": "object",
"properties": {
"effort": {
"description": "\"none\"|\"low\"|\"medium\"|\"high\"|\"xhigh\"",
"type": "string"
},
"summary": {
"description": "\"auto\"|\"concise\"|\"detailed\"",
"type": "string"
}
}
},
"schema.ORResponseResource": {
"type": "object",
"properties": {
"background": {
"type": "boolean"
},
"completed_at": {
"description": "Required: present as number or null",
"type": "integer"
},
"created_at": {
"type": "integer"
},
"error": {
"description": "Always present, null if no error",
"allOf": [
{
"$ref": "#/definitions/schema.ORError"
}
]
},
"frequency_penalty": {
"type": "number"
},
"id": {
"type": "string"
},
"incomplete_details": {
"description": "Always present, null if complete",
"allOf": [
{
"$ref": "#/definitions/schema.ORIncompleteDetails"
}
]
},
"instructions": {
"type": "string"
},
"max_output_tokens": {
"type": "integer"
},
"max_tool_calls": {
"description": "nullable",
"type": "integer"
},
"metadata": {
"description": "Metadata and operational flags",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"object": {
"description": "always \"response\"",
"type": "string"
},
"output": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORItemField"
}
},
"parallel_tool_calls": {
"type": "boolean"
},
"presence_penalty": {
"type": "number"
},
"previous_response_id": {
"type": "string"
},
"prompt_cache_key": {
"description": "nullable",
"type": "string"
},
"reasoning": {
"description": "nullable",
"allOf": [
{
"$ref": "#/definitions/schema.ORReasoning"
}
]
},
"safety_identifier": {
"description": "Safety and caching",
"type": "string"
},
"service_tier": {
"type": "string"
},
"status": {
"description": "in_progress|completed|failed|incomplete",
"type": "string"
},
"store": {
"type": "boolean"
},
"temperature": {
"description": "Sampling parameters (always required)",
"type": "number"
},
"text": {
"description": "Text format configuration",
"allOf": [
{
"$ref": "#/definitions/schema.ORTextConfig"
}
]
},
"tool_choice": {},
"tools": {
"description": "Tool-related fields",
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORFunctionTool"
}
},
"top_logprobs": {
"description": "Default to 0",
"type": "integer"
},
"top_p": {
"type": "number"
},
"truncation": {
"description": "Truncation and reasoning",
"type": "string"
},
"usage": {
"description": "Usage statistics",
"allOf": [
{
"$ref": "#/definitions/schema.ORUsage"
}
]
}
}
},
"schema.ORTextConfig": {
"type": "object",
"properties": {
"format": {
"$ref": "#/definitions/schema.ORTextFormat"
}
}
},
"schema.ORTextFormat": {
"type": "object",
"properties": {
"type": {
"description": "\"text\" or \"json_schema\"",
"type": "string"
}
}
},
"schema.ORTopLogProb": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
}
}
},
"schema.ORUsage": {
"type": "object",
"properties": {
"input_tokens": {
"type": "integer"
},
"input_tokens_details": {
"description": "Always present",
"allOf": [
{
"$ref": "#/definitions/schema.ORInputTokensDetails"
}
]
},
"output_tokens": {
"type": "integer"
},
"output_tokens_details": {
"description": "Always present",
"allOf": [
{
"$ref": "#/definitions/schema.OROutputTokensDetails"
}
]
},
"total_tokens": {
"type": "integer"
}
}
},
"schema.OpenAIModel": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"object": {
"type": "string"
}
}
},
"schema.OpenAIRequest": {
"type": "object",
"required": [
"file"
],
"properties": {
"backend": {
"type": "string"
},
"batch": {
"description": "Custom parameters - not present in the OpenAI API",
"type": "integer"
},
"clip_skip": {
"description": "Diffusers",
"type": "integer"
},
"echo": {
"type": "boolean"
},
"encoding_format": {
"description": "Embedding encoding format: \"float\" (default) or \"base64\" (OpenAI Node.js SDK default)",
"type": "string"
},
"file": {
"description": "whisper",
"type": "string"
},
"files": {
"description": "Multiple input images for img2img or inpainting",
"type": "array",
"items": {
"type": "string"
}
},
"frequency_penalty": {
"type": "number"
},
"function_call": {
"description": "might be a string or an object"
},
"functions": {
"description": "A list of available functions to call",
"type": "array",
"items": {
"$ref": "#/definitions/functions.Function"
}
},
"grammar": {
"description": "A grammar to constrain the LLM output",
"type": "string"
},
"grammar_json_functions": {
"$ref": "#/definitions/functions.JSONFunctionStructure"
},
"ignore_eos": {
"type": "boolean"
},
"input": {},
"instruction": {
"description": "Edit endpoint",
"type": "string"
},
"language": {
"description": "Also part of the OpenAI official spec",
"type": "string"
},
"logit_bias": {
"description": "Map of token IDs to bias values (-100 to 100)",
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float64"
}
},
"logprobs": {
"description": "OpenAI API logprobs parameters\nlogprobs: boolean - if true, returns log probabilities of each output token\ntop_logprobs: integer 0-20 - number of most likely tokens to return at each token position",
"allOf": [
{
"$ref": "#/definitions/schema.LogprobsValue"
}
]
},
"max_tokens": {
"type": "integer"
},
"messages": {
"description": "Messages is read only by chat/completion API calls",
"type": "array",
"items": {
"$ref": "#/definitions/schema.Message"
}
},
"metadata": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"min_p": {
"type": "number"
},
"model": {
"type": "string"
},
"model_base_name": {
"type": "string"
},
"n": {
"description": "Also part of the official OpenAI spec. Use it to return multiple results",
"type": "integer"
},
"n_keep": {
"type": "integer"
},
"negative_prompt": {
"type": "string"
},
"negative_prompt_scale": {
"type": "number"
},
"presence_penalty": {
"type": "number"
},
"prompt": {
"description": "Prompt is read only by completion/image API calls"
},
"quality": {
"description": "Image (not supported by OpenAI)",
"type": "string"
},
"reasoning_effort": {
"type": "string"
},
"ref_images": {
"description": "Reference images for models that support them (e.g., Flux Kontext)",
"type": "array",
"items": {
"type": "string"
}
},
"repeat_last_n": {
"type": "integer"
},
"repeat_penalty": {
"type": "number"
},
"response_format": {
"description": "whisper/image"
},
"rope_freq_base": {
"type": "number"
},
"rope_freq_scale": {
"type": "number"
},
"seed": {
"type": "integer"
},
"size": {
"description": "image",
"type": "string"
},
"step": {
"type": "integer"
},
"stop": {},
"stream": {
"type": "boolean"
},
"temperature": {
"type": "number"
},
"tfz": {
"type": "number"
},
"tokenizer": {
"description": "RWKV (?)",
"type": "string"
},
"tool_choice": {},
"tools": {
"type": "array",
"items": {
"$ref": "#/definitions/functions.Tool"
}
},
"top_k": {
"type": "integer"
},
"top_logprobs": {
"description": "Number of top logprobs per token (0-20)",
"type": "integer"
},
"top_p": {
"description": "Common options between all the API calls, part of the OpenAI spec",
"type": "number"
},
"translate": {
"description": "Only for audio transcription",
"type": "boolean"
},
"typical_p": {
"type": "number"
}
}
},
"schema.OpenAIResponse": {
"type": "object",
"properties": {
"choices": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Choice"
}
},
"created": {
"type": "integer"
},
"data": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Item"
}
},
"id": {
"type": "string"
},
"model": {
"type": "string"
},
"object": {
"type": "string"
},
"usage": {
"$ref": "#/definitions/schema.OpenAIUsage"
}
}
},
"schema.OpenAIUsage": {
"type": "object",
"properties": {
"completion_tokens": {
"type": "integer"
},
"input_tokens": {
"description": "Fields for image generation API compatibility",
"type": "integer"
},
"input_tokens_details": {
"$ref": "#/definitions/schema.InputTokensDetails"
},
"output_tokens": {
"type": "integer"
},
"prompt_tokens": {
"type": "integer"
},
"timing_prompt_processing": {
"description": "Extra timing data, disabled by default as it's not part of the OpenAI specification",
"type": "number"
},
"timing_token_generation": {
"type": "number"
},
"total_tokens": {
"type": "integer"
}
}
},
"schema.OpenResponsesRequest": {
"type": "object",
"properties": {
"allowed_tools": {
"description": "Restrict which tools can be invoked",
"type": "array",
"items": {
"type": "string"
}
},
"background": {
"description": "Run request in background",
"type": "boolean"
},
"frequency_penalty": {
"description": "Frequency penalty (-2.0 to 2.0)",
"type": "number"
},
"include": {
"description": "What to include in response",
"type": "array",
"items": {
"type": "string"
}
},
"input": {
"description": "string or []ORItemParam"
},
"instructions": {
"type": "string"
},
"logit_bias": {
"description": "OpenAI-compatible extensions (not in Open Responses spec)",
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float64"
}
},
"max_output_tokens": {
"type": "integer"
},
"max_tool_calls": {
"description": "Maximum number of tool calls",
"type": "integer"
},
"metadata": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"parallel_tool_calls": {
"description": "Allow parallel tool calls",
"type": "boolean"
},
"presence_penalty": {
"description": "Presence penalty (-2.0 to 2.0)",
"type": "number"
},
"previous_response_id": {
"type": "string"
},
"reasoning": {
"$ref": "#/definitions/schema.ORReasoningParam"
},
"service_tier": {
"description": "\"auto\"|\"default\"|priority hint",
"type": "string"
},
"store": {
"description": "Whether to store the response",
"type": "boolean"
},
"stream": {
"type": "boolean"
},
"temperature": {
"type": "number"
},
"text_format": {
"description": "Additional parameters from spec"
},
"tool_choice": {
"description": "\"auto\"|\"required\"|\"none\"|{type:\"function\",name:\"...\"}"
},
"tools": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.ORFunctionTool"
}
},
"top_logprobs": {
"description": "Number of top logprobs to return",
"type": "integer"
},
"top_p": {
"type": "number"
},
"truncation": {
"description": "\"auto\"|\"disabled\"",
"type": "string"
}
}
},
"schema.P2PNodesResponse": {
"type": "object",
"properties": {
"federated_nodes": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.NodeData"
}
},
"llama_cpp_nodes": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.NodeData"
}
},
"mlx_nodes": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.NodeData"
}
}
}
},
"schema.SysInfoModel": {
"type": "object",
"properties": {
"id": {
"type": "string"
}
}
},
"schema.SystemInformationResponse": {
"type": "object",
"properties": {
"backends": {
"description": "available backend engines",
"type": "array",
"items": {
"type": "string"
}
},
"loaded_models": {
"description": "currently loaded models",
"type": "array",
"items": {
"$ref": "#/definitions/schema.SysInfoModel"
}
}
}
},
"schema.TTSRequest": {
"description": "TTS request body",
"type": "object",
"properties": {
"backend": {
"description": "backend engine override",
"type": "string"
},
"input": {
"description": "text input",
"type": "string"
},
"language": {
"description": "(optional) language to use with TTS model",
"type": "string"
},
"model": {
"type": "string"
},
"response_format": {
"description": "(optional) output format",
"type": "string"
},
"sample_rate": {
"description": "(optional) desired output sample rate",
"type": "integer"
},
"stream": {
"description": "(optional) enable streaming TTS",
"type": "boolean"
},
"voice": {
"description": "voice audio file or speaker id",
"type": "string"
}
}
},
"schema.Task": {
"type": "object",
"properties": {
"created_at": {
"type": "string"
},
"cron": {
"description": "Optional cron expression",
"type": "string"
},
"cron_parameters": {
"description": "Parameters to use when executing cron jobs",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"description": {
"description": "Optional description",
"type": "string"
},
"enabled": {
"description": "Can be disabled without deletion",
"type": "boolean"
},
"id": {
"description": "UUID",
"type": "string"
},
"model": {
"description": "Model name (must have MCP config)",
"type": "string"
},
"multimedia_sources": {
"description": "Multimedia sources (for cron jobs)\nURLs to fetch multimedia content from when cron job executes\nEach source can have custom headers for authentication/authorization",
"type": "array",
"items": {
"$ref": "#/definitions/schema.MultimediaSourceConfig"
}
},
"name": {
"description": "User-friendly name",
"type": "string"
},
"prompt": {
"description": "Template prompt (supports Go template .param syntax)",
"type": "string"
},
"updated_at": {
"type": "string"
},
"webhooks": {
"description": "Webhook configuration (for notifications).\nSupports multiple webhook endpoints.\nWebhooks can handle both success and failure cases using template variables:\n.Job (Job object), .Task (Task object), .Result (if successful),\n.Error (if failed), .Status (job status string).",
"type": "array",
"items": {
"$ref": "#/definitions/schema.WebhookConfig"
}
}
}
},
"schema.TokenizeRequest": {
"type": "object",
"properties": {
"content": {
"description": "text to tokenize",
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.TokenizeResponse": {
"type": "object",
"properties": {
"tokens": {
"description": "token IDs",
"type": "array",
"items": {
"type": "integer"
}
}
}
},
"schema.ToolCall": {
"type": "object",
"properties": {
"function": {
"$ref": "#/definitions/schema.FunctionCall"
},
"id": {
"type": "string"
},
"index": {
"type": "integer"
},
"type": {
"type": "string"
}
}
},
"schema.VADRequest": {
"description": "VAD request body",
"type": "object",
"properties": {
"audio": {
"description": "raw audio samples as float32 PCM",
"type": "array",
"items": {
"type": "number"
}
},
"model": {
"type": "string"
}
}
},
"schema.VideoRequest": {
"type": "object",
"properties": {
"cfg_scale": {
"description": "classifier-free guidance scale",
"type": "number"
},
"end_image": {
"description": "URL or base64 of the last frame",
"type": "string"
},
"fps": {
"description": "frames per second",
"type": "integer"
},
"height": {
"description": "output height in pixels",
"type": "integer"
},
"input_reference": {
"description": "reference image or video URL",
"type": "string"
},
"model": {
"type": "string"
},
"negative_prompt": {
"description": "things to avoid in the output",
"type": "string"
},
"num_frames": {
"description": "total number of frames to generate",
"type": "integer"
},
"prompt": {
"description": "text description of the video to generate",
"type": "string"
},
"response_format": {
"description": "output format (url or b64_json)",
"type": "string"
},
"seconds": {
"description": "duration in seconds (alternative to num_frames)",
"type": "string"
},
"seed": {
"description": "random seed for reproducibility",
"type": "integer"
},
"size": {
"description": "WxH shorthand (e.g. \"512x512\")",
"type": "string"
},
"start_image": {
"description": "URL or base64 of the first frame",
"type": "string"
},
"step": {
"description": "number of diffusion steps",
"type": "integer"
},
"width": {
"description": "output width in pixels",
"type": "integer"
}
}
},
"schema.VoiceAnalysis": {
"type": "object",
"properties": {
"age": {
"type": "number"
},
"dominant_emotion": {
"type": "string"
},
"dominant_gender": {
"type": "string"
},
"emotion": {
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float32"
}
},
"end": {
"type": "number"
},
"gender": {
"type": "object",
"additionalProperties": {
"type": "number",
"format": "float32"
}
},
"start": {
"type": "number"
}
}
},
"schema.VoiceAnalyzeRequest": {
"type": "object",
"properties": {
"actions": {
"description": "subset of {\"age\",\"gender\",\"emotion\"}",
"type": "array",
"items": {
"type": "string"
}
},
"audio": {
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.VoiceAnalyzeResponse": {
"type": "object",
"properties": {
"segments": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.VoiceAnalysis"
}
}
}
},
"schema.VoiceEmbedRequest": {
"type": "object",
"properties": {
"audio": {
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.VoiceEmbedResponse": {
"type": "object",
"properties": {
"dim": {
"type": "integer"
},
"embedding": {
"type": "array",
"items": {
"type": "number"
}
},
"model": {
"type": "string"
}
}
},
"schema.VoiceForgetRequest": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"model": {
"type": "string"
},
"store": {
"type": "string"
}
}
},
"schema.VoiceIdentifyMatch": {
"type": "object",
"properties": {
"confidence": {
"type": "number"
},
"distance": {
"type": "number"
},
"id": {
"type": "string"
},
"labels": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"match": {
"type": "boolean"
},
"name": {
"type": "string"
}
}
},
"schema.VoiceIdentifyRequest": {
"type": "object",
"properties": {
"audio": {
"type": "string"
},
"model": {
"type": "string"
},
"store": {
"type": "string"
},
"threshold": {
"type": "number"
},
"top_k": {
"type": "integer"
}
}
},
"schema.VoiceIdentifyResponse": {
"type": "object",
"properties": {
"matches": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.VoiceIdentifyMatch"
}
}
}
},
"schema.VoiceRegisterRequest": {
"type": "object",
"properties": {
"audio": {
"type": "string"
},
"labels": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"name": {
"type": "string"
},
"store": {
"type": "string"
}
}
},
"schema.VoiceRegisterResponse": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
},
"registered_at": {
"type": "string"
}
}
},
"schema.VoiceVerifyRequest": {
"type": "object",
"properties": {
"anti_spoofing": {
"type": "boolean"
},
"audio1": {
"type": "string"
},
"audio2": {
"type": "string"
},
"model": {
"type": "string"
},
"threshold": {
"type": "number"
}
}
},
"schema.VoiceVerifyResponse": {
"type": "object",
"properties": {
"confidence": {
"type": "number"
},
"distance": {
"type": "number"
},
"model": {
"type": "string"
},
"processing_time_ms": {
"type": "number"
},
"threshold": {
"type": "number"
},
"verified": {
"type": "boolean"
}
}
},
"schema.WebhookConfig": {
"type": "object",
"properties": {
"headers": {
"description": "Custom headers (e.g., Authorization)",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"method": {
"description": "HTTP method (POST, PUT, PATCH) - default: POST",
"type": "string"
},
"payload_template": {
"description": "Optional template for payload",
"type": "string"
},
"url": {
"description": "Webhook endpoint URL",
"type": "string"
}
}
}
},
"securityDefinitions": {
"BearerAuth": {
"type": "apiKey",
"name": "Authorization",
"in": "header"
}
}
}`
// SwaggerInfo holds exported Swagger Info so clients can modify it
var SwaggerInfo = &swag.Spec{
	Version:          "2.0.0",
	Host:             "",
	BasePath:         "/",
	Schemes:          []string{"http", "https"},
	Title:            "LocalAI API",
	Description:      "The LocalAI Rest API.",
	InfoInstanceName: "swagger",
	SwaggerTemplate:  docTemplate,
	LeftDelim:        "{{",
	RightDelim:       "}}",
}

func init() {
	swag.Register(SwaggerInfo.InstanceName(), SwaggerInfo)
}