* feat(liquid-audio): add LFM2.5-Audio any-to-any backend + realtime_audio usecase
Wires LiquidAI's LFM2.5-Audio-1.5B as a self-contained Realtime API model:
single engine handles VAD, transcription, LLM, and TTS in one bidirectional
stream — drop-in alternative to a VAD+STT+LLM+TTS pipeline.
Backend
- backend/python/liquid-audio/ — new Python gRPC backend wrapping the
`liquid-audio` package. Modes: chat / asr / tts / s2s, voice presets,
Load/Predict/PredictStream/AudioTranscription/TTS/VAD/AudioToAudioStream/
Free and StartFineTune/FineTuneProgress/StopFineTune. Runtime monkey-patch
on `liquid_audio.utils.snapshot_download` so absolute local paths from
LocalAI's gallery resolve without a HF round-trip. soundfile in place of
torchaudio.load/save (torchcodec drags NVIDIA NPP we don't bundle).
- backend/backend.proto + pkg/grpc/{backend,client,server,base,embed,
interface}.go — new AudioToAudioStream RPC mirroring AudioTransformStream
(config/frame/control oneof in; typed event+pcm+meta out).
- core/services/nodes/{health_mock,inflight}_test.go — add stubs for the
new RPC to the test fakes.
Config + capabilities
- core/config/backend_capabilities.go: UsecaseRealtimeAudio,
MethodAudioToAudioStream, UsecaseInfoMap entry, liquid-audio BackendCapability row.
- core/config/model_config.go — FLAG_REALTIME_AUDIO bitmask, ModalityGroups
membership in both speech-input and audio-output groups so a lone flag
still reads as multimodal, GetAllModelConfigUsecases entry, GuessUsecases
branch.
Realtime endpoint
- core/http/endpoints/openai/realtime.go — extract prepareRealtimeConfig()
so the gate is unit-testable; accept realtime_audio models and self-fill
empty pipeline slots with the model's own name (user-pinned slots win).
- core/http/endpoints/openai/realtime_gate_test.go — six specs covering nil
cfg, empty pipeline, legacy pipeline, self-contained realtime_audio,
user-pinned VAD slot, and partial legacy pipeline.
UI + endpoints
- core/http/routes/ui.go — /api/pipeline-models accepts either a legacy
VAD+STT+LLM+TTS pipeline or a realtime_audio model; surfaces a
self_contained flag so the Talk page can collapse the four cards.
- core/http/routes/ui_api.go — realtime_audio in usecaseFilters.
- core/http/routes/ui_pipeline_models_test.go — covers both code paths.
- core/http/react-ui/src/pages/Talk.jsx — self-contained badge instead of
the four-slot grid; rename Edit Pipeline → Edit Model Config; less
pipeline-specific wording.
- core/http/react-ui/src/pages/Models.jsx + locales/en/models.json — new
realtime_audio filter button + i18n.
- core/http/react-ui/src/utils/capabilities.js — CAP_REALTIME_AUDIO.
- core/http/react-ui/src/pages/FineTune.jsx — voice + validation-dataset
fields, surfaced when backend === liquid-audio, plumbed via
extra_options on submit/export/import.
Gallery + importer
- gallery/liquid-audio.yaml — config template with known_usecases:
[realtime_audio, chat, tts, transcript, vad].
- gallery/index.yaml — four model entries (realtime/chat/asr/tts) keyed by
mode option. Fixed pre-existing `transcribe` typo on the asr entry
(loader silently dropped the unknown string → entry never surfaced as a
transcript model).
- gallery/lfm.yaml — function block for the LFM2 Pythonic tool-call format
`<|tool_call_start|>[name(k="v")]<|tool_call_end|>` matching
common_chat_params_init_lfm2 in vendored llama.cpp.
- core/gallery/importers/{liquid-audio,liquid-audio_test}.go — detector
matches LFM2-Audio HF repos (excludes -gguf mirrors); mode/voice
preferences plumbed through to options.
- core/gallery/importers/importers.go — register LiquidAudioImporter
before LlamaCPPImporter.
- pkg/functions/parse_lfm2_test.go — seven specs for the response/argument
regex pair on the LFM2 pythonic format.
Build matrix
- .github/backend-matrix.yml — seven liquid-audio targets (cuda12, cuda13,
l4t-cuda-13, hipblas, intel, cpu amd64, cpu arm64). Jetpack r36 cuda-12
is skipped (Ubuntu 22.04 / Python 3.10 incompatible with liquid-audio's
3.12 floor).
- backend/index.yaml — anchor + 13 image entries.
- Makefile — .NOTPARALLEL, prepare-test-extra, test-extra,
docker-build-liquid-audio.
Docs
- .agents/plans/liquid-audio-integration.md — phased plan; PR-D (real
any-to-any wiring via AudioToAudioStream), PR-E (mid-audio tool-call
detector), PR-G (GGUF entries once upstream llama.cpp PR #18641 lands)
remain.
- .agents/api-endpoints-and-auth.md — expand the capability-surface
checklist with every place a new FLAG_* needs to be registered.
Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(realtime): function calling + history cap for any-to-any models
Three pieces, all on the realtime_audio path that just landed:
1. liquid-audio backend (backend/python/liquid-audio/backend.py):
- _build_chat_state grows a `tools_prelude` arg.
- new _render_tools_prelude parses request.Tools (the OpenAI Chat
Completions function array realtime.go already serialises) and
emits an LFM2 `<|tool_list_start|>…<|tool_list_end|>` system turn
ahead of the user history. Mirrors gallery/lfm.yaml's `function:`
template so the model sees the same prompt shape whether served
via llama-cpp or here. Without this the backend silently dropped
tools — function calling was wired end-to-end on the Go side but
the model never saw a tool list.
2. Realtime history cap (core/http/endpoints/openai/realtime.go):
- Session grows MaxHistoryItems int; default picked by new
defaultMaxHistoryItems(cfg) — 6 for realtime_audio models (LFM2.5
1.5B degrades quickly past a handful of turns), 0/unlimited for
legacy pipelines composing larger LLMs.
- triggerResponse runs conv.Items through trimRealtimeItems before
building conversationHistory. Helper walks the cut left if it
would orphan a function_call_output, so tool result + call pairs
stay intact.
- realtime_gate_test.go: specs for defaultMaxHistoryItems and
trimRealtimeItems (zero cap, under cap, over cap, tool-call pair
preservation).
3. Talk page (core/http/react-ui/src/pages/Talk.jsx):
- Reuses the chat page's MCP plumbing — useMCPClient hook,
ClientMCPDropdown component, same auto-connect/disconnect effect
pattern. No bespoke tool registry, no new REST endpoints; tools
come from whichever MCP servers the user toggles on, exactly as
on the chat page.
- sendSessionUpdate now passes session.tools=getToolsForLLM(); the
update re-fires when the active server set changes mid-session.
- New response.function_call_arguments.done handler executes via
the hook's executeTool (which round-trips through the MCP client
SDK), then replies with conversation.item.create
{type:function_call_output} + response.create so the model
completes its turn with the tool output. Mirrors chat's
client-side agentic loop, translated to the realtime wire shape.
UI changes require a LocalAI image rebuild (Dockerfile:308-313 bakes
react-ui/dist into the runtime image). Backend.py changes can be
swapped live in /backends/<id>/backend.py + /backend/shutdown.
Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(realtime): LocalAI Assistant ("Manage Mode") for the Talk page
Mirrors the chat-page metadata.localai_assistant flow so users can ask the
realtime model what's loaded / installed / configured. Tools are run
server-side via the same in-process MCP holder that powers the chat
modality — no transport switch, no proxy, no new wire protocol.
Wire:
- core/http/endpoints/openai/realtime.go:
- RealtimeSessionOptions{LocalAIAssistant,IsAdmin}; isCurrentUserAdmin
helper mirrors chat.go's requireAssistantAccess (no-op when auth
disabled, else requires auth.RoleAdmin).
- Session grows AssistantExecutor mcpTools.ToolExecutor.
- runRealtimeSession, when opts.LocalAIAssistant is set: gate on admin,
fail closed if DisableLocalAIAssistant or the holder has no tools,
DiscoverTools and inject into session.Tools, prepend
holder.SystemPrompt() to instructions.
- Tool-call dispatch loop: when AssistantExecutor.IsTool(name), run
ExecuteTool inproc, append a FunctionCallOutput to conv.Items, skip
the function_call_arguments client emit (the client can't execute
these — it doesn't know about them). After the loop, if any
assistant tool ran, trigger another response so the model speaks the
result. Mirrors chat's agentic loop, driven server-side rather than
via client round-trip.
- core/http/endpoints/openai/realtime_webrtc.go: RealtimeCallRequest
gains `localai_assistant` (JSON omitempty). Handshake calls
isCurrentUserAdmin and builds RealtimeSessionOptions.
- core/http/react-ui/src/pages/Talk.jsx: admin-only "Manage Mode"
checkbox under the Tools dropdown; passes localai_assistant: true to
realtimeApi.call's body, captured in the connect callback's deps.
Mirroring chat's pattern means the in-process MCP tools surface "just
works" for the Talk page without exposing a Streamable-HTTP MCP endpoint
(which was the alternative). Clients with their own MCP servers can
still use the existing ClientMCPDropdown path in parallel; the realtime
handler distinguishes them by AssistantExecutor.IsTool() at dispatch
time.
Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(realtime): render Manage Mode tool calls in the Talk transcript
Previously the realtime endpoint only emitted response.output_item.added
for the FunctionCall item, and Talk.jsx's switch ignored the event — so
server-side tool runs were invisible in the UI. The model would speak
the result but the user had no way to see what tool was actually
called.
realtime.go: after executing an assistant tool inproc, emit a second
output_item.added/.done pair for the FunctionCallOutput item. Mirrors
the way the chat page displays tool_call + tool_result blocks.
Talk.jsx: handle both response.output_item.added and .done. Render
FunctionCall (with arguments) and FunctionCallOutput (pretty-printed
JSON when possible) as two transcript entries — `tool_call` with the
wrench icon, `tool_result` with the clipboard icon, both in mono-space
secondary-colour. Resets streamingRef after the result so the next
assistant text delta starts a fresh transcript entry instead of
appending to the previous turn.
Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* refactor(realtime): bound the Manage Mode tool-loop + preserve assistant tools
Fallout from a review pass on the Manage Mode patches:
- Bound the server-side agentic loop. triggerResponse used to recurse on
executedAssistantTool with no cap — a model that kept calling tools
would blow the goroutine stack. New maxAssistantToolTurns = 10 (mirrors
useChat.js's maxToolTurns). Public triggerResponse is now a thin shim
over triggerResponseAtTurn(toolTurn int); recursion increments the
counter and stops at the cap with an xlog.Warn.
- Preserve Manage Mode tools across client session.update. The handler
used to blindly overwrite session.Tools, so toggling a client MCP
server mid-session silently wiped the in-process admin tools. Session
now caches the original AssistantTools slice at session creation and
the session.update handler merges them back in (client names win on
collision — the client is explicit).
- strconv.ParseBool for the localai_assistant query param instead of
hand-rolled "1" || "true". Mirrors LocalAIAssistantFromMetadata.
- Talk.jsx: render both tool_call and tool_result on
response.output_item.done instead of splitting them across .added and
.done. The server's event pairing (added → done) stays correct; the
UI just doesn't need to inspect both phases of the same item. One
switch case instead of two, no behavioural change.
Out of scope (noted for follow-ups): extract a shared assistant-tools
helper between chat.go and realtime.go (duplication is small enough
that two parallel implementations stay readable for now), and an i18n
key for the Manage Mode helper text (Talk.jsx doesn't use i18n
anywhere else yet).
Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* ci(test-extra): wire liquid-audio backend smoke test
The backend ships test.py + a `make test` target and is listed in
backend-matrix.yml, so scripts/changed-backends.js already writes a
`liquid-audio=true|false` output when files under backend/python/liquid-audio/
change. The workflow just wasn't reading it.
- Expose the `liquid-audio` output on the detect-changes job
- Add a tests-liquid-audio job that runs `make` + `make test` in
backend/python/liquid-audio, gated on the per-backend detect flag
The smoke covers Health() and LoadModel(mode:finetune); fine-tune mode
short-circuits before any HuggingFace download (backend.py:192), so the
job needs neither weights nor a GPU. The full-inference path remains
gated on LIQUID_AUDIO_MODEL_ID, which CI doesn't set.
The four new Go test files (core/gallery/importers/liquid-audio_test.go,
core/http/endpoints/openai/realtime_gate_test.go,
core/http/routes/ui_pipeline_models_test.go, pkg/functions/parse_lfm2_test.go)
are already picked up by the existing test.yml workflow via `make test` →
`ginkgo -r ./pkg/... ./core/...`; their packages all carry RunSpecs entries.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
API Endpoints and Authentication
This guide covers how to add new API endpoints and properly integrate them with the auth/permissions system.
Before you ship a new endpoint or capability surface, re-read the checklist at the bottom of this file. LocalAI advertises its feature surface in several independent places — miss any one of them and clients/admins/UI won't know the endpoint exists.
Architecture overview
Authentication and authorization flow through three layers:
- Global auth middleware (core/http/auth/middleware.go → auth.Middleware): applied to every request in core/http/app.go. Handles session cookies, Bearer tokens, API keys, and legacy API keys. Populates auth_user and auth_role in the Echo context.
- Feature middleware (auth.RequireFeature): per-feature access control applied to route groups or individual routes. Checks whether the authenticated user has the specific feature enabled.
- Admin middleware (auth.RequireAdmin): restricts endpoints to admin users only.
When auth is disabled (no auth DB, no legacy API keys), all middleware becomes pass-through (auth.NoopMiddleware).
Adding a new API endpoint
Step 1: Create the handler
Write the endpoint handler in the appropriate package under core/http/endpoints/. Follow existing patterns:
// core/http/endpoints/localai/my_feature.go
func MyFeatureEndpoint(app *application.Application) echo.HandlerFunc {
return func(c echo.Context) error {
// Use auth.GetUser(c) to get the authenticated user (may be nil if auth is disabled)
user := auth.GetUser(c)
// Your logic here
return c.JSON(http.StatusOK, result)
}
}
Step 2: Register routes
Add routes in the appropriate file under core/http/routes/. The file you use depends on the endpoint category:
| File | Category |
|---|---|
| routes/openai.go | OpenAI-compatible API endpoints (/v1/...) |
| routes/localai.go | LocalAI-specific endpoints (/api/..., /models/..., /backends/...) |
| routes/agents.go | Agent pool endpoints (/api/agents/...) |
| routes/auth.go | Auth endpoints (/api/auth/...) |
| routes/ui_api.go | UI backend API endpoints |
Step 3: Apply the right middleware
Choose the appropriate protection level:
No auth required (public)
Exempt paths bypass auth entirely. Add to isExemptPath() in middleware.go or use the /api/auth/ prefix (always exempt). Use sparingly — most endpoints should require auth.
Standard auth (any authenticated user)
The global middleware already handles this. API paths (/api/, /v1/, etc.) automatically require authentication when auth is enabled. You don't need to add any extra middleware.
router.GET("/v1/my-endpoint", myHandler) // auth enforced by global middleware
Admin only
Pass adminMiddleware to the route. This is set up in app.go and passed to Register*Routes functions:
// In the Register function signature, accept the middleware:
func RegisterMyRoutes(router *echo.Echo, app *application.Application, adminMiddleware echo.MiddlewareFunc) {
router.POST("/models/apply", myHandler, adminMiddleware)
}
Feature-gated
For endpoints that should be toggleable per-user, use feature middleware. There are two approaches:
Approach A: Route-level middleware (preferred for groups of related endpoints)
// In app.go, create the feature middleware:
myFeatureMw := auth.RequireFeature(application.AuthDB(), auth.FeatureMyFeature)
// Pass it to the route registration function:
routes.RegisterMyRoutes(e, app, myFeatureMw)
// In the routes file, apply to a group:
g := e.Group("/api/my-feature", myFeatureMw)
g.GET("", listHandler)
g.POST("", createHandler)
Approach B: RouteFeatureRegistry (preferred for individual OpenAI-compatible endpoints)
Add an entry to RouteFeatureRegistry in core/http/auth/features.go. The RequireRouteFeature global middleware will automatically enforce it:
var RouteFeatureRegistry = []RouteFeature{
// ... existing entries ...
{"POST", "/v1/my-endpoint", FeatureMyFeature},
}
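To make the registry idea concrete, here is a minimal, self-contained sketch of a method+path lookup. The RouteFeature field names, the second row, and the exact-match rule are assumptions for illustration; the real RequireRouteFeature middleware may match routes differently.

```go
package main

import "fmt"

// RouteFeature mirrors the three-field rows shown above (field names assumed).
type RouteFeature struct {
	Method  string
	Pattern string
	Feature string
}

// registry is a hypothetical two-row registry.
var registry = []RouteFeature{
	{"POST", "/v1/my-endpoint", "my_feature"},
	{"GET", "/v1/other", "other_feature"},
}

// featureFor returns the feature gating method+path, or "" when the route is not gated.
func featureFor(method, path string) string {
	for _, rf := range registry {
		if rf.Method == method && rf.Pattern == path {
			return rf.Feature
		}
	}
	return ""
}

// allowed reports whether a user with the given enabled features may call the route.
func allowed(enabled map[string]bool, method, path string) bool {
	feature := featureFor(method, path)
	return feature == "" || enabled[feature]
}

func main() {
	user := map[string]bool{"other_feature": true}
	fmt.Println(allowed(user, "POST", "/v1/my-endpoint")) // false: feature not enabled
	fmt.Println(allowed(user, "GET", "/v1/other"))        // true: feature enabled
	fmt.Println(allowed(user, "GET", "/v1/ungated"))      // true: no registry row
}
```

Because an unlisted route falls through as allowed in this sketch, a missing registry row fails open, which is one reason the checklist asks for one row per route and method.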
Adding a new feature
When you need a new toggleable feature (not just a new endpoint under an existing feature):
1. Define the feature constant
Add to core/http/auth/permissions.go:
const (
// Add to the appropriate group:
// Agent features (default OFF for new users)
FeatureMyFeature = "my_feature"
// OR API features (default ON for new users)
FeatureMyFeature = "my_feature"
)
Then add it to the appropriate slice:
// Default OFF — user must be explicitly granted access:
var AgentFeatures = []string{..., FeatureMyFeature}
// Default ON — user has access unless explicitly revoked:
var APIFeatures = []string{..., FeatureMyFeature}
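The difference between the two slices is easiest to see as a resolution rule. The sketch below assumes per-user overrides are stored as a simple map, which is a hypothetical representation; only the default-ON / default-OFF behaviour comes from this guide.

```go
package main

import "fmt"

// Hypothetical feature lists; only "my_feature" appears in the guide.
var (
	agentFeatures = []string{"my_agent_feature"} // default OFF
	apiFeatures   = []string{"my_feature"}       // default ON
)

// contains reports whether list holds s.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// hasFeature lets an explicit per-user override win; otherwise API features
// default to enabled and agent features (or unknown names) to disabled.
func hasFeature(overrides map[string]bool, feature string) bool {
	if v, ok := overrides[feature]; ok {
		return v
	}
	return contains(apiFeatures, feature)
}

func main() {
	newUser := map[string]bool{}
	fmt.Println(hasFeature(newUser, "my_feature")) // true: default ON
	for _, f := range agentFeatures {
		fmt.Println(hasFeature(newUser, f)) // false: default OFF
	}

	edited := map[string]bool{"my_agent_feature": true, "my_feature": false}
	fmt.Println(hasFeature(edited, "my_agent_feature"), hasFeature(edited, "my_feature")) // true false
}
```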
2. Add feature metadata
In core/http/auth/features.go, add to the appropriate FeatureMetas function so the admin UI can display it:
func AgentFeatureMetas() []FeatureMeta {
return []FeatureMeta{
// ... existing ...
{FeatureMyFeature, "My Feature", false}, // false = default OFF
}
}
3. Wire up the middleware
In core/http/app.go:
myFeatureMw := auth.RequireFeature(application.AuthDB(), auth.FeatureMyFeature)
Then pass it to the route registration function.
4. Register route-feature mappings (if applicable)
If your feature gates standard API endpoints (like /v1/...), add entries to RouteFeatureRegistry in features.go instead of using per-route middleware.
Accessing the authenticated user in handlers
import "github.com/mudler/LocalAI/core/http/auth"
func MyHandler(c echo.Context) error {
// Get the user (nil when auth is disabled or unauthenticated)
user := auth.GetUser(c)
if user == nil {
// Handle unauthenticated — or let middleware handle it
}
// Check role (guard against the nil user described above)
if user != nil && user.Role == auth.RoleAdmin {
// admin-specific logic
}
// Check feature access programmatically (when you need conditional behavior, not full blocking)
if auth.HasFeatureAccess(db, user, auth.FeatureMyFeature) {
// feature-specific logic
}
// Check model access
if !auth.IsModelAllowed(db, user, modelName) {
return c.JSON(http.StatusForbidden, ...)
}
return nil
}
Middleware composition patterns
Middleware can be composed at different levels. Here are the patterns used in the codebase:
Group-level middleware (agents pattern)
// All routes in the group share the middleware
g := e.Group("/api/agents", poolReadyMw, agentsMw)
g.GET("", listHandler)
g.POST("", createHandler)
Per-route middleware (localai pattern)
// Individual routes get middleware as extra arguments
router.POST("/models/apply", applyHandler, adminMiddleware)
router.GET("/metrics", metricsHandler, adminMiddleware)
Middleware slice (openai pattern)
// Build a middleware chain for a handler
chatMiddleware := []echo.MiddlewareFunc{
usageMiddleware,
traceMiddleware,
modelFilterMiddleware,
}
app.POST("/v1/chat/completions", chatHandler, chatMiddleware...)
Error response format
Always use schema.ErrorResponse for auth/permission errors to stay consistent with the OpenAI-compatible API:
return c.JSON(http.StatusForbidden, schema.ErrorResponse{
Error: &schema.APIError{
Message: "feature not enabled for your account",
Code: http.StatusForbidden,
Type: "authorization_error",
},
})
Use these HTTP status codes:
- 401 Unauthorized: no valid credentials provided
- 403 Forbidden: authenticated but lacking permission
- 429 Too Many Requests: rate limited (auth endpoints)
Usage tracking
If your endpoint should be tracked for usage (token counts, request counts), add the usageMiddleware to its middleware chain. See core/http/middleware/usage.go and how it's applied in routes/openai.go.
Advertising surfaces — where to register a new capability
Beyond routing and auth, LocalAI publishes its capability surface in four independent places. When you add an endpoint — especially one introducing a net-new capability like a new media type or a new auth-gated feature — you must update every relevant surface. These aren't optional: missing them means the endpoint works but is invisible to clients, admins, and the UI.
1. Swagger @Tags annotation (mandatory)
Every handler needs a swagger block so the endpoint appears in /swagger/index.html and in the /api/instructions output. The @Tags value is what groups the endpoint into a capability area:
// MyEndpoint does X.
// @Summary Do X.
// @Tags my-capability
// @Param request body schema.MyRequest true "payload"
// @Success 200 {object} schema.MyResponse "Response"
// @Router /v1/my-endpoint [post]
func MyEndpoint(...) echo.HandlerFunc { ... }
Use an existing tag when the endpoint extends an existing area (e.g. audio, images, face-recognition). Create a new tag only when the endpoint introduces a genuinely new capability surface — and in that case, also register it in step 2.
After adding endpoints, regenerate the embedded spec so the runtime serves it:
make protogen-go # ensures gRPC codegen is fresh first
make swagger # regenerates swagger/swagger.json
2. /api/instructions registry (for new capability areas)
core/http/endpoints/localai/api_instructions.go defines instructionDefs — a lightweight, machine-readable index of capability areas that groups swagger endpoints by tag. It's the primary discovery surface for agents and SDKs ("what can this server do?").
When to update: only when adding a new capability area (a new swagger tag). Existing-tag additions automatically surface without any change here.
Add an entry to instructionDefs:
{
Name: "my-capability", // URL segment at /api/instructions/my-capability
Description: "Short sentence describing the capability",
Tags: []string{"my-capability"}, // must match swagger @Tags
Intro: "Optional gotcha/context that isn't in the swagger descriptions (caveats, defaults, cross-references to other endpoints).",
},
Also bump the expected-length count in api_instructions_test.go and add the name to the ContainElements assertion.
3. capabilities.js symbol (for new model-config FLAG_* flags)
If your feature needs a new FLAG_* usecase flag in core/config/model_config.go (so users can filter gallery models by it, and so /v1/models surfaces it), you need to update all of:
- Usecase<Name> string constant in core/config/backend_capabilities.go
- UsecaseInfoMap entry mapping the string to its flag + gRPC method
- FLAG_<NAME> bitmask in core/config/model_config.go
- GetAllModelConfigUsecases() map entry (otherwise the YAML loader silently ignores the string)
- ModalityGroups membership if the flag should affect IsMultimodal() (e.g. realtime_audio is in both speech-input and audio-output groups so a lone flag still reads as multimodal)
- GuessUsecases() branch listing the backends that own this capability
- usecaseFilters in core/http/routes/ui_api.go (drives the gallery filter dropdown)
- Models.jsx FILTERS array + matching filters.<camelCase> i18n key in core/http/react-ui/public/locales/en/models.json
- core/http/react-ui/src/utils/capabilities.js:
export const CAP_MY_CAPABILITY = 'FLAG_MY_CAPABILITY'
React pages that want to filter the ModelSelector by capability import this symbol. Declare it even if you're not building the UI page yet — the declaration keeps the Go/JS vocabularies in sync.
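Two of the items in that list are easy to get wrong: the name-to-flag map and modality-group membership. The sketch below models both with a plain bitmask. All identifiers are hypothetical, and the rule "multimodal means the flags touch at least two groups" is an assumption chosen to reproduce the behaviour described for realtime_audio, not a copy of the real IsMultimodal().

```go
package main

import "fmt"

type usecase uint32

const (
	flagChat usecase = 1 << iota
	flagTranscript
	flagTTS
	flagRealtimeAudio
)

// knownUsecases plays the role of GetAllModelConfigUsecases(): a string that
// is missing here contributes nothing in parse, with no error reported.
var knownUsecases = map[string]usecase{
	"chat":           flagChat,
	"transcript":     flagTranscript,
	"tts":            flagTTS,
	"realtime_audio": flagRealtimeAudio,
}

// modalityGroups lists one mask per modality (assumed grouping).
var modalityGroups = []usecase{
	flagChat,                           // text
	flagTranscript | flagRealtimeAudio, // speech input
	flagTTS | flagRealtimeAudio,        // audio output
}

// parse ORs together the flags for every recognised name.
func parse(names []string) usecase {
	var u usecase
	for _, n := range names {
		u |= knownUsecases[n] // unknown names yield the zero value
	}
	return u
}

// isMultimodal reports whether the flags touch at least two modality groups.
func isMultimodal(u usecase) bool {
	hits := 0
	for _, g := range modalityGroups {
		if u&g != 0 {
			hits++
		}
	}
	return hits >= 2
}

func main() {
	fmt.Println(isMultimodal(parse([]string{"realtime_audio"}))) // true: one flag, two groups
	fmt.Println(isMultimodal(parse([]string{"chat"})))           // false: one group only
	fmt.Println(parse([]string{"transcribe"}) == 0)              // true: misspelt name is dropped
}
```

The last line shows why a misspelt usecase string is hard to spot: the config still loads, it just never advertises the capability.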
4. docs/content/ (user-facing documentation)
A new capability deserves its own page under docs/content/features/, plus cross-links from related features and an entry in docs/content/whats-new.md. See the pattern used by face-recognition.md / object-detection.md.
Path protection rules
The global auth middleware classifies paths as API paths or non-API paths:
- API paths (always require auth when auth is enabled): /api/, /v1/, /models/, /backends/, /backend/, /tts, /vad, /video, /stores/, /system, /ws/, /metrics
- Exempt paths (never require auth): /api/auth/ prefix, anything in appConfig.PathWithoutAuth
- Non-API paths (UI, static assets): pass through without auth; the React UI handles login redirects client-side
If you add endpoints under a new top-level path prefix, add it to isAPIPath() in middleware.go to ensure it requires authentication.
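The classification rules can be sketched as a small prefix check. The prefix list is copied from the rules above; the function names, the use of simple prefix matching, and the example paths in main are assumptions, and entries from appConfig.PathWithoutAuth are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// apiPrefixes copies the API path list from the rules above.
var apiPrefixes = []string{
	"/api/", "/v1/", "/models/", "/backends/", "/backend/",
	"/tts", "/vad", "/video", "/stores/", "/system", "/ws/", "/metrics",
}

// exemptPrefixes holds only the always-exempt /api/auth/ prefix in this sketch.
var exemptPrefixes = []string{"/api/auth/"}

// hasAnyPrefix reports whether path starts with one of the prefixes.
func hasAnyPrefix(path string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(path, p) {
			return true
		}
	}
	return false
}

// requiresAuth checks exemptions first so /api/auth/ wins over the broader /api/ prefix.
func requiresAuth(authEnabled bool, path string) bool {
	if !authEnabled || hasAnyPrefix(path, exemptPrefixes) {
		return false
	}
	return hasAnyPrefix(path, apiPrefixes)
}

func main() {
	fmt.Println(requiresAuth(true, "/v1/chat/completions")) // true
	fmt.Println(requiresAuth(true, "/api/auth/login"))      // false: exempt prefix (hypothetical path)
	fmt.Println(requiresAuth(true, "/app/index.html"))      // false: non-API path (hypothetical path)
	fmt.Println(requiresAuth(false, "/v1/models"))          // false: auth disabled
}
```

A route under a brand-new top-level prefix behaves like the third case until the prefix is added to the list, which is the failure the paragraph above warns about.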
Checklist
When adding a new endpoint:
Routing & auth
- Handler in core/http/endpoints/
- Route registered in appropriate core/http/routes/ file
- Auth level chosen: public / standard / admin / feature-gated
- Entry added to RouteFeatureRegistry in core/http/auth/features.go (one row per route/method; all /v1/* routes gate through this, not per-route middleware)
- If new feature: constant in permissions.go, added to the right slice (APIFeatures default-ON / AgentFeatures default-OFF), metadata in features.go *FeatureMetas()
- If feature uses group middleware: wired in core/http/app.go and passed to the route registration function
- If new path prefix: added to isAPIPath() in middleware.go
- If token-counting: usageMiddleware added to middleware chain
Advertising surfaces (easy to miss — see the Advertising surfaces section)
- Swagger block on the handler: @Summary, @Tags, @Param, @Success, @Router
- If new capability area (new swagger tag): entry in instructionDefs in core/http/endpoints/localai/api_instructions.go + test count bumped in api_instructions_test.go
- If new FLAG_* usecase flag: matching CAP_* symbol exported from core/http/react-ui/src/utils/capabilities.js
- docs/content/features/<feature>.md created; cross-links from related feature pages; entry in docs/content/whats-new.md
Quality
- Error responses use schema.ErrorResponse format (or echo.NewHTTPError with a mapped gRPC status; see the mapBackendError helper in core/http/endpoints/localai/images.go)
- Tests cover both authenticated and unauthenticated access
- Swagger regenerated (make swagger) if you changed any @Router/@Tags/@Param annotation
Companion: MCP admin tool surface
Required for admin endpoints. Every new admin endpoint MUST be considered for the MCP admin tool surface — the REST API and the MCP tool catalog can drift silently otherwise, and both the LocalAI Assistant chat modality and the standalone local-ai mcp-server rely on pkg/mcp/localaitools/ to mirror REST.
Two outcomes are acceptable; one is not:
- Tool added. The new endpoint is something an admin would manage conversationally (install, list, edit, toggle, upgrade). Follow the full checklist in .agents/localai-assistant-mcp.md: add a LocalAIClient interface method, implement it in both inproc and httpapi, register the tool with a Tool* constant, update the skill prompts, and add the route to toolToHTTPRoute in pkg/mcp/localaitools/coverage_test.go.
- Tool deliberately skipped. The endpoint is internal/diagnostic and adding a chat path would be misleading. Document the decision in the PR description; no code action.
- Forgot. This breaks the contract. The TestToolHTTPRouteMappingComplete test in pkg/mcp/localaitools is a partial guard (it checks every Tool* has a route mapping), but it does NOT detect new REST endpoints without a tool; that's still a process check on the PR author.
Add to the bottom of the checklist above:
- If admin: decided whether MCP coverage is needed; if yes, tool registered + map updated; if no, skip-reason in PR description.
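The one-directional nature of that guard can be shown with two small set checks. This is a sketch under assumed names: missingRoutes models what the guide says the existing test verifies, while uncoveredRoutes models the reverse direction that the guide says is not checked. The tool names and routes in main are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// missingRoutes returns tools that have no route mapping.
func missingRoutes(tools []string, toolToRoute map[string]string) []string {
	var missing []string
	for _, t := range tools {
		if _, ok := toolToRoute[t]; !ok {
			missing = append(missing, t)
		}
	}
	sort.Strings(missing)
	return missing
}

// uncoveredRoutes returns admin routes that no tool maps to.
func uncoveredRoutes(adminRoutes []string, toolToRoute map[string]string) []string {
	covered := make(map[string]bool, len(toolToRoute))
	for _, r := range toolToRoute {
		covered[r] = true
	}
	var out []string
	for _, r := range adminRoutes {
		if !covered[r] {
			out = append(out, r)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical tool names and routes, for illustration only.
	tools := []string{"install_model", "list_backends"}
	toolToRoute := map[string]string{
		"install_model": "POST /models/apply",
		"list_backends": "GET /backends",
	}
	adminRoutes := []string{"POST /models/apply", "GET /backends", "POST /api/new-admin-thing"}

	fmt.Println(missingRoutes(tools, toolToRoute))         // []: every tool is mapped
	fmt.Println(uncoveredRoutes(adminRoutes, toolToRoute)) // [POST /api/new-admin-thing]
}
```

The first check passes even though a new admin route has no tool, which is why the decision still has to be recorded by the PR author.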