pii_default_detectors was applied to the live config only by a live
POST /api/settings (ApplyRuntimeSettings) — neither the startup loader nor
the config file watcher read it back. So after a restart the persisted
default detectors were dropped, and the cloud-proxy MITM listener (which
resolves each intercept host's detectors once at start via ResolvePIIPolicy)
came up with an empty set and forwarded intercepted traffic unredacted, even
though the MITM model had pii.enabled:true and the defaults were on disk.
Request-side default redaction broke the same way.
- startup.go: loadRuntimeSettingsFromFile now applies pii_default_detectors,
before startMITMIfConfigured, with env > file precedence.
- config_file_watcher.go: apply pii_default_detectors on live file edits,
matching the existing env-guard pattern used for the other fields.
- settings endpoint: rebuild the MITM listener when pii_default_detectors
changes (its per-host detector map is frozen at listener start), not only
on a mitm_listen change — so toggling a default detector takes effect on
cloud-proxy traffic immediately.
- new LOCALAI_PII_DEFAULT_DETECTORS env var / CLI flag (WithPIIDefaultDetectors)
so the default detector set can be pinned at boot for immutable deployments.
Assisted-by: Claude:claude-opus-4-8 Claude-Code
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
POST /api/settings rebuilt runtime_settings.json from only the request
body, so a focused admin page that submits a single field wiped every
other persisted setting. The Middleware proxy tab (mitm_listen) and
detector table (pii_default_detectors), plus the MCP SetBranding tool
(instance_name/instance_tagline), all POST partial bodies; the
no-omitempty api_keys and pii_default_detectors fields even round-tripped
as JSON null.
Read the persisted settings and overlay only the fields the request set
(RuntimeSettings.MergeNonNil) before writing. Every field is a pointer, so
the reflection-based merge is total over the struct and any field added
later is preserved automatically. Absent or null fields are now kept;
clearing a setting is done by sending its explicit empty/zero value
(api_keys [], mitm_listen "", etc.), unchanged from before. The full
Settings page sends every field, so its Save behaves identically.
Assisted-by: Claude:claude-opus-4-8 Claude-Code
Signed-off-by: Richard Palethorpe <io@richiejp.com>
fix(watchdog): start the live watchdog on a cold enable from Settings (#9125)
The React Settings "Enable Watchdog" master toggle only ever writes the
idle/busy flags; watchdog_enabled is vestigial in that UI. The live
start/stop decision in UpdateSettingsEndpoint keyed off the raw, stale
watchdog_enabled request field, so a cold enable (idle/busy=true,
watchdog_enabled=false) called StopWatchdog() and the watchdog stayed
stopped until the next restart - at which point startup re-derived it
from the idle flag. Net: enabling the watchdog appeared to do nothing.
Derive the run-state from idle||busy as the single source of truth,
mirroring the startup invariant:
- ApplyRuntimeSettings now sets WatchDog = idle||busy whenever either
field is present (so a full disable also brings it down), while an API
client posting only watchdog_enabled keeps its explicit value.
- Add ApplicationConfig.WatchdogShouldRun() mirroring startWatchdog's
gating (idle/busy, LRU eviction, memory reclaimer); the /api/settings
handler uses it to decide start vs stop.
- Belt-and-suspenders: the Settings.jsx master toggle also writes
watchdog_enabled = idle||busy.
Assisted-by: claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Add a routing middleware stack and a cloud-proxy backend.
* cloud-proxy: a Go gRPC backend that forwards OpenAI- and
Anthropic-shaped chat requests to upstream providers, with an
optional translate mode (OpenAI request -> Anthropic /v1/messages
-> OpenAI response) and full tool-calling support.
* routing: admission control, content-aware model routing
(embedding cache + classifier + rerank + Arch-Router score),
PII detection/redaction (regex + NER) with streaming filter and
OpenAI/Anthropic adapters, and a per-user/per-key billing recorder
backed by GORM or in-memory storage.
* middleware: UsageMiddleware records usage via the billing recorder,
plus admission, route-model, usage-stamp and trace middlewares.
* observability: BackendTrace ring buffer stores full request bodies
(capped), MITM proxy emits structured trace events, and router
classifier decisions surface at /api/router/decide.
* gallery: Arch-Router-1.5B (Q4_K_M and Q8_0).
* UI: cloud-proxy model-editor fields, classifier system-prompt and
score-normalization config, and a Traces page rendering request
bodies.
Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Adds a whitelabeling feature so an operator can replace the LocalAI
instance name, tagline, square logo, horizontal logo, and favicon from
the admin Settings page. Defaults fall back to the bundled assets so
existing installs are unaffected.
The public GET /api/branding endpoint is reachable pre-auth so the
login screen can render the configured branding before sign-in.
Mutating routes (POST/DELETE /api/branding/asset/:kind) remain
admin-only. Text fields (instance_name, instance_tagline) ride the
existing /api/settings flow; binary assets get a dedicated multipart
upload route that persists files under DynamicConfigsDir/branding/.
To prevent the Settings page's stale local state from clobbering an
upload on save, UpdateSettingsEndpoint preserves whatever the on-disk
asset filename fields are regardless of the body — /api/branding/asset/*
are the sole writers for those fields.
The MCP catalog gains get_branding and set_branding tools (text fields
only; file upload stays UI-only) plus a configure_branding skill prompt.
While wiring this up, the same restart-loss class of bug surfaced for
several existing fields whose RuntimeSettings entries were never read
by the startup loader. Fix loadRuntimeSettingsFromFile() to load:
- branding (instance_name, instance_tagline, *_file basenames)
- auto_upgrade_backends, prefer_development_backends
- localai_assistant_enabled
- open_responses_store_ttl
- the 7 existing AgentPool fields (enabled, default/embedding model,
chunking sizes, enable_logs, collection_db_path)
Also exposes 3 new AgentPool runtime settings (vector_engine,
database_url, agent_hub_url) via /api/settings + the Settings UI, with
the same load-on-startup wiring. The file watcher's manual-edit path
is intentionally not changed — the in-process API endpoints already
update appConfig directly, so the watcher is redundant for supported
flows and a separate refactor for everything else.
15 TDD specs cover the loader behaviour (1 branding + 11 adjacent + 3
new agent-pool); 2 specs cover the persistence helpers and the
clobber-prevention contract.
Assisted-by: claude-code:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
GET /api/settings returns settings.ApiKeys as the merged env+runtime list
via ApplicationConfig.ToRuntimeSettings(). The WebUI displays that list and
round-trips it back on POST /api/settings unchanged.
UpdateSettingsEndpoint was then doing:
appConfig.ApiKeys = append(envKeys, runtimeKeys...)
where runtimeKeys already contained envKeys (because the UI got them from
the merged GET). Every save therefore duplicated the env keys on top of
the previous merge, and also wrote the duplicates to runtime_settings.json
so the duplication survived restarts and compounded with each save. This
is the user-visible behaviour in #9071: the Web UI shows the keys
twice / three times after consecutive saves.
Before we marshal the settings to disk or call ApplyRuntimeSettings, drop
any incoming key that already appears in startupConfig.ApiKeys. The file
on disk now stores only the genuinely runtime-added keys; the subsequent
append(envKeys, runtimeKeys...) produces one copy of each env key, as
intended. Behaviour is unchanged for users who never had env keys set.
Fixes#9071
Co-authored-by: SAY-5 <SAY-5@users.noreply.github.com>
* feat(gallery): Switch to expandable box instead of pop-over and display model files
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(ui, backends): Add individual backend logging
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(ui): Set the context settings from the model config
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat: allow to set forcing backends eviction while requests are in flight
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: try to make the request sit and retry if eviction couldn't be done
Otherwise calls that in order to pass would need to shutdown other
backends would just fail.
In this way instead we make the request sit and retry eviction until it
succeeds. The thresholds can be configured by the user.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose settings to CLI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(loader): refactor single active backend support to LRU
This changeset introduces LRU management of loaded backends. Users can
set now a maximum number of models to be loaded concurrently, and, when
setting LocalAI in single active backend mode we set LRU to 1 for
backward compatibility.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: add tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(agent): agent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Multiple webhooks, simplify
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Do not use cron with seconds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Create separate pages for details
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Detect if no models have MCP configuration, show wizard
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make services test to run
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): add watchdog settings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Do not re-read env
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Some refactor, move other settings to runtime (p2p)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add API Keys handling
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Allow to disable runtime settings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Documentation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* show MCP toggle in index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop context default
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>