* fix(auth): cascade user deletion across all owned data on PostgreSQL
Deleting a user from the admin UI in distributed mode (PostgreSQL auth
DB) returned "user not found" even when the user clearly existed. The
old handler ignored result.Error and only checked RowsAffected, so a
foreign-key constraint violation surfaced as a misleading 404.
Two issues drove this:
1. invite_codes.created_by / used_by reference users(id) but the
InviteCode model declared the FKs without ON DELETE CASCADE. On
PostgreSQL the engine therefore rejected the user delete with NO
ACTION whenever the user had ever issued or consumed an invite. On
SQLite (default in single-node mode) FKs are not enforced, so the
bug never appeared there.
2. Several owned tables were never cleaned up regardless of dialect:
user_permissions and quota_rules relied on CASCADE that does not
fire under SQLite, and usage_records have no FK at all and were
left orphaned in every dialect.
Introduce auth.DeleteUserCascade which runs the full cleanup in a
single transaction: drop invites authored by the user, NULL used_by on
invites they consumed (preserves the audit trail), and explicitly wipe
sessions, API keys, permissions, quota rules, and usage metrics before
deleting the user. The in-memory quota cache is invalidated after
commit so a recreated user with the same id never sees stale entries.
The HTTP handler now maps the helper's errors to proper status codes —
real failures surface as 500 with the cause instead of being swallowed
as "not found".
Add Ginkgo regression coverage in core/http/auth/users_test.go and
core/http/routes/auth_test.go covering invite cleanup, used_by
null-out, full data wipe, and the FK-enforced original failure mode
(via PRAGMA foreign_keys=ON to mirror PostgreSQL behavior on SQLite).
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
* chore(deps): bump LocalAGI/LocalRecall — pull in go-fitz PDF extraction
Pulls LocalAGI@main (facd888) and LocalRecall@v0.6.0. The latter
swaps PDF text extraction from dslipak/pdf to gen2brain/go-fitz
(libmupdf bindings) and wraps it in a 60s goroutine timeout —
previously certain PDFs (broken xref tables, encrypted, image-only
without OCR) would hang indefinitely inside r.GetPlainText() and
poison the upload queue.
Pure dep bump, no LocalAI source changes. Indirect graph picks up
go-fitz + purego + ffi; drops dslipak/pdf.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update quic-go from v0.54.1 to v0.59.0 to fix the crypto/tls session
ticket panic described in quic-go/quic-go#5572.
Co-dependency go-libp2p upgraded from v0.43.0 to v0.48.0 (required for
quic-go v0.59.0 compatibility).
Signed-off-by: Beshoy Girgis <shoy@1ds.us>
* fix(nodes/health): skip stale-marking already-offline nodes
The health monitor re-emitted "Node heartbeat stale" + "Marking stale
node offline" + MarkOffline on every cycle for nodes that were already
in the offline (or unhealthy) state. For an operator-stopped node this
flooded the logs with the same WARN+INFO pair every check interval.
Skip the staleness branch when the node is already StatusOffline /
StatusUnhealthy — the state is already what we'd write, so neither the
log lines nor the DB update carry information.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(worker): wait for backend gRPC bind before replying to backend.install
The backend supervisor used to wait up to 4s (20 × 200ms) for the
backend's gRPC server to answer a HealthCheck, then log a warning and
reply Success with the bind address anyway. On slower nodes (a Jetson
Orin doing first-boot CUDA init, large CGO library load) the gRPC
listener wasn't up yet, so the frontend's first LoadModel dial returned
"connect: connection refused" and the operator chased a phantom network
issue instead of a startup-timing one.
Two changes:
- Bump the readiness window to 30s. CUDA init on Orin/Thor first boot
measures in seconds, not milliseconds.
- On deadline-exceeded, stop the half-started process, recycle the
port, and return an error with the backend's stderr tail. The
frontend now gets a real failure with diagnostic context instead of
a misleading ECONNREFUSED on a downstream dial.
Process death during the wait window keeps its existing fast-fail path.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(distributed): route auto-upgrade through BackendManager + bump LocalAGI/LocalRecall
Two distributed-mode bugs that surfaced together in the orchestrator
logs:
1. Auto-upgrade always failed with "backend not found".
UpgradeChecker correctly routed CheckUpgrades through the active
BackendManager (so the frontend aggregates worker state), but the
auto-upgrade branch right below called gallery.UpgradeBackend
directly with the frontend's SystemState. In distributed mode the
frontend has no backends installed locally, so ListSystemBackends
returned empty and Get(name) failed for every reported upgrade.
Auto-upgrade now also goes through BackendManager.UpgradeBackend,
which fans out to workers via NATS.
2. Embedding-load failure on a remote node crashed the orchestrator.
When RAG init lazily called NewPersistentPostgresCollection and the
remote embedding worker was unreachable, LocalRecall called
os.Exit(1) inside the constructor, killing the orchestrator pod.
LocalRecall now returns errors instead, LocalAGI surfaces them as a
nil collection, and the existing RAGProviderFromState path returns
(nil, nil, false) — the same code path the agent pool already takes
when no RAG is configured. The orchestrator stays up; chat requests
degrade to "no RAG available" until the embedding worker recovers.
Bumps:
github.com/mudler/LocalAGI → e83bf515d010
github.com/mudler/localrecall → 6138c1f535ab
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Bumps LocalAGI to pick up the LocalRecall postgres backend fix that
resizes the pgvector column when the configured embedding model
returns vectors of a different dimensionality than the existing
collection. Switching the agent pool's embedding model now triggers
a transparent re-embed at startup instead of failing every subsequent
upload with 'expected N dimensions, not M' (SQLSTATE 22000).
Also surfaces a 409 with an actionable message in
UploadToCollectionEndpoint as a safety net for the rare cases the
upstream migration path doesn't cover (e.g. a model swapped at
runtime), instead of the previous opaque 500.
* feat: add distributed mode (experimental)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix data races, mutexes, transactions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix events and tool stream in agent chat
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* use ginkgo
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(cron): compute correctly time boundaries avoiding re-triggering
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* enhancements, refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not flood of healthy checks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not list obvious backends as text backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* tests fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop redundant healthcheck
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* enhancements, refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: add fine-tuning endpoint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(experimental): add fine-tuning endpoint and TRL support
This changeset defines new GRPC signatues for Fine tuning backends, and
add TRL backend as initial fine-tuning engine. This implementation also
supports exporting to GGUF and automatically importing it to LocalAI
after fine-tuning.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* commit TRL backend, stop by killing process
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* move fine-tune to generic features
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add evals, reorder menu
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): add users and authentication support
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: allow the admin user to impersonificate users
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: ui improvements, disable 'Users' button in navbar when no auth is configured
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: add OIDC support
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: gate models
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: cache requests to optimize speed
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* small UI enhancements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ui): style improvements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: cover other paths by auth
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: separate local auth, refactor
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* security hardening, approval mode
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: fix tests and expectations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: update localagi/localrecall
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(realtime): WebRTC support
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(tracing): Show full LLM opts and deltas
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat: add standalone and agentic functionalities
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose agents via responses api
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>