mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-27 18:06:58 -04:00
* feat(distributed): add SyncedMap cross-replica in-memory state component

Introduce core/services/syncstate.SyncedMap[K,V]: a thread-safe in-memory map that keeps itself consistent across frontend replicas via NATS, with an optional pluggable durable Store and hydrate-from-source convergence.

Several features keep process-local state surfaced to the API (finetune/quant jobs, agent tasks, model configs) and each hand-wired the same in-memory + NATS broadcast + read-through-store legs - or forgot to, reintroducing cross-replica staleness. SyncedMap makes that consistency a configuration choice:

- local writes mutate the map, write through the Store, then broadcast a delta;
- the apply path is memory-only and never re-publishes or re-writes the Store (structural echo-loop guard, mirroring galleryop.mergeStatus);
- on Start and on NATS reconnect the map re-hydrates from the source (Store, else Loader); an optional periodic Reconcile repairs silent drift;
- standalone mode (nil NATS client) is a strict in-memory no-op.

Reconnect re-hydrate is wired via a new *messaging.Client.OnReconnect callback, consumed through an optional type-assertion so MessagingClient stays minimal.

Adds messaging.SubjectSyncStateDelta and a reusable testutil.FakeBus (synchronous in-process MessagingClient with wildcard matching) for adopter tests.

Component only; service migrations follow in subsequent commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* refactor(finetune): back jobs with SyncedMap for cross-replica consistency

FineTuneService kept jobs in a process-local map and, although it wrote them to Postgres, ListJobs/GetJob never read the store back and the wired natsClient was never used - so in distributed mode a job created on one replica was invisible to the others. Replace the map and the dead client with a syncstate.SyncedMap keyed by job ID, value *schema.FineTuneJob (the exact REST shape, so responses are unchanged).

- Add a Store adapter (core/services/finetune/syncstore.go) over FineTuneStore, plus FineTuneStore.ListAll (global hydrate; per-user List kept) and an idempotent Upsert (create-or-update; Create alone fails on dup key).
- Writes go through SyncedMap.Set/Delete (write-through + broadcast); reads use List/Get. The on-disk state.json path becomes the standalone Loader, keeping single-node restart recovery (stale->stopped / exporting->failed fixups).
- Fold SetNATSClient/SetFineTuneStore into NewFineTuneService; app.go passes the distributed NATS client + store when distributed, nil otherwise.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* refactor(agentpool): back agent tasks with SyncedMap for cross-replica consistency

AgentJobService.ListTasks read the process-local tasks map only, while ListJobs already read through the DB persister + dispatcher NATS - so in distributed mode a task created on one replica was invisible to the others. Back tasks with a syncstate.SyncedMap keyed by task ID (value schema.Task, the exact REST shape); jobs are left untouched.

- Store adapter (task_syncstore.go) over the existing JobPersister (LoadTasks/SaveTask/DeleteTask); reads svc.persister/userID live so a persister swap needs no rebuild. No new persister methods required.
- Task reads -> SyncedMap.List/Get; create/update -> Set (write-through + broadcast); delete -> Delete. The file persister now owns its own task set so the write-through path does not re-enter the SyncedMap lock (deadlock guard).
- The distributed NATS client is not available at construction (start() precedes initDistributed), so it is injected via SetTaskSyncNATS, which rebuilds the still-empty map before Start/hydrate. Wired at the main, restart, and per-user (UserServicesManager) distributed sites.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* refactor(quantization): back jobs with SyncedMap + durable QuantStore

QuantizationService kept jobs in a process-local map persisted only to a local state.json, so in distributed mode jobs were neither visible across replicas nor durable cluster-wide. Back jobs with a syncstate.SyncedMap keyed by job ID (value *schema.QuantizationJob, the exact REST shape).

- New distributed.QuantStore (GORM, table quantization_jobs) mirroring FineTuneStore: Create/Get/ListAll/Upsert(idempotent)/Delete, registered for AutoMigrate via distributed.InitStores (Stores.Quant).
- New adapter (quantization/syncstore.go) over QuantStore implementing syncstate.Store, with record<->schema conversion.
- Reads go through List/Get, writes through Set/Delete (write-through + broadcast); state.json is kept as the standalone Loader for single-node restart recovery (stale-job fixups preserved).
- app.go passes the distributed NATS client + QuantStore when distributed, nil otherwise; Start/Close lifecycle mirrors finetune.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* fix(syncstate): annotate gosec G118 false positive on lifeCtx

gosec flagged the WithCancel in Start as "cancellation function not called" because the returned cancel is stored on the struct rather than called/deferred in scope. It is invoked in Close (covered by tests), and lifeCtx must outlive Start to drive the reconnect/reconcile goroutines. Suppress the verified false positive with a justified #nosec G118.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* test(distributed): e2e two-replica SyncedMap sync over real NATS + Postgres

Adds the real-infrastructure counterpart to the fake-bus unit tests, in the existing distributed e2e suite (testcontainers NATS + PostgreSQL). Two SyncedMap instances stand in for two frontend replicas - each with its OWN NATS connection to a shared server and a SHARED Postgres store (the distributed-mode invariant) - and assert, over the wire:

- a create on replica A is observed by replica B;
- an update and a delete propagate A -> B (delete prunes, which a reload cannot);
- a late-joining replica recovers a job it never received a delta for, via store hydrate on Start (the at-most-once gap a fake bus cannot exercise);
- a local Set is written through to the shared Postgres store.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
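The write path and apply path described in the first commit can be illustrated with a small self-contained sketch. This is not the syncstate.SyncedMap implementation: the `delta`, `store`, `bus`, and `replica` types below are hypothetical stand-ins (a synchronous in-process bus instead of NATS, a counting map instead of Postgres), chosen only to show why an apply path that never writes the store or re-publishes cannot form an echo loop.

```go
package main

import (
	"fmt"
	"sync"
)

// delta is a hypothetical wire message describing one change.
type delta struct {
	Key     string
	Value   string
	Deleted bool
}

// store stands in for the shared durable Store. It counts writes so the
// example can show that applying a remote delta never touches it.
type store struct {
	mu     sync.Mutex
	rows   map[string]string
	writes int
}

func newStore() *store { return &store{rows: map[string]string{}} }

func (s *store) put(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[k] = v
	s.writes++
}

func (s *store) del(k string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.rows, k)
	s.writes++
}

// bus is a synchronous in-process broadcast (an assumption of this sketch).
// It delivers every message to all subscribers, including the sender.
type bus struct {
	mu   sync.Mutex
	subs []*replica
}

func (b *bus) publish(d delta) {
	b.mu.Lock()
	subs := append([]*replica(nil), b.subs...)
	b.mu.Unlock()
	for _, r := range subs {
		r.apply(d)
	}
}

type replica struct {
	mu   sync.RWMutex
	data map[string]string
	st   *store
	bus  *bus
}

func newReplica(b *bus, st *store) *replica {
	r := &replica{data: map[string]string{}, st: st, bus: b}
	b.mu.Lock()
	b.subs = append(b.subs, r)
	b.mu.Unlock()
	return r
}

// Set is the local write path: memory, then store, then broadcast.
func (r *replica) Set(k, v string) {
	r.mu.Lock()
	r.data[k] = v
	r.mu.Unlock() // released before publish: the bus calls apply on this replica too
	r.st.put(k, v)
	r.bus.publish(delta{Key: k, Value: v})
}

// Delete follows the same three steps as Set.
func (r *replica) Delete(k string) {
	r.mu.Lock()
	delete(r.data, k)
	r.mu.Unlock()
	r.st.del(k)
	r.bus.publish(delta{Key: k, Deleted: true})
}

// apply is the remote path: memory only. It never writes the store and never
// publishes, so a delivered delta cannot trigger another delta.
func (r *replica) apply(d delta) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if d.Deleted {
		delete(r.data, d.Key)
		return
	}
	r.data[d.Key] = d.Value
}

func (r *replica) Get(k string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.data[k]
	return v, ok
}

func main() {
	st, b := newStore(), &bus{}
	replicaA, replicaB := newReplica(b, st), newReplica(b, st)

	replicaA.Set("job-1", "queued")
	v, ok := replicaB.Get("job-1")
	fmt.Println(v, ok, st.writes) // queued true 1

	replicaA.Delete("job-1")
	_, ok = replicaB.Get("job-1")
	fmt.Println(ok, st.writes) // false 2
}
```

Each local change produces exactly one store write even though both replicas receive the delta, which is the property the "structural echo-loop guard" wording refers to.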
188 lines
7.4 KiB
Go
package quantization

// White-box tests (package quantization) so a spec can drive the service's
// internal SyncedMap the same way StartJob does (via jobs.Set) without standing
// up a quantization backend, then assert the cross-replica reads
// (GetJob/ListJobs) and the adapter conversions that keep REST responses
// byte-for-byte unchanged.

import (
	"context"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	"github.com/mudler/LocalAI/core/config"
	"github.com/mudler/LocalAI/core/schema"
	"github.com/mudler/LocalAI/core/services/distributed"
	"github.com/mudler/LocalAI/core/services/testutil"
)

// newTestService builds a standalone QuantizationService wired to the given bus.
// The model/config loaders are nil because the read/sync paths under test never
// touch them; the data dir is a throwaway temp dir so the disk Loader finds
// nothing.
func newTestService(bus *testutil.FakeBus) *QuantizationService {
	appConfig := &config.ApplicationConfig{
		Context:  context.Background(),
		DataPath: GinkgoT().TempDir(),
	}
	return NewQuantizationService(appConfig, nil, nil, bus, nil)
}

var _ = Describe("QuantizationService", func() {
	ctx := context.Background()

	Describe("cross-replica job visibility", func() {
		var (
			bus  *testutil.FakeBus
			a, b *QuantizationService
		)

		BeforeEach(func() {
			// One shared bus, two replicas: exactly the distributed topology where a
			// round-robin request may land on a replica that did not originate the
			// change.
			bus = testutil.NewFakeBus()
			a = newTestService(bus)
			b = newTestService(bus)
		})

		AfterEach(func() {
			Expect(a.Close()).To(Succeed())
			Expect(b.Close()).To(Succeed())
		})

		It("makes a job created on A visible via B's GetJob and ListJobs", func() {
			job := &schema.QuantizationJob{ID: "job-1", UserID: "user-1", Status: "queued", CreatedAt: "2026-06-27T10:00:00Z"}
			// StartJob persists via jobs.Set; drive that directly to avoid a backend.
			Expect(a.jobs.Set(ctx, job)).To(Succeed())

			got, err := b.GetJob("user-1", "job-1")
			Expect(err).ToNot(HaveOccurred(), "B must see a job A just created")
			Expect(got.Status).To(Equal("queued"))

			listed := b.ListJobs("user-1")
			Expect(listed).To(HaveLen(1))
			Expect(listed[0].ID).To(Equal("job-1"))
		})

		It("removes a job from B when it is deleted on A", func() {
			job := &schema.QuantizationJob{ID: "job-2", UserID: "user-1", Status: "completed", CreatedAt: "2026-06-27T10:00:00Z"}
			Expect(a.jobs.Set(ctx, job)).To(Succeed())
			_, err := b.GetJob("user-1", "job-2")
			Expect(err).ToNot(HaveOccurred(), "precondition: B must have the job before the delete")

			Expect(a.jobs.Delete(ctx, "job-2")).To(Succeed())

			_, err = b.GetJob("user-1", "job-2")
			Expect(err).To(HaveOccurred(), "a delete on A must remove the job from B")
		})

		It("propagates a status update from A to B", func() {
			job := &schema.QuantizationJob{ID: "job-3", UserID: "user-1", Status: "quantizing", CreatedAt: "2026-06-27T10:00:00Z"}
			Expect(a.jobs.Set(ctx, job)).To(Succeed())

			updated := &schema.QuantizationJob{ID: "job-3", UserID: "user-1", Status: "completed", CreatedAt: "2026-06-27T10:00:00Z"}
			Expect(a.jobs.Set(ctx, updated)).To(Succeed())

			got, err := b.GetJob("user-1", "job-3")
			Expect(err).ToNot(HaveOccurred())
			Expect(got.Status).To(Equal("completed"))
		})
	})

	Describe("ListJobs", func() {
		var svc *QuantizationService

		BeforeEach(func() {
			svc = newTestService(testutil.NewFakeBus())
		})
		AfterEach(func() { Expect(svc.Close()).To(Succeed()) })

		It("filters by user and sorts newest-first", func() {
			Expect(svc.jobs.Set(ctx, &schema.QuantizationJob{ID: "old", UserID: "u1", CreatedAt: "2026-06-25T10:00:00Z"})).To(Succeed())
			Expect(svc.jobs.Set(ctx, &schema.QuantizationJob{ID: "new", UserID: "u1", CreatedAt: "2026-06-27T10:00:00Z"})).To(Succeed())
			Expect(svc.jobs.Set(ctx, &schema.QuantizationJob{ID: "other", UserID: "u2", CreatedAt: "2026-06-26T10:00:00Z"})).To(Succeed())

			jobs := svc.ListJobs("u1")
			Expect(jobs).To(HaveLen(2), "only u1's jobs")
			Expect(jobs[0].ID).To(Equal("new"), "newest first")
			Expect(jobs[1].ID).To(Equal("old"))
		})

		It("returns every user's jobs when the userID filter is empty", func() {
			Expect(svc.jobs.Set(ctx, &schema.QuantizationJob{ID: "a", UserID: "u1", CreatedAt: "2026-06-25T10:00:00Z"})).To(Succeed())
			Expect(svc.jobs.Set(ctx, &schema.QuantizationJob{ID: "b", UserID: "u2", CreatedAt: "2026-06-26T10:00:00Z"})).To(Succeed())

			Expect(svc.ListJobs("")).To(HaveLen(2))
		})

		It("rejects GetJob for a job owned by another user", func() {
			Expect(svc.jobs.Set(ctx, &schema.QuantizationJob{ID: "x", UserID: "owner", CreatedAt: "2026-06-25T10:00:00Z"})).To(Succeed())

			_, err := svc.GetJob("intruder", "x")
			Expect(err).To(HaveOccurred(), "a different user must not read someone else's job")
		})
	})

	Describe("store adapter conversion", func() {
		// The SyncedMap value type is *schema.QuantizationJob (the exact REST shape).
		// These specs prove the DB adapter round-trips it losslessly, so hydrate and
		// write-through in distributed mode keep responses unchanged.
		It("round-trips a job through jobToRecord/recordToJob preserving the API shape", func() {
			original := &schema.QuantizationJob{
				ID:               "rt-1",
				UserID:           "user-1",
				Model:            "base-model",
				Backend:          "llama-cpp-quantization",
				ModelID:          "llama-cpp-quantization-quantize-rt-1",
				QuantizationType: "q4_k_m",
				Status:           "completed",
				Message:          "done",
				OutputDir:        "/data/quantization/rt-1",
				OutputFile:       "/data/quantization/rt-1/model.gguf",
				ExtraOptions:     map[string]string{"hf_token": "secret"},
				CreatedAt:        "2026-06-27T10:00:00Z",
				ImportStatus:     "completed",
				ImportMessage:    "",
				ImportModelName:  "base-model-q4_k_m-rt-1",
				Config:           &schema.QuantizationJobRequest{Model: "base-model", Backend: "llama-cpp-quantization", QuantizationType: "q4_k_m"},
			}

			rec := jobToRecord(original)
			Expect(rec.ID).To(Equal("rt-1"))
			Expect(rec.ConfigJSON).ToNot(BeEmpty(), "structured config must serialize into the JSON column")
			Expect(rec.ExtraOptsJSON).ToNot(BeEmpty())

			back := recordToJob(rec)
			Expect(back.ID).To(Equal(original.ID))
			Expect(back.UserID).To(Equal(original.UserID))
			Expect(back.Model).To(Equal(original.Model))
			Expect(back.Backend).To(Equal(original.Backend))
			Expect(back.ModelID).To(Equal(original.ModelID))
			Expect(back.QuantizationType).To(Equal(original.QuantizationType))
			Expect(back.Status).To(Equal(original.Status))
			Expect(back.Message).To(Equal(original.Message))
			Expect(back.OutputDir).To(Equal(original.OutputDir))
			Expect(back.OutputFile).To(Equal(original.OutputFile))
			Expect(back.ImportStatus).To(Equal(original.ImportStatus))
			Expect(back.ImportModelName).To(Equal(original.ImportModelName))
			Expect(back.CreatedAt).To(Equal(original.CreatedAt))
			Expect(back.ExtraOptions).To(Equal(original.ExtraOptions))
			Expect(back.Config).ToNot(BeNil())
			Expect(back.Config.QuantizationType).To(Equal("q4_k_m"))
		})
	})

	Describe("compile-time adapter contract", func() {
		It("satisfies syncstate.Store for *distributed.QuantStore", func() {
			// Guards against drift between the adapter and the component interface;
			// the var assertion in syncstore.go covers it at build time, this keeps
			// the type referenced from a spec too.
			var _ *distributed.QuantStore
			Expect(&quantStoreAdapter{}).ToNot(BeNil())
		})
	})
})
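The ListJobs specs above pin down three behaviours: filtering by user, returning every user's jobs when the filter is empty, and newest-first ordering. A minimal sketch of a listing function with those properties follows. It is not the QuantizationService code: `job` and `listJobs` are hypothetical names, and the sketch assumes CreatedAt is always a fixed-width RFC 3339 UTC string (as in the test fixtures), so that string comparison matches chronological order.

```go
package main

import (
	"fmt"
	"sort"
)

// job is a hypothetical, trimmed-down stand-in for schema.QuantizationJob.
type job struct {
	ID        string
	UserID    string
	CreatedAt string // assumed fixed-width RFC 3339 in UTC, e.g. 2026-06-27T10:00:00Z
}

// listJobs returns the jobs owned by userID, newest first.
// An empty userID disables the ownership filter.
func listJobs(all []job, userID string) []job {
	out := make([]job, 0, len(all))
	for _, j := range all {
		if userID == "" || j.UserID == userID {
			out = append(out, j)
		}
	}
	// Descending string order equals newest-first under the format assumption above.
	sort.Slice(out, func(i, k int) bool { return out[i].CreatedAt > out[k].CreatedAt })
	return out
}

func main() {
	all := []job{
		{ID: "old", UserID: "u1", CreatedAt: "2026-06-25T10:00:00Z"},
		{ID: "new", UserID: "u1", CreatedAt: "2026-06-27T10:00:00Z"},
		{ID: "other", UserID: "u2", CreatedAt: "2026-06-26T10:00:00Z"},
	}
	for _, j := range listJobs(all, "u1") {
		fmt.Println(j.ID) // prints "new" then "old"
	}
	fmt.Println(len(listJobs(all, ""))) // 3
}
```

If timestamps could carry different zone offsets or precisions, parsing them with time.Parse and comparing time.Time values would be the safer choice.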