mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-17 04:56:52 -04:00
* feat(react-ui): redesign chat — popover history, focus on send, density pass Replace the persistent 260px conversation sidebar with a Cmd/Ctrl+K popover (ChatsMenu) so the conversation owns the page. Once a chat has at least one message we auto-collapse the global app rail and fade non-essential header chrome; Esc gives the user back the full chrome for the rest of the session. Move Canvas mode and the MCP dropdown into the input wrapper as mode chips — they describe what's armed for the next message and now live where the user composes. The chat header drops to Chats · title · ModelSelector · overflow · settings, and an overflow menu carries admin-only Manage mode along with Info / Edit / Export / Clear. Density pass: tighter header (40px), smaller avatars with the assistant left-border accent doing the work, 88% bubble width, modern field-sizing on the textarea, 32px send/stop buttons. Empty state now surfaces a Recent strip (top 4 non-empty chats) and a Cmd+K hint, replacing the discoverability the persistent sidebar used to provide. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 * feat(react-ui): chat input chips, slimmer menu, focus mode polish Move Canvas mode and the MCP dropdown into the input wrapper as compact mode chips — they describe what's armed for the next message and now sit where the user composes. The MCP popover flips upward when anchored to the input row so it stays on-screen. Eliminate the chat header overflow ("…") menu entirely; relocate each item to its semantic home so users don't have to remember a miscellany drawer: - Manage mode toggle → top of the Settings drawer, alongside the other sticky chat knobs. The shield next to the title still signals state at a glance. - Model info / Edit config → small admin-only "ⓘ" button next to the ModelSelector; the existing model-info panel now hosts the Edit config link. - Export as Markdown → per-row hover action in ChatsMenu, so it works for any chat (not just the active one). - Clear chat history → destructive button at the bottom of the Settings drawer. Make the Sidebar listen to its own `sidebar-collapse` event so the chat's focus mode actually shrinks the rail (it previously only flipped the layout class, leaving the sidebar element at full width and overlapping the chat). Drop the focus-mode toast — the visual shift is enough; the toast was noise. Define `--color-text-tertiary` in both themes; without it metadata text (recent strip timestamps and a few other sites) was inheriting the platform default, which read as black on the dark surface. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 * fix(model/log-store): close merged channel exactly once; clean up Remove Two latent races in BackendLogStore.Subscribe could panic under load (distributed e2e test triggered "send on closed channel" at backend_log_store.go:288): 1. The aggregated path closed the merged channel `ch` from two places — the fan-in waiter goroutine (after all source channels drained) and unsubscribe(). When unsubscribe ran while a fan-in goroutine was mid-flight on `ch <- line`, the close beat the send and the runtime panicked. Now `ch` is closed by exactly one goroutine: the waiter that observes all fan-in goroutines finish. unsubscribe() only closes the per-buffer source channels — the for-range in each fan-in goroutine then exits naturally and the waiter takes care of the merged close. 2. Remove() closed every subscriber channel but didn't delete the entries from the subscribers map, so a concurrent unsubscribe() would call close() again on the already-closed channel ("close of closed channel"). Clear the map entry while closing. Add a regression test that hammers AppendLine concurrently with Subscribe + unsubscribe + Remove; the race detector catches both classes of regression. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 * test(model/log-store): port backend log store tests to ginkgo Bring backend_log_store_test.go in line with the rest of pkg/model (loader_test, watchdog_test, store_test): same external test package (`model_test`), same ginkgo + gomega imports, same Describe/It nesting around the public API. Behaviour is unchanged — the four existing scenarios plus the unsubscribe race regression all run as specs under the existing `TestModel` suite. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
313 lines
8.7 KiB
Go
313 lines
8.7 KiB
Go
package model
|
|
|
|
import (
|
|
"sort"
|
|
"strings"
|
|
"sync"
|
|
"time"
|
|
|
|
"github.com/emirpasic/gods/v2/queues/circularbuffer"
|
|
)
|
|
|
|
// replicaSeparator separates a model ID from the replica index in the
|
|
// supervisor's process key (e.g. "qwen3-0.6b#0"). Mirrored from the
|
|
// worker's buildProcessKey — duplicated as a constant here to keep this
|
|
// package free of CLI imports.
|
|
const replicaSeparator = "#"
|
|
|
|
// BackendLogLine represents a single line of output from a backend process.
|
|
type BackendLogLine struct {
|
|
Timestamp time.Time `json:"timestamp"`
|
|
Stream string `json:"stream"` // "stdout" or "stderr"
|
|
Text string `json:"text"`
|
|
}
|
|
|
|
// backendLogBuffer wraps a circular buffer for a single model's logs
|
|
// and tracks subscribers for real-time streaming.
|
|
type backendLogBuffer struct {
|
|
mu sync.Mutex
|
|
queue *circularbuffer.Queue[BackendLogLine]
|
|
subscribers map[int]chan BackendLogLine
|
|
nextSubID int
|
|
}
|
|
|
|
// BackendLogStore stores per-model backend process output in circular buffers
|
|
// and supports real-time subscriptions for WebSocket streaming.
|
|
type BackendLogStore struct {
|
|
mu sync.RWMutex // protects the buffers map only
|
|
buffers map[string]*backendLogBuffer
|
|
maxLines int
|
|
}
|
|
|
|
// NewBackendLogStore creates a new BackendLogStore with a maximum number of
|
|
// lines retained per model.
|
|
func NewBackendLogStore(maxLinesPerModel int) *BackendLogStore {
|
|
if maxLinesPerModel <= 0 {
|
|
maxLinesPerModel = 1000
|
|
}
|
|
return &BackendLogStore{
|
|
buffers: make(map[string]*backendLogBuffer),
|
|
maxLines: maxLinesPerModel,
|
|
}
|
|
}
|
|
|
|
// getOrCreateBuffer returns the buffer for modelID, creating it if needed.
|
|
func (s *BackendLogStore) getOrCreateBuffer(modelID string) *backendLogBuffer {
|
|
s.mu.RLock()
|
|
buf, ok := s.buffers[modelID]
|
|
s.mu.RUnlock()
|
|
if ok {
|
|
return buf
|
|
}
|
|
|
|
s.mu.Lock()
|
|
buf, ok = s.buffers[modelID]
|
|
if !ok {
|
|
buf = &backendLogBuffer{
|
|
queue: circularbuffer.New[BackendLogLine](s.maxLines),
|
|
subscribers: make(map[int]chan BackendLogLine),
|
|
}
|
|
s.buffers[modelID] = buf
|
|
}
|
|
s.mu.Unlock()
|
|
return buf
|
|
}
|
|
|
|
// AppendLine adds a log line for the given model. The buffer is lazily created.
|
|
// All active subscribers for this model are notified (non-blocking).
|
|
func (s *BackendLogStore) AppendLine(modelID, stream, text string) {
|
|
line := BackendLogLine{
|
|
Timestamp: time.Now(),
|
|
Stream: stream,
|
|
Text: text,
|
|
}
|
|
|
|
buf := s.getOrCreateBuffer(modelID)
|
|
buf.mu.Lock()
|
|
buf.queue.Enqueue(line)
|
|
for _, ch := range buf.subscribers {
|
|
select {
|
|
case ch <- line:
|
|
default:
|
|
}
|
|
}
|
|
buf.mu.Unlock()
|
|
}
|
|
|
|
// GetLines returns a copy of all log lines for a model, or an empty slice.
|
|
//
|
|
// When modelID contains no replica suffix (no `#`), it's treated as a model
|
|
// prefix and the lines from all `modelID#N` replicas are merged in
|
|
// timestamp order. This keeps the existing per-model logs UI working in
|
|
// distributed mode after the worker started using `modelID#replicaIndex`
|
|
// as its process key (multi-replica refactor) — the UI asks for "qwen3-0.6b"
|
|
// and gets the union of all replicas' logs.
|
|
//
|
|
// When modelID contains a `#` (e.g. "qwen3-0.6b#0"), it's treated as an
|
|
// exact process key for per-replica filtering by callers that need it.
|
|
func (s *BackendLogStore) GetLines(modelID string) []BackendLogLine {
|
|
s.mu.RLock()
|
|
exactBuf, exactOK := s.buffers[modelID]
|
|
s.mu.RUnlock()
|
|
|
|
// Exact match — single key. Caller knew the full process key.
|
|
if exactOK {
|
|
exactBuf.mu.Lock()
|
|
lines := exactBuf.queue.Values()
|
|
exactBuf.mu.Unlock()
|
|
return lines
|
|
}
|
|
|
|
// No exact match: aggregate any replicas if modelID looks like a model prefix.
|
|
if strings.Contains(modelID, replicaSeparator) {
|
|
return []BackendLogLine{}
|
|
}
|
|
|
|
prefix := modelID + replicaSeparator
|
|
var matching []*backendLogBuffer
|
|
s.mu.RLock()
|
|
for k, b := range s.buffers {
|
|
if strings.HasPrefix(k, prefix) {
|
|
matching = append(matching, b)
|
|
}
|
|
}
|
|
s.mu.RUnlock()
|
|
|
|
if len(matching) == 0 {
|
|
return []BackendLogLine{}
|
|
}
|
|
|
|
// Merge the per-replica buffers and sort by timestamp so the operator
|
|
// sees a single coherent timeline rather than per-replica blocks.
|
|
var merged []BackendLogLine
|
|
for _, b := range matching {
|
|
b.mu.Lock()
|
|
merged = append(merged, b.queue.Values()...)
|
|
b.mu.Unlock()
|
|
}
|
|
sort.SliceStable(merged, func(i, j int) bool { return merged[i].Timestamp.Before(merged[j].Timestamp) })
|
|
return merged
|
|
}
|
|
|
|
// ListModels returns a sorted list of model IDs that have log buffers.
|
|
// Replica suffixes (`#N`) are stripped and the result is deduplicated, so
|
|
// callers see one entry per loaded model regardless of replica count.
|
|
func (s *BackendLogStore) ListModels() []string {
|
|
s.mu.RLock()
|
|
seen := make(map[string]struct{}, len(s.buffers))
|
|
for id := range s.buffers {
|
|
base := id
|
|
if i := strings.Index(id, replicaSeparator); i >= 0 {
|
|
base = id[:i]
|
|
}
|
|
seen[base] = struct{}{}
|
|
}
|
|
s.mu.RUnlock()
|
|
|
|
models := make([]string, 0, len(seen))
|
|
for id := range seen {
|
|
models = append(models, id)
|
|
}
|
|
sort.Strings(models)
|
|
return models
|
|
}
|
|
|
|
// Clear removes all log lines for a model but keeps the buffer entry.
|
|
func (s *BackendLogStore) Clear(modelID string) {
|
|
s.mu.RLock()
|
|
buf, ok := s.buffers[modelID]
|
|
s.mu.RUnlock()
|
|
if !ok {
|
|
return
|
|
}
|
|
buf.mu.Lock()
|
|
buf.queue.Clear()
|
|
buf.mu.Unlock()
|
|
}
|
|
|
|
// Remove deletes the buffer entry for a model entirely.
|
|
func (s *BackendLogStore) Remove(modelID string) {
|
|
s.mu.Lock()
|
|
if buf, ok := s.buffers[modelID]; ok {
|
|
buf.mu.Lock()
|
|
for id, ch := range buf.subscribers {
|
|
close(ch)
|
|
delete(buf.subscribers, id)
|
|
}
|
|
buf.mu.Unlock()
|
|
delete(s.buffers, modelID)
|
|
}
|
|
s.mu.Unlock()
|
|
}
|
|
|
|
// Subscribe returns a channel that receives new log lines for the given model
|
|
// in real-time, plus an unsubscribe function. The channel has a buffer of 100
|
|
// lines to absorb short bursts without blocking the writer.
|
|
//
|
|
// Like GetLines, a modelID without a `#` separator subscribes to every
|
|
// matching `modelID#N` replica buffer that exists at subscribe time, so the
|
|
// stream merges all replicas. Subscribers are NOT auto-attached to replicas
|
|
// that come up later — callers needing dynamic membership should resubscribe.
|
|
func (s *BackendLogStore) Subscribe(modelID string) (chan BackendLogLine, func()) {
|
|
ch := make(chan BackendLogLine, 100)
|
|
|
|
// Per-replica caller (full process key) — exact subscription.
|
|
if strings.Contains(modelID, replicaSeparator) {
|
|
buf := s.getOrCreateBuffer(modelID)
|
|
buf.mu.Lock()
|
|
id := buf.nextSubID
|
|
buf.nextSubID++
|
|
buf.subscribers[id] = ch
|
|
buf.mu.Unlock()
|
|
unsubscribe := func() {
|
|
buf.mu.Lock()
|
|
if _, exists := buf.subscribers[id]; exists {
|
|
delete(buf.subscribers, id)
|
|
close(ch)
|
|
}
|
|
buf.mu.Unlock()
|
|
}
|
|
return ch, unsubscribe
|
|
}
|
|
|
|
// Aggregated caller: subscribe to the bare-modelID buffer (for back-compat
|
|
// with single-replica writers that still write to the un-suffixed key) AND
|
|
// to every existing `modelID#N` replica buffer. Each per-buffer subscription
|
|
// receives lines into its own channel; we fan them in to `ch` here.
|
|
type subRef struct {
|
|
buf *backendLogBuffer
|
|
id int
|
|
ch chan BackendLogLine
|
|
}
|
|
var refs []subRef
|
|
|
|
subscribe := func(buf *backendLogBuffer) {
|
|
bufCh := make(chan BackendLogLine, 100)
|
|
buf.mu.Lock()
|
|
id := buf.nextSubID
|
|
buf.nextSubID++
|
|
buf.subscribers[id] = bufCh
|
|
buf.mu.Unlock()
|
|
refs = append(refs, subRef{buf: buf, id: id, ch: bufCh})
|
|
}
|
|
|
|
if buf, ok := func() (*backendLogBuffer, bool) {
|
|
s.mu.RLock()
|
|
b, ok := s.buffers[modelID]
|
|
s.mu.RUnlock()
|
|
return b, ok
|
|
}(); ok {
|
|
subscribe(buf)
|
|
}
|
|
|
|
prefix := modelID + replicaSeparator
|
|
s.mu.RLock()
|
|
for k, b := range s.buffers {
|
|
if strings.HasPrefix(k, prefix) {
|
|
subscribe(b)
|
|
}
|
|
}
|
|
s.mu.RUnlock()
|
|
|
|
// Fan-in goroutine: forward every per-buffer channel into the merged
|
|
// channel until all source channels close, then close the merged channel.
|
|
if len(refs) == 0 {
|
|
// No source buffers yet: still return a channel so callers don't crash;
|
|
// it'll close on unsubscribe.
|
|
unsubscribe := func() { close(ch) }
|
|
return ch, unsubscribe
|
|
}
|
|
|
|
var fanWG sync.WaitGroup
|
|
for _, r := range refs {
|
|
fanWG.Add(1)
|
|
go func(c chan BackendLogLine) {
|
|
defer fanWG.Done()
|
|
for line := range c {
|
|
select {
|
|
case ch <- line:
|
|
default: // drop on slow consumer to match non-aggregated behavior
|
|
}
|
|
}
|
|
}(r.ch)
|
|
}
|
|
// `ch` is closed by exactly one goroutine — the one that observes all
|
|
// fan-in goroutines finish. unsubscribe() closes the per-buffer source
|
|
// channels which causes the fan-in loops to exit; the waiter then
|
|
// closes `ch`. Closing `ch` from anywhere else races with `ch <- line`.
|
|
go func() { fanWG.Wait(); close(ch) }()
|
|
|
|
unsubscribe := func() {
|
|
for _, r := range refs {
|
|
r.buf.mu.Lock()
|
|
if c, exists := r.buf.subscribers[r.id]; exists {
|
|
delete(r.buf.subscribers, r.id)
|
|
close(c) // closes the per-buffer source channel; fan-in goroutine exits
|
|
}
|
|
r.buf.mu.Unlock()
|
|
}
|
|
}
|
|
|
|
return ch, unsubscribe
|
|
}
|