LocalAI/core/http/middleware/usage_test.go
LocalAI [bot] f15b9178ec feat(usage): track and visualise usage per API key (#9920)
* feat(usage): add Source, APIKeyID, APIKeyName columns to UsageRecord

Adds three additive columns plus UsageSource* constants. The columns
are auto-migrated by InitDB. APIKeyID is a nullable foreign reference
to UserAPIKey.ID; APIKeyName is snapshotted on each row so revoked
keys keep showing their name in history.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(usage): backfill Source on pre-feature usage rows

InitDB now classifies any pre-existing usage_record with an empty
source: 'legacy-api-key' user -> legacy, everything else -> web.
The backfill is idempotent (only touches NULL/empty rows).

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(usage): add GetUserUsageBySource aggregator

Groups by (bucket, source, api_key_id, api_key_name). Filters out
legacy by default. Returns both per-bucket detail and roll-ups
(by_source, by_key sorted desc and capped at 200, grand_total).

The MAX(created_at) projection is iterated via Rows().Scan into a
string column and parsed manually because the SQLite driver surfaces
the aggregated timestamp as a string, which database/sql refuses to
scan directly into time.Time. Postgres returns a real timestamp; the
same string path handles its RFC3339 form too.
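
A minimal sketch of the string-path parsing described above, not the actual parseLastUsedString. The function name parseLastUsed and the layout list are assumptions chosen for illustration; the real parser may accept a different set of formats.

```go
package main

import (
	"fmt"
	"time"
)

// lastUsedLayouts lists candidate layouts, tried in order.
// The set is an assumption for illustration only.
var lastUsedLayouts = []string{
	time.RFC3339Nano,                      // T-separated, with zone
	"2006-01-02 15:04:05.999999999-07:00", // space-separated, with offset
	"2006-01-02 15:04:05",                 // space-separated, no zone (parsed as UTC)
}

// parseLastUsed returns the parsed time and true for the first layout
// that matches, or the zero time and false when none does.
func parseLastUsed(s string) (time.Time, bool) {
	for _, layout := range lastUsedLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, true
		}
	}
	return time.Time{}, false
}

func main() {
	inputs := []string{
		"2026-05-21T14:34:02Z",
		"2026-05-21 16:34:02.123+02:00",
		"garbage",
	}
	for _, s := range inputs {
		t, ok := parseLastUsed(s)
		fmt.Println(s, ok, t.UTC().Format(time.RFC3339))
	}
}
```

Returning a boolean rather than an error keeps the caller free to log the miss and carry on with a zero LastUsed.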

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(usage): log Rows() errors and assert LastUsed in tests

Adds rows.Err() and Rows() open-failure logging in
computeSourceTotals so silent data drops surface in logs. Logs on
parseLastUsedString format misses for the same reason. Strengthens
the snapshot-survival test to assert LastUsed is a recent timestamp,
locking the SQLite time-string parser behaviour.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(usage): add admin GetAllUsageBySource with filters and truncation

Optional user_id and api_key_id filters (composed with AND). Legacy
bucket is included for admin callers. truncated=true when more than
200 distinct keys would be in the by_key roll-up.
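
The cap-and-flag behaviour can be sketched as follows. The keyTotal type and capByKey function are hypothetical names for illustration; the real aggregator may apply the cap in SQL rather than in Go.

```go
package main

import (
	"fmt"
	"sort"
)

// keyTotal is a hypothetical per-key roll-up row.
type keyTotal struct {
	APIKeyName  string
	TotalTokens int64
}

// capByKey sorts a copy of totals by TotalTokens descending and keeps at
// most limit entries. The second result reports whether rows were cut.
func capByKey(totals []keyTotal, limit int) ([]keyTotal, bool) {
	sorted := append([]keyTotal(nil), totals...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].TotalTokens > sorted[j].TotalTokens
	})
	if len(sorted) > limit {
		return sorted[:limit], true
	}
	return sorted, false
}

func main() {
	totals := []keyTotal{{"ci-runner", 15}, {"laptop", 40}, {"cron", 7}}
	byKey, truncated := capByKey(totals, 2)
	fmt.Println(byKey, truncated)
}
```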

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(auth): plumb auth_source and auth_apikey through Echo context

tryAuthenticate now sets auth_source on every successful branch
(web for session/Bearer-session, apikey for Bearer-key/x-api-key/
token-cookie, legacy for legacy env key match). For named-key
branches it also stores the resolved *UserAPIKey under auth_apikey
so downstream middlewares can snapshot id+name without re-validating.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(auth): expand tryAuthenticate godoc and cover Bearer-session branch

Documents all three context-key side effects (auth_source,
auth_apikey, _auth_session) plus the split of responsibilities with
the parent Middleware. Adds a test for the Bearer-as-session-token
classification so future regressions there fail loudly.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(usage): UsageMiddleware records source + snapshots key name

Reads auth_source and auth_apikey from the Echo context (set by
auth.Middleware in the previous task). Snapshots UserAPIKey.ID and
Name onto each row so revoked keys remain readable in history.
Falls back to source=web when no auth_source is set (auth disabled
or unrecognised path).
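
A rough sketch of the resolution logic described above, kept free of the Echo dependency. The get parameter stands in for echo.Context.Get; the apiKey and usageFields types, the resolveUsageFields name, and the constant value "web" are assumptions for illustration.

```go
package main

import "fmt"

// sourceWeb mirrors the "web" source class named in the commit message;
// the exact constant value in the auth package is an assumption.
const sourceWeb = "web"

// apiKey is a stand-in for the resolved key stored under auth_apikey.
type apiKey struct{ ID, Name string }

// usageFields holds the per-row values derived from the request context.
type usageFields struct {
	Source     string
	APIKeyID   *string
	APIKeyName string
}

// resolveUsageFields reads auth_source and auth_apikey through get and
// falls back to sourceWeb when auth_source is missing or empty.
func resolveUsageFields(get func(key string) any) usageFields {
	out := usageFields{Source: sourceWeb}
	if s, ok := get("auth_source").(string); ok && s != "" {
		out.Source = s
	}
	if k, ok := get("auth_apikey").(*apiKey); ok && k != nil {
		id := k.ID // copy so the row does not alias the key struct
		out.APIKeyID = &id
		out.APIKeyName = k.Name
	}
	return out
}

func main() {
	ctx := map[string]any{
		"auth_source": "apikey",
		"auth_apikey": &apiKey{ID: "key-1", Name: "ci-runner"},
	}
	f := resolveUsageFields(func(k string) any { return ctx[k] })
	fmt.Println(f.Source, *f.APIKeyID, f.APIKeyName)
}
```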

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(usage): add /api/auth/usage/sources and admin variant

Self endpoint filters legacy server-side; admin endpoint includes
legacy and accepts user_id + api_key_id filters. Response includes
buckets, totals.{by_source, by_key, grand_total}, and a truncated
flag set when the per-key roll-up was capped at 200.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs(routes): mark test mirror handlers as keep-in-sync with production

The newTestAuthApp helper duplicates production route handlers
inline because it cannot use RegisterAuthRoutes (which requires a
*application.Application). Naming the source path on each mirror
makes the drift contract explicit for future maintainers.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): add usageApi.getMySources/getAdminSources + i18n strings

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): add Sources tab skeleton with data fetch

Adds Usage page tab that fetches /api/auth/usage/sources (or the
admin variant). Renders raw totals plus a placeholder key list;
real visualisations land in subsequent commits. Restructures the
existing tab button block so Models and Sources are visible to
non-admins (Users remains admin-only).

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): source mix ribbon + searchable/sortable sources table

Replaces the SourcesTab placeholder rendering with two reusable
components: SourceMixRibbon (one segmented bar per source class)
and SourcesTable (search + sort + revoked-key dim). Pulls the
current API key list to detect revoked keys.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(ui): skip revoked-key detection until the key list is known

existingKeyIds defaulted to an empty Set, which made every live
api_key row render as (revoked) during the brief window before
apiKeysApi.list() resolved, and permanently after a fetch failure.
Use null as the unknown state and suppress the revoked badge until
the parent provides a real Set.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): top-N stacked time chart and drill-in chip for Sources tab

Top 7 sources by total tokens get distinct colours; the rest roll up
into 'Other'. Clicking a row in the SourcesTable dims everything
except that series in the chart; the chip is the canonical clear.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs(usage): document per-API-key Sources tab and endpoints

Extends features/authentication.md Usage Tracking section with:
- A 'Sources' tab description and source-class taxonomy
- Endpoint documentation for /api/auth/usage/sources and the
  admin variant
- Response shape example with by_source / by_key / grand_total
- Migration note about pre-feature row backfill

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(usage): silence errcheck on deferred rows.Close

CI errcheck flagged the bare 'defer rows.Close()' in
computeSourceTotals. Wrap in a closure that discards the close
error explicitly; an error here is non-actionable since we have
already drained the rows and logged any iteration failure.
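
The pattern looks roughly like this. fakeRows and drain are hypothetical stand-ins so the sketch runs without a database; with database/sql the same closure would wrap (*sql.Rows).Close.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// fakeRows is a stand-in for a result set whose Close returns an error.
type fakeRows struct{ closed bool }

func (r *fakeRows) Close() error {
	r.closed = true
	return errors.New("close failed")
}

// drain closes rows when it returns. The closure discards the Close
// error explicitly, which satisfies errcheck without changing behaviour.
func drain(rows io.Closer) {
	defer func() { _ = rows.Close() }()
	// Row iteration and rows.Err() handling would happen here.
}

func main() {
	r := &fakeRows{}
	drain(r)
	fmt.Println("closed:", r.closed)
}
```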

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor(usage): bound batcher intake and add Shutdown/FlushNow hooks

The pre-existing usage batcher had no cap on its add() path; the
usageMaxPending=5000 constant only guarded the re-queue path after
a failed write, leaving memory growth unbounded if the DB fell
behind. This commit:

- Adds the cap to add() so saturation drops new records (rate-limited
  warn at 1/1024) instead of growing unbounded.
- Raises usageMaxPending to 50000 to absorb realistic inference bursts.
- Replaces the package-level batcher global with a mutex-guarded pair
  plus a currentBatcher() accessor so Init / Shutdown cycles are
  race-free.
- Adds ShutdownUsageRecorder() for graceful drain on process exit
  (not yet wired into app shutdown, just published).
- Adds FlushNow() for deterministic tests; the middleware suite no
  longer needs 6s sleeps per spec and now runs in ~50ms instead of 18s.
- Re-queue on failed flush is now cap-aware: prepends as much of the
  failed batch as fits alongside concurrent arrivals, instead of
  dropping the whole batch when full.
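
A simplified sketch of the bounded intake and cap-aware re-queue listed above. All names here (record, batcher, add, take, requeue, limit) are illustrative assumptions, and the choice to keep the oldest failed records when trimming is also an assumption; the real batcher may differ.

```go
package main

import (
	"fmt"
	"sync"
)

type record struct{ ID int }

// batcher is a hypothetical bounded queue of pending records.
type batcher struct {
	mu      sync.Mutex
	pending []record
	limit   int
	dropped uint64
}

// add appends r unless the queue is at capacity, in which case r is
// dropped and counted. A real implementation might warn on a sampled
// fraction of drops to avoid flooding the log.
func (b *batcher) add(r record) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.pending) >= b.limit {
		b.dropped++
		return false
	}
	b.pending = append(b.pending, r)
	return true
}

// take removes and returns everything currently pending.
func (b *batcher) take() []record {
	b.mu.Lock()
	defer b.mu.Unlock()
	batch := b.pending
	b.pending = nil
	return batch
}

// requeue puts back as much of a failed batch as fits, ahead of records
// that arrived while the write was in flight. The remainder is dropped.
func (b *batcher) requeue(failed []record) {
	b.mu.Lock()
	defer b.mu.Unlock()
	room := b.limit - len(b.pending)
	if room <= 0 {
		b.dropped += uint64(len(failed))
		return
	}
	if len(failed) > room {
		b.dropped += uint64(len(failed) - room)
		failed = failed[:room]
	}
	merged := make([]record, 0, len(failed)+len(b.pending))
	merged = append(merged, failed...)
	b.pending = append(merged, b.pending...)
}

func main() {
	b := &batcher{limit: 3}
	for i := 1; i <= 4; i++ {
		b.add(record{ID: i}) // the fourth add is dropped
	}
	batch := b.take()
	b.add(record{ID: 10})
	b.requeue(batch) // only part of the failed batch fits back
	fmt.Println(b.pending, b.dropped)
}
```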

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(usage): drain usage batcher on graceful shutdown

Registers ShutdownUsageRecorder with the existing
signals.RegisterGracefulTerminationHandler so SIGINT/SIGTERM
synchronously flushes any in-memory usage records before the
process exits. Without this, up to one flush interval (5s) of
recorded usage was lost when LocalAI restarted.

Refs: #9862
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-21 16:34:02 +02:00


//go:build auth

package middleware_test

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"

	"github.com/labstack/echo/v4"
	"github.com/mudler/LocalAI/core/http/auth"
	"github.com/mudler/LocalAI/core/http/middleware"
	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
	"gorm.io/gorm"
)

// testAuthDB returns a fresh in-memory SQLite auth DB.
func testAuthDB() *gorm.DB {
	db, err := auth.InitDB(":memory:")
	if err != nil {
		panic(err)
	}
	return db
}

var _ = Describe("UsageMiddleware", func() {
	var (
		e  *echo.Echo
		db *gorm.DB
	)

	BeforeEach(func() {
		db = testAuthDB()
		e = echo.New()
		middleware.InitUsageRecorder(db)
	})

	AfterEach(func() {
		middleware.ShutdownUsageRecorder()
	})

	okHandler := func(c echo.Context) error {
		body, _ := json.Marshal(map[string]any{
			"model": "gpt-4",
			"usage": map[string]int{
				"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15,
			},
		})
		c.Response().Header().Set("Content-Type", "application/json")
		c.Response().WriteHeader(http.StatusOK)
		_, _ = c.Response().Write(body)
		return nil
	}

	// FlushNow drains pending records synchronously, replacing the 6s sleep
	// that was previously needed to wait for the batcher's ticker.
	flush := middleware.FlushNow

	It("records source=web when auth_source is web", func() {
		e.POST("/v1/chat/completions", okHandler, func(next echo.HandlerFunc) echo.HandlerFunc {
			return func(c echo.Context) error {
				c.Set("auth_user", &auth.User{ID: "alice", Name: "Alice"})
				c.Set("auth_source", auth.UsageSourceWeb)
				return next(c)
			}
		}, middleware.UsageMiddleware(db))

		req := httptest.NewRequest("POST", "/v1/chat/completions", bytes.NewReader([]byte(`{}`)))
		e.ServeHTTP(httptest.NewRecorder(), req)
		flush()

		var rec auth.UsageRecord
		Expect(db.Where("user_id = ?", "alice").First(&rec).Error).To(Succeed())
		Expect(rec.Source).To(Equal(auth.UsageSourceWeb))
		Expect(rec.APIKeyID).To(BeNil())
		Expect(rec.APIKeyName).To(BeEmpty())
	})

	It("records source=apikey with snapshotted name when auth_apikey is set", func() {
		e.POST("/v1/chat/completions", okHandler, func(next echo.HandlerFunc) echo.HandlerFunc {
			return func(c echo.Context) error {
				c.Set("auth_user", &auth.User{ID: "alice", Name: "Alice"})
				c.Set("auth_source", auth.UsageSourceAPIKey)
				c.Set("auth_apikey", &auth.UserAPIKey{ID: "key-1", Name: "ci-runner"})
				return next(c)
			}
		}, middleware.UsageMiddleware(db))

		req := httptest.NewRequest("POST", "/v1/chat/completions", bytes.NewReader([]byte(`{}`)))
		e.ServeHTTP(httptest.NewRecorder(), req)
		flush()

		var rec auth.UsageRecord
		Expect(db.Where("user_id = ?", "alice").First(&rec).Error).To(Succeed())
		Expect(rec.Source).To(Equal(auth.UsageSourceAPIKey))
		Expect(rec.APIKeyID).ToNot(BeNil())
		Expect(*rec.APIKeyID).To(Equal("key-1"))
		Expect(rec.APIKeyName).To(Equal("ci-runner"))
	})

	It("FlushNow drains pending records synchronously", func() {
		e.POST("/v1/chat/completions", okHandler, func(next echo.HandlerFunc) echo.HandlerFunc {
			return func(c echo.Context) error {
				c.Set("auth_user", &auth.User{ID: "carol", Name: "Carol"})
				c.Set("auth_source", auth.UsageSourceWeb)
				return next(c)
			}
		}, middleware.UsageMiddleware(db))

		req := httptest.NewRequest("POST", "/v1/chat/completions", bytes.NewReader([]byte(`{}`)))
		e.ServeHTTP(httptest.NewRecorder(), req)

		// No sleep: FlushNow should drain immediately.
		middleware.FlushNow()

		var rec auth.UsageRecord
		Expect(db.Where("user_id = ?", "carol").First(&rec).Error).To(Succeed())
		Expect(rec.Source).To(Equal(auth.UsageSourceWeb))
	})

	It("falls back to source=web when auth_source is empty", func() {
		e.POST("/v1/chat/completions", okHandler, func(next echo.HandlerFunc) echo.HandlerFunc {
			return func(c echo.Context) error {
				c.Set("auth_user", &auth.User{ID: "alice", Name: "Alice"})
				// no auth_source set
				return next(c)
			}
		}, middleware.UsageMiddleware(db))

		req := httptest.NewRequest("POST", "/v1/chat/completions", bytes.NewReader([]byte(`{}`)))
		e.ServeHTTP(httptest.NewRecorder(), req)
		flush()

		var rec auth.UsageRecord
		Expect(db.Where("user_id = ?", "alice").First(&rec).Error).To(Succeed())
		Expect(rec.Source).To(Equal(auth.UsageSourceWeb))
	})
})