Merge remote-tracking branch 'origin/master' into worktree-clusterrouting-phase2
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@@ -38,9 +38,12 @@ The React UI (`core/http/react-ui/`) has **no component/unit tests** — its onl
|
||||
- **Browser:** the flake dev shell ships `chromium` and exports `PLAYWRIGHT_CHROMIUM_PATH`; `playwright.config.js` uses it via `launchOptions.executablePath`, and the Makefile skips `playwright install` when it's set. This avoids Playwright's downloaded browser, which can't resolve system libs (`libglib-2.0`, …) on NixOS. In CI (no `PLAYWRIGHT_CHROMIUM_PATH`) the Makefile falls back to `playwright install --with-deps chromium`.
|
||||
- The app is a React SPA, so coverage accumulates across in-app navigation within a test; a full `page.goto`/reload resets it.
|
||||
- `.nycrc.json` uses `all: true`, so **every `src/**` file is in the report**, including 0%-coverage ones — that's how you spot features with no test at all (sort the HTML report or `coverage-summary.json` by line% ascending).
|
||||
- **UI coverage gate:** `make test-ui-coverage-check` runs the suite then `scripts/ui-coverage-check.sh`, failing if total line coverage drops more than `UI_COVERAGE_TOLERANCE` (default **1.0pp**) below `core/http/react-ui/coverage-baseline.txt`. `make test-ui-coverage-baseline` regenerates the baseline. **Why a tolerance (unlike the strict Go gate):** UI e2e line coverage is *non-deterministic* — async/debounced paths (e.g. the VRAM estimate's 500ms debounce) make identical specs vary ~0.5pp run-to-run, so a zero-tolerance gate would flake. Keep the tolerance just above the observed jitter. Run in CI (`tests-ui-e2e.yml`) and pre-commit on `core/http/react-ui/` changes.
|
||||
- **UI coverage gate:** `make test-ui-coverage-check` runs the suite then `scripts/ui-coverage-check.sh`, failing if total line coverage drops more than `UI_COVERAGE_TOLERANCE` below `core/http/react-ui/coverage-baseline.txt`. `make test-ui-coverage-baseline` regenerates the baseline. Runs in CI (`tests-ui-e2e.yml`) and pre-commit on `core/http/react-ui/` changes.
|
||||
- **Why it has a tolerance (unlike the strict Go gate):** UI e2e coverage is *non-deterministic*. Specs that assert on state and end while async/lazy render work is still in flight collect those lines only when the render beats the coverage teardown — so the total drifts with machine speed/load (a fast local box reads higher than a slow CI runner), diffusely across many specs. The tolerance absorbs that drift, so set the baseline *below* the slow-CI floor, never to a fast-local `make test-ui-coverage-baseline` number, or CI flaps.
|
||||
- **Raising coverage is cheap:** a *render-smoke* spec (navigate to a route, assert its header renders) mounts a lazy page and runs its full render + initial effects, capturing most of its lines in a few lines of test — see `e2e/page-render-smoke.spec.js`. Auth is disabled in the test server (`isAdmin=true`), so `RequireAdmin`/`RequireFeature` routes render without a mock. The most *deterministic* win is removing a race: make a spec `await` a rendered element before ending (see `e2e/agents.spec.js` → AgentCreate) so its lines count every run.
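The gate's comparison can be sketched as follows. This is a minimal illustration, not the contents of `scripts/ui-coverage-check.sh`: the function name `gateOK` and the sample numbers are assumptions made for the example. It shows the rule described above, where a run fails only when coverage drops more than the tolerance below the baseline.

```go
package main

import "fmt"

// gateOK reports whether the measured line coverage passes the gate.
// A run passes when it sits at most tolerancePP percentage points below
// the committed baseline (hypothetical helper, for illustration only).
func gateOK(current, baseline, tolerancePP float64) bool {
	return current >= baseline-tolerancePP
}

func main() {
	const baseline = 40.0 // illustrative baseline value
	const tolerance = 1.0 // illustrative UI_COVERAGE_TOLERANCE, in percentage points

	for _, current := range []float64{40.3, 39.5, 38.9} {
		fmt.Printf("current=%.1f ok=%v\n", current, gateOK(current, baseline, tolerance))
	}
}
```

With a tolerance of zero the same function behaves like the strict Go gate: any decrease fails.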
|
||||
|
||||
Rules:
|
||||
- The gate is **strict — there is no tolerance**. Any decrease fails, regardless of how many lines a PR adds or deletes. `covermode=atomic` makes line coverage deterministic, so there's no run-to-run jitter to excuse.
|
||||
- When a change legitimately **raises** coverage, run `make test-coverage-baseline` and **commit** the updated `coverage-baseline.txt` so the ratchet moves up. Never lower the baseline by hand.
|
||||
- If you can't get coverage back to baseline, the fix is to **add tests**, not to edit the baseline.
|
||||
Rules (both gates):
|
||||
- **Install the hooks:** `make install-hooks` once per clone so lint + coverage run pre-commit. Don't lean on CI for what the hook catches.
|
||||
- **Don't work around the gate:** never `git commit --no-verify`, and never hand-lower a baseline or widen a tolerance to turn a red gate green. The ratchet only moves up.
|
||||
- If a change drops coverage, **add tests** (sort `coverage-summary.json` by line% ascending to find untested code) rather than editing the baseline. When coverage legitimately rises, commit the regenerated baseline (`make test-coverage-baseline` / `test-ui-coverage-baseline`).
|
||||
- The Go gate is **strict — no tolerance**; `covermode=atomic` keeps it deterministic. The UI gate keeps a small tolerance only because its e2e coverage isn't.
|
||||
|
||||
@@ -35,6 +35,7 @@ LocalAI follows the Linux kernel project's [guidelines for AI coding assistants]
|
||||
|
||||
## Quick Reference
|
||||
|
||||
- **Git hooks & coverage gates**: Run `make install-hooks` once per clone so the pre-commit lint + coverage gates run. **Never bypass them with `git commit --no-verify`, and never lower a coverage baseline or widen a gate's tolerance to turn a red gate green** — the coverage ratchet only moves up. If a change drops coverage, add tests to raise it (e.g. render-smoke specs). See [.agents/building-and-testing.md](.agents/building-and-testing.md).
|
||||
- **Logging**: Use `github.com/mudler/xlog` (same API as slog)
|
||||
- **Go style**: Prefer `any` over `interface{}`
|
||||
- **Comments**: Explain *why*, not *what*
|
||||
|
||||
@@ -266,6 +266,12 @@ The e2e tests run LocalAI in a Docker container and exercise the API:
|
||||
make test-e2e
|
||||
```
|
||||
|
||||
### React UI tests and coverage
|
||||
|
||||
The React UI (`core/http/react-ui/`) is covered by Playwright e2e specs, gated by a **monotonic line-coverage ratchet** (`make test-ui-coverage-check`, run in CI and pre-commit). The metric is non-deterministic — a fast local box reads higher than a slow CI runner for the same code — so a small tolerance is unavoidable.
|
||||
|
||||
**If your change lowers UI coverage, raise it back by adding specs — do not widen the tolerance or hand-lower the baseline.** A *render-smoke* spec (navigate to a page, assert its header is visible) cheaply covers an entire lazy page. See `core/http/react-ui/e2e/page-render-smoke.spec.js` and the full policy in [.agents/building-and-testing.md](.agents/building-and-testing.md#react-ui-coverage).
|
||||
|
||||
### Running E2E container tests
|
||||
|
||||
These tests build a standard LocalAI Docker image and run it with pre-configured model configs to verify that most endpoints work correctly:
|
||||
|
||||
18
README.md
@@ -31,12 +31,18 @@
|
||||
|
||||
**LocalAI** is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
|
||||
|
||||
- **Drop-in API compatibility** — OpenAI, Anthropic, ElevenLabs APIs
|
||||
- **36+ backends** — llama.cpp, vLLM, transformers, whisper, diffusers, MLX...
|
||||
- **Any hardware** — NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only
|
||||
- **Multi-user ready** — API key auth, user quotas, role-based access
|
||||
- **Built-in AI agents** — autonomous agents with tool use, RAG, MCP, and skills
|
||||
- **Privacy-first** — your data never leaves your infrastructure
|
||||
**A small core, not a bundle.** Each backend wraps a best-in-class engine (llama.cpp, vLLM, whisper.cpp, stable-diffusion, MLX...) in its own image, pulled only when a model needs it. You install nothing you don't use.
|
||||
|
||||
- **Composable by design**: backends are separate and pulled on demand, so you install only what your model needs
|
||||
- **Open and extensible**: load any model, or build your own backend in any language against an open interface
|
||||
- **Drop-in API compatibility**: OpenAI, Anthropic, and ElevenLabs APIs across every backend
|
||||
- **Any model, any modality**: LLMs, vision, voice, image, and video behind one API
|
||||
- **Any hardware**: NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only
|
||||
- **Multi-user ready**: API key auth, user quotas, role-based access
|
||||
- **Built-in AI agents**: autonomous agents with tool use, RAG, MCP, and skills
|
||||
- **Privacy-first**: your data never leaves your infrastructure
|
||||
|
||||

|
||||
|
||||
Created by [Ettore Di Giacinto](https://github.com/mudler) and maintained by the [LocalAI team](#team).
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
|
||||
LLAMA_VERSION?=d6588daa800058dfa54f1d7ea695b1a810c8ae18
|
||||
LLAMA_VERSION?=5dcb71166686799f0d873eab7386234302d05ecf
|
||||
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
|
||||
|
||||
CMAKE_ARGS?=
|
||||
|
||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
||||
|
||||
# CrispASR version (release tag)
|
||||
CRISPASR_REPO?=https://github.com/CrispStrobe/CrispASR
|
||||
CRISPASR_VERSION?=v0.6.11
|
||||
CRISPASR_VERSION?=05e60432bcb5bc2113f8c395a41e86497c11504a
|
||||
SO_TARGET?=libgocrispasr.so
|
||||
|
||||
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# parakeet-cpp backend Makefile.
|
||||
#
|
||||
# Upstream pin lives below as PARAKEET_VERSION?=cb45f68068081af01e7092e91b038ee353eb56be
|
||||
# Upstream pin lives below as PARAKEET_VERSION?=9edf17c3ada66e0f881dcff155492867db7ac4cf
|
||||
# (.github/bump_deps.sh) can find and update it - matches the
|
||||
# whisper.cpp / ds4 / vibevoice-cpp convention.
|
||||
#
|
||||
@@ -15,7 +15,7 @@
|
||||
# That's what the L0 smoke test uses. The default target below does the
|
||||
# proper clone-at-pin + cmake build so CI doesn't need a side-checkout.
|
||||
|
||||
PARAKEET_VERSION?=cb45f68068081af01e7092e91b038ee353eb56be
|
||||
PARAKEET_VERSION?=9edf17c3ada66e0f881dcff155492867db7ac4cf
|
||||
PARAKEET_REPO?=https://github.com/mudler/parakeet.cpp
|
||||
|
||||
GOCMD?=go
|
||||
|
||||
79
backend/go/parakeet-cpp/batcher.go
Normal file
@@ -0,0 +1,79 @@
package main

import "time"

// batchRequest is one in-flight unary transcription waiting to be batched.
// In production pcm/decoder are set; tag is an opaque marker used by tests.
type batchRequest struct {
	pcm     []float32
	decoder int32
	tag     string
	reply   chan batchReply
}

// batchReply carries one per-item JSON object string (an element of the C-API's
// JSON array) or an error back to the waiting handler goroutine.
type batchReply struct {
	json string
	err  error
}

// batcher coalesces concurrent batchRequests into batched runBatch calls. A
// single run() goroutine is the sole caller of runBatch, so runBatch (which in
// production calls the thread-unsafe C engine) is never entered concurrently.
type batcher struct {
	submit   chan *batchRequest
	maxSize  int
	maxWait  time.Duration
	runBatch func(reqs []*batchRequest) // must deliver a reply to every req
}

func newBatcher(maxSize int, maxWait time.Duration, runBatch func([]*batchRequest)) *batcher {
	if maxSize < 1 {
		maxSize = 1
	}
	return &batcher{
		submit:   make(chan *batchRequest),
		maxSize:  maxSize,
		maxWait:  maxWait,
		runBatch: runBatch,
	}
}

// run is the dispatcher loop: accumulate submitted requests until either maxSize
// is reached or maxWait elapses since the first queued request, then dispatch.
// Exits when stop is closed (draining any partially-filled batch first).
func (b *batcher) run(stop <-chan struct{}) {
	for {
		var first *batchRequest
		select {
		case first = <-b.submit:
		case <-stop:
			return
		}
		batch := []*batchRequest{first}

		// maxSize==1 disables batching: dispatch immediately (passthrough).
		if b.maxSize == 1 {
			b.runBatch(batch)
			continue
		}

		timer := time.NewTimer(b.maxWait)
	fill:
		for len(batch) < b.maxSize {
			select {
			case r := <-b.submit:
				batch = append(batch, r)
			case <-timer.C:
				break fill
			case <-stop:
				timer.Stop()
				b.runBatch(batch)
				return
			}
		}
		timer.Stop()
		b.runBatch(batch)
	}
}
108
backend/go/parakeet-cpp/batcher_test.go
Normal file
@@ -0,0 +1,108 @@
package main

import (
	"sync"
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

var _ = Describe("batcher", func() {
	echoReply := func(reqs []*batchRequest) {
		for _, r := range reqs {
			r.reply <- batchReply{json: r.tag}
		}
	}

	It("coalesces concurrent submits into batches", func() {
		var mu sync.Mutex
		var sizes []int
		run := func(reqs []*batchRequest) {
			mu.Lock()
			sizes = append(sizes, len(reqs))
			mu.Unlock()
			echoReply(reqs)
		}
		b := newBatcher(4, 50*time.Millisecond, run)
		stop := make(chan struct{})
		go b.run(stop)
		defer close(stop)

		const N = 4
		var wg sync.WaitGroup
		got := make([]string, N)
		for i := 0; i < N; i++ {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				rep := make(chan batchReply, 1)
				b.submit <- &batchRequest{tag: string(rune('a' + i)), reply: rep}
				got[i] = (<-rep).json
			}(i)
		}
		wg.Wait()

		mu.Lock()
		defer mu.Unlock()
		total, maxBatch := 0, 0
		for _, s := range sizes {
			total += s
			if s > maxBatch {
				maxBatch = s
			}
		}
		Expect(total).To(Equal(N))
		Expect(maxBatch).To(BeNumerically(">=", 2), "expected at least one batch to coalesce >1 request")
	})

	It("dispatches when max size is reached", func() {
		dispatched := make(chan int, 8)
		run := func(reqs []*batchRequest) {
			dispatched <- len(reqs)
			echoReply(reqs)
		}
		b := newBatcher(2, time.Hour, run) // huge window: only size can trigger
		stop := make(chan struct{})
		go b.run(stop)
		defer close(stop)
		for i := 0; i < 2; i++ {
			rep := make(chan batchReply, 1)
			b.submit <- &batchRequest{tag: "x", reply: rep}
			go func(rep chan batchReply) { <-rep }(rep)
		}
		Eventually(dispatched, "2s").Should(Receive(Equal(2)))
	})

	It("dispatches when the wait window elapses", func() {
		dispatched := make(chan int, 8)
		run := func(reqs []*batchRequest) {
			dispatched <- len(reqs)
			echoReply(reqs)
		}
		b := newBatcher(8, 20*time.Millisecond, run) // size unreachable; window fires
		stop := make(chan struct{})
		go b.run(stop)
		defer close(stop)
		rep := make(chan batchReply, 1)
		b.submit <- &batchRequest{tag: "x", reply: rep}
		go func() { <-rep }()
		Eventually(dispatched, "2s").Should(Receive(Equal(1)))
	})

	It("bypasses batching when max size is 1", func() {
		dispatched := make(chan int, 8)
		run := func(reqs []*batchRequest) {
			dispatched <- len(reqs)
			echoReply(reqs)
		}
		b := newBatcher(1, time.Hour, run) // size 1 => immediate dispatch
		stop := make(chan struct{})
		go b.run(stop)
		defer close(stop)
		rep := make(chan batchReply, 1)
		b.submit <- &batchRequest{tag: "x", reply: rep}
		go func() { <-rep }()
		Eventually(dispatched, "2s").Should(Receive(Equal(1)))
	})
})
@@ -7,13 +7,17 @@ import (
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
"unsafe"
|
||||
|
||||
"github.com/go-audio/wav"
|
||||
"github.com/mudler/LocalAI/pkg/grpc/base"
|
||||
pb "github.com/mudler/LocalAI/pkg/grpc/proto"
|
||||
"github.com/mudler/LocalAI/pkg/utils"
|
||||
"github.com/mudler/xlog"
|
||||
"google.golang.org/grpc/codes"
|
||||
"google.golang.org/grpc/status"
|
||||
)
|
||||
@@ -34,6 +38,15 @@ var (
|
||||
CppFreeString func(s uintptr)
|
||||
CppLastError func(ctx uintptr) string
|
||||
|
||||
// Batched JSON transcription: takes a concatenated float buffer of clips
|
||||
// plus their per-clip sample counts (sum(nSamples)==len(samplesConcat))
|
||||
// and returns a malloc'd char* JSON ARRAY of per-clip {"text","words",
|
||||
// "tokens"} objects (uintptr, freed via CppFreeString). purego passes the
|
||||
// Go slices as the base pointer of their backing array (kept alive for the
|
||||
// call), matching the CppStreamFeed pcm []float32 binding pattern; the C
|
||||
// side reads them as const float*/const int*.
|
||||
CppTranscribePcmBatchJSON func(ctx uintptr, samplesConcat []float32, nSamples []int32, nClips int32, sampleRate int32, decoder int32) uintptr
|
||||
|
||||
// Cache-aware streaming (RNN-T) entry points. stream_begin returns 0 for
|
||||
// non-streaming models. feed/finalize return a malloc'd char* (uintptr,
|
||||
// freed via CppFreeString); feed writes 1 to *eouOut on an <EOU>/<EOB>.
|
||||
@@ -77,11 +90,18 @@ type transcriptToken struct {
|
||||
}
|
||||
|
||||
// ParakeetCpp owns a single loaded parakeet_ctx. The C engine is a
|
||||
// thread-unsafe singleton (mirrors whisper.cpp / vibevoice.cpp), so we
|
||||
// serialize calls through base.SingleThread.
|
||||
// thread-unsafe singleton (mirrors whisper.cpp / vibevoice.cpp). Rather than
|
||||
// serialize every call through base.SingleThread, we route unary
|
||||
// transcription through an in-process batcher (its sole dispatcher goroutine
|
||||
// is the only caller of the engine on that path) and guard the shared engine
|
||||
// with engineMu so a streaming session and a batched-unary dispatch never
|
||||
// touch it concurrently.
|
||||
type ParakeetCpp struct {
|
||||
base.SingleThread
|
||||
ctxPtr uintptr
|
||||
base.Base
|
||||
ctxPtr uintptr
|
||||
engineMu sync.Mutex // sole guard of the one C engine (dispatcher + streaming)
|
||||
bat *batcher
|
||||
batStop chan struct{}
|
||||
}
|
||||
|
||||
// Load is the LocalAI gRPC entry point for LoadModel: it calls
|
||||
@@ -100,13 +120,103 @@ func (p *ParakeetCpp) Load(opts *pb.ModelOptions) error {
|
||||
return fmt.Errorf("parakeet-cpp: parakeet_capi_load failed for %q", opts.ModelFile)
|
||||
}
|
||||
p.ctxPtr = ctx
|
||||
|
||||
// Dynamic batching knobs (model YAML options:, key:value form). Batching is
|
||||
// OFF by default (batch_max_size:1): each request runs on its own. On GPU,
|
||||
// raising batch_max_size coalesces concurrent requests into one batched
|
||||
// engine call and improves throughput under load; leave it at 1 on CPU and
|
||||
// for low-concurrency setups, where batching only adds latency.
|
||||
maxSize := optInt(opts, "batch_max_size", 1)
|
||||
maxWaitMs := optInt(opts, "batch_max_wait_ms", 15)
|
||||
if maxWaitMs < 0 {
|
||||
maxWaitMs = 0
|
||||
}
|
||||
if CppTranscribePcmBatchJSON != nil {
|
||||
p.batStop = make(chan struct{})
|
||||
p.bat = newBatcher(maxSize, time.Duration(maxWaitMs)*time.Millisecond, p.runBatch)
|
||||
go p.bat.run(p.batStop) // dispatcher runs until Free closes batStop
|
||||
if maxSize > 1 {
|
||||
xlog.Info("parakeet-cpp: dynamic batching enabled",
|
||||
"batch_max_size", maxSize, "batch_max_wait_ms", maxWaitMs)
|
||||
} else {
|
||||
xlog.Info("parakeet-cpp: dynamic batching off (batch_max_size=1); " +
|
||||
"set batch_max_size>1 to coalesce concurrent requests on GPU")
|
||||
}
|
||||
} else {
|
||||
xlog.Info("parakeet-cpp: batched C-API not present in libparakeet.so; " +
|
||||
"batching disabled, using per-request transcription")
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// AudioTranscription runs parakeet_capi_transcribe_path_json on the wav at
|
||||
// opts.Dst with the default decoder (decoder=0, which selects the right head
|
||||
// per architecture: transducer for tdt/rnnt/hybrid, CTC for ctc) and shapes
|
||||
// the per-word timestamps into a LocalAI TranscriptResult.
|
||||
// optInt reads an integer model option (key:value form) from ModelOptions,
|
||||
// returning def when absent or unparseable. The options array carries the
|
||||
// model YAML's options: entries (see core/config; siblings such as
|
||||
// acestep-cpp parse the same key:value form via strings.Cut on ":").
|
||||
func optInt(opts *pb.ModelOptions, key string, def int) int {
|
||||
for _, o := range opts.GetOptions() {
|
||||
k, v, ok := strings.Cut(o, ":")
|
||||
if ok && strings.TrimSpace(k) == key {
|
||||
if n, err := strconv.Atoi(strings.TrimSpace(v)); err == nil {
|
||||
return n
|
||||
}
|
||||
}
|
||||
}
|
||||
return def
|
||||
}
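A standalone sketch of the same key:value parsing, for readers who want to check the edge cases. It assumes the options arrive as a plain `[]string` (the real helper reads them from `*pb.ModelOptions` via `GetOptions()`); the sample option strings are made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// optInt returns the integer value of "key:value" in options, or def when the
// key is absent or its value does not parse as an integer.
func optInt(options []string, key string, def int) int {
	for _, o := range options {
		k, v, ok := strings.Cut(o, ":")
		if ok && strings.TrimSpace(k) == key {
			if n, err := strconv.Atoi(strings.TrimSpace(v)); err == nil {
				return n
			}
		}
	}
	return def
}

func main() {
	// Illustrative option entries; whitespace around key and value is tolerated.
	options := []string{"batch_max_size: 8", "batch_max_wait_ms:abc"}

	fmt.Println(optInt(options, "batch_max_size", 1))     // 8
	fmt.Println(optInt(options, "batch_max_wait_ms", 15)) // 15: unparseable, default used
	fmt.Println(optInt(options, "missing", 3))            // 3: key absent
}
```

An unparseable value silently falls back to the default, which keeps a typo in the model YAML from failing the load but also hides it; logging the fallback would be one way to surface it.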
|
||||
|
||||
// runBatch is the dispatcher's batch handler and the ONLY caller of the C
|
||||
// engine on the unary path. It concatenates the batch PCM, calls the batched
|
||||
// JSON C-API under engineMu, splits the JSON array, and replies to each request.
|
||||
func (p *ParakeetCpp) runBatch(reqs []*batchRequest) {
|
||||
// Observability: the actual coalesced batch size per engine call. Debug-level
|
||||
// so it stays silent in normal operation but lets operators confirm/tune batching.
|
||||
xlog.Debug("parakeet-cpp: dispatching batch", "size", len(reqs))
|
||||
nSamples := make([]int32, len(reqs))
|
||||
total := 0
|
||||
for i, r := range reqs {
|
||||
nSamples[i] = int32(len(r.pcm))
|
||||
total += len(r.pcm)
|
||||
}
|
||||
concat := make([]float32, 0, total)
|
||||
for _, r := range reqs {
|
||||
concat = append(concat, r.pcm...)
|
||||
}
|
||||
var dec int32
|
||||
if len(reqs) > 0 {
|
||||
dec = reqs[0].decoder
|
||||
}
|
||||
p.engineMu.Lock()
|
||||
cstr := CppTranscribePcmBatchJSON(p.ctxPtr, concat, nSamples, int32(len(reqs)), 16000, dec)
|
||||
p.engineMu.Unlock()
|
||||
if cstr == 0 {
|
||||
err := fmt.Errorf("parakeet-cpp: batch transcribe failed: %s", CppLastError(p.ctxPtr))
|
||||
for _, r := range reqs {
|
||||
r.reply <- batchReply{err: err}
|
||||
}
|
||||
return
|
||||
}
|
||||
raw := goStringFromCPtr(cstr)
|
||||
CppFreeString(cstr)
|
||||
var docs []json.RawMessage
|
||||
if err := json.Unmarshal([]byte(raw), &docs); err != nil || len(docs) != len(reqs) {
|
||||
e := fmt.Errorf("parakeet-cpp: batch json: got %d results for %d reqs (%v)", len(docs), len(reqs), err)
|
||||
for _, r := range reqs {
|
||||
r.reply <- batchReply{err: e}
|
||||
}
|
||||
return
|
||||
}
|
||||
for i, r := range reqs {
|
||||
r.reply <- batchReply{json: string(docs[i])}
|
||||
}
|
||||
}
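The two data-shaping steps in runBatch (packing clips into one buffer with per-clip counts, then splitting the returned JSON array) can be sketched without the C engine. The helper names `packClips` and `splitResults` and the sample JSON are assumptions for illustration; the engine call itself is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// packClips flattens clips into one buffer plus per-clip sample counts, so
// that the counts sum to len(concat).
func packClips(clips [][]float32) (concat []float32, nSamples []int32) {
	total := 0
	nSamples = make([]int32, len(clips))
	for i, c := range clips {
		nSamples[i] = int32(len(c))
		total += len(c)
	}
	concat = make([]float32, 0, total)
	for _, c := range clips {
		concat = append(concat, c...)
	}
	return concat, nSamples
}

// splitResults splits a JSON array into one raw element per clip and rejects
// a count mismatch, so results are never attributed to the wrong request.
func splitResults(raw string, want int) ([]json.RawMessage, error) {
	var docs []json.RawMessage
	if err := json.Unmarshal([]byte(raw), &docs); err != nil {
		return nil, err
	}
	if len(docs) != want {
		return nil, fmt.Errorf("got %d results for %d clips", len(docs), want)
	}
	return docs, nil
}

func main() {
	concat, n := packClips([][]float32{{0.1, 0.2}, {0.3}})
	fmt.Println(len(concat), n) // 3 [2 1]

	// Hypothetical engine output for two clips.
	docs, err := splitResults(`[{"text":"hello"},{"text":"world"}]`, 2)
	fmt.Println(len(docs), err) // 2 <nil>
	fmt.Println(string(docs[1]))
}
```

Keeping each element as `json.RawMessage` defers the per-clip decode to the handler goroutine that owns the request, so the dispatcher only does the split.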
|
||||
|
||||
// AudioTranscription decodes the wav at opts.Dst to 16 kHz mono PCM and
|
||||
// submits it to the in-process batcher, which coalesces concurrent requests
|
||||
// into a single batched engine call (parakeet_capi_transcribe_pcm_batch_json)
|
||||
// with the default decoder (decoder=0, which selects the right head per
|
||||
// architecture: transducer for tdt/rnnt/hybrid, CTC for ctc) and shapes the
|
||||
// per-word timestamps into a LocalAI TranscriptResult.
|
||||
//
|
||||
// Parakeet emits word- and token-level timestamps but no native segment
|
||||
// boundaries, so we synthesise a single whole-clip segment spanning the first
|
||||
@@ -118,7 +228,7 @@ func (p *ParakeetCpp) Load(opts *pb.ModelOptions) error {
|
||||
// translate/diarize/prompt/temperature/language/threads are not applicable to
|
||||
// parakeet and are ignored; streaming is handled by AudioTranscriptionStream
|
||||
// (L2).
|
||||
func (p *ParakeetCpp) AudioTranscription(_ context.Context, opts *pb.TranscriptRequest) (pb.TranscriptResult, error) {
|
||||
func (p *ParakeetCpp) AudioTranscription(ctx context.Context, opts *pb.TranscriptRequest) (pb.TranscriptResult, error) {
|
||||
if p.ctxPtr == 0 {
|
||||
return pb.TranscriptResult{}, errors.New("parakeet-cpp: model not loaded")
|
||||
}
|
||||
@@ -126,61 +236,74 @@ func (p *ParakeetCpp) AudioTranscription(_ context.Context, opts *pb.TranscriptR
|
||||
return pb.TranscriptResult{}, errors.New("parakeet-cpp: TranscriptRequest.dst (audio path) is required")
|
||||
}
|
||||
|
||||
cstr := CppTranscribePathJSON(p.ctxPtr, opts.Dst, 0)
|
||||
if cstr == 0 {
|
||||
msg := CppLastError(p.ctxPtr)
|
||||
if msg == "" {
|
||||
msg = "unknown error"
|
||||
// Fallback when the batched C-API is unavailable: transcribe directly from
|
||||
// the file path (original behavior, no batching).
|
||||
if p.bat == nil {
|
||||
cstr := CppTranscribePathJSON(p.ctxPtr, opts.Dst, 0)
|
||||
if cstr == 0 {
|
||||
return pb.TranscriptResult{}, fmt.Errorf("parakeet-cpp: transcribe_path_json failed: %s", CppLastError(p.ctxPtr))
|
||||
}
|
||||
return pb.TranscriptResult{}, fmt.Errorf("parakeet-cpp: transcribe_path_json failed: %s", msg)
|
||||
raw := goStringFromCPtr(cstr)
|
||||
CppFreeString(cstr)
|
||||
var doc transcriptJSON
|
||||
if err := json.Unmarshal([]byte(raw), &doc); err != nil {
|
||||
return pb.TranscriptResult{}, fmt.Errorf("parakeet-cpp: decode transcript json: %w", err)
|
||||
}
|
||||
return transcriptResultFromDoc(doc, opts), nil
|
||||
}
|
||||
|
||||
raw := goStringFromCPtr(cstr)
|
||||
CppFreeString(cstr)
|
||||
|
||||
// Batched path: decode to PCM, submit to the batcher, wait for this request's
|
||||
// JSON element. The dispatcher is the sole engine caller on this path; both
|
||||
// the submit and the wait for the reply honour ctx cancellation.
|
||||
pcm, _, err := decodeWavMono16k(opts.Dst)
|
||||
if err != nil {
|
||||
return pb.TranscriptResult{}, err
|
||||
}
|
||||
rep := make(chan batchReply, 1)
|
||||
select {
|
||||
case p.bat.submit <- &batchRequest{pcm: pcm, decoder: 0, reply: rep}:
|
||||
case <-ctx.Done():
|
||||
return pb.TranscriptResult{}, status.Error(codes.Canceled, "transcription cancelled")
|
||||
}
|
||||
var res batchReply
|
||||
select {
|
||||
case res = <-rep:
|
||||
case <-ctx.Done():
|
||||
return pb.TranscriptResult{}, status.Error(codes.Canceled, "transcription cancelled")
|
||||
}
|
||||
if res.err != nil {
|
||||
return pb.TranscriptResult{}, res.err
|
||||
}
|
||||
var doc transcriptJSON
|
||||
if err := json.Unmarshal([]byte(raw), &doc); err != nil {
|
||||
if err := json.Unmarshal([]byte(res.json), &doc); err != nil {
|
||||
return pb.TranscriptResult{}, fmt.Errorf("parakeet-cpp: decode transcript json: %w", err)
|
||||
}
|
||||
return transcriptResultFromDoc(doc, opts), nil
|
||||
}
|
||||
|
||||
// transcriptResultFromDoc maps a decoded transcriptJSON to a TranscriptResult,
|
||||
// synthesising a single whole-clip segment and attaching word timings only when
|
||||
// the caller requested word granularity. Shared by the batched and direct paths.
|
||||
func transcriptResultFromDoc(doc transcriptJSON, opts *pb.TranscriptRequest) pb.TranscriptResult {
|
||||
text := strings.TrimSpace(doc.Text)
|
||||
|
||||
words := make([]*pb.TranscriptWord, 0, len(doc.Words))
|
||||
for _, w := range doc.Words {
|
||||
words = append(words, &pb.TranscriptWord{
|
||||
Start: secondsToNanos(w.Start),
|
||||
End: secondsToNanos(w.End),
|
||||
Text: w.W,
|
||||
})
|
||||
words = append(words, &pb.TranscriptWord{Start: secondsToNanos(w.Start), End: secondsToNanos(w.End), Text: w.W})
|
||||
}
|
||||
|
||||
tokens := make([]int32, 0, len(doc.Tokens))
|
||||
for _, t := range doc.Tokens {
|
||||
tokens = append(tokens, t.ID)
|
||||
}
|
||||
|
||||
// Single whole-clip segment, spanning the first word start to the last
|
||||
// word end (0/0 when the clip produced no words).
|
||||
var segStart, segEnd int64
|
||||
if len(words) > 0 {
|
||||
segStart = words[0].Start
|
||||
segEnd = words[len(words)-1].End
|
||||
}
|
||||
seg := &pb.TranscriptSegment{
|
||||
Id: 0,
|
||||
Start: segStart,
|
||||
End: segEnd,
|
||||
Text: text,
|
||||
Tokens: tokens,
|
||||
}
|
||||
seg := &pb.TranscriptSegment{Id: 0, Start: segStart, End: segEnd, Text: text, Tokens: tokens}
|
||||
if wordsRequested(opts.TimestampGranularities) {
|
||||
seg.Words = words
|
||||
}
|
||||
|
||||
return pb.TranscriptResult{
|
||||
Text: text,
|
||||
Segments: []*pb.TranscriptSegment{seg},
|
||||
}, nil
|
||||
return pb.TranscriptResult{Text: text, Segments: []*pb.TranscriptSegment{seg}}
|
||||
}
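The whole-clip segment bounds described above reduce to a small rule, sketched here in isolation. The `word` type is a hypothetical stand-in for `pb.TranscriptWord` and the timings are invented for the example.

```go
package main

import "fmt"

// word is a hypothetical stand-in for pb.TranscriptWord (times in nanoseconds).
type word struct {
	Start, End int64
	Text       string
}

// segmentSpan returns the bounds of the single synthesised segment: first word
// start to last word end, or 0/0 when the clip produced no words.
func segmentSpan(words []word) (start, end int64) {
	if len(words) > 0 {
		start = words[0].Start
		end = words[len(words)-1].End
	}
	return start, end
}

func main() {
	words := []word{
		{Start: 120_000_000, End: 480_000_000, Text: "hello"},
		{Start: 520_000_000, End: 900_000_000, Text: "world"},
	}
	fmt.Println(segmentSpan(words)) // 120000000 900000000
	fmt.Println(segmentSpan(nil))   // 0 0
}
```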
|
||||
|
||||
// wordsRequested reports whether the caller asked for word-level timestamps.
|
||||
@@ -243,6 +366,14 @@ func (p *ParakeetCpp) AudioTranscriptionStream(ctx context.Context, opts *pb.Tra
|
||||
return nil
|
||||
}
|
||||
defer CppStreamFree(stream)
|
||||
// The C engine is a single shared context: a streaming session and a batched
|
||||
// unary dispatch must never touch it at once, so hold engineMu for the whole
|
||||
// stream. This lock is intentionally taken AFTER the non-streaming fallback
|
||||
// above returns: that fallback goes through AudioTranscription -> the batcher
|
||||
// -> runBatch, which itself acquires engineMu, so locking here first would
|
||||
// deadlock. Do not hoist this lock above the fallback.
|
||||
p.engineMu.Lock()
|
||||
defer p.engineMu.Unlock()
|
||||
|
||||
data, duration, err := decodeWavMono16k(opts.Dst)
|
||||
if err != nil {
|
||||
@@ -362,6 +493,12 @@ func decodeWavMono16k(path string) ([]float32, float32, error) {
|
||||
// Free releases the underlying parakeet_ctx. Called by LocalAI when the
|
||||
// model is unloaded.
|
||||
func (p *ParakeetCpp) Free() error {
|
||||
// Stop the dispatcher before releasing the engine so no in-flight runBatch
|
||||
// can touch a freed ctx (avoids a goroutine leak and a use-after-free on reload).
|
||||
if p.batStop != nil {
|
||||
close(p.batStop)
|
||||
p.batStop = nil
|
||||
}
|
||||
if p.ctxPtr != 0 {
|
||||
CppFree(p.ctxPtr)
|
||||
p.ctxPtr = 0
|
||||
|
||||
@@ -43,6 +43,9 @@ func ensureLibLoaded() {
|
||||
purego.RegisterLibFunc(&CppFree, lib, "parakeet_capi_free")
|
||||
purego.RegisterLibFunc(&CppTranscribePath, lib, "parakeet_capi_transcribe_path")
|
||||
purego.RegisterLibFunc(&CppTranscribePathJSON, lib, "parakeet_capi_transcribe_path_json")
|
||||
if sym, err := purego.Dlsym(lib, "parakeet_capi_transcribe_pcm_batch_json"); err == nil && sym != 0 {
|
||||
purego.RegisterLibFunc(&CppTranscribePcmBatchJSON, lib, "parakeet_capi_transcribe_pcm_batch_json")
|
||||
}
|
||||
purego.RegisterLibFunc(&CppStreamBegin, lib, "parakeet_capi_stream_begin")
|
||||
purego.RegisterLibFunc(&CppStreamFeed, lib, "parakeet_capi_stream_feed")
|
||||
purego.RegisterLibFunc(&CppStreamFinalize, lib, "parakeet_capi_stream_finalize")
|
||||
|
||||
@@ -58,6 +58,13 @@ func main() {
|
||||
purego.RegisterLibFunc(lf.FuncPtr, lib, lf.Name)
|
||||
}
|
||||
|
||||
// The batched-JSON entry point exists only in newer libparakeet.so (ABI >= 2).
|
||||
// Probe with Dlsym and register only if present, so the backend still loads
|
||||
// against an older library (it falls back to per-request transcription).
|
||||
if sym, err := purego.Dlsym(lib, "parakeet_capi_transcribe_pcm_batch_json"); err == nil && sym != 0 {
|
||||
purego.RegisterLibFunc(&CppTranscribePcmBatchJSON, lib, "parakeet_capi_transcribe_pcm_batch_json")
|
||||
}
|
||||
|
||||
fmt.Fprintf(os.Stderr, "[parakeet-cpp] ABI=%d\n", CppAbiVersion())
|
||||
|
||||
flag.Parse()
|
||||
|
||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
||||
|
||||
# stablediffusion.cpp (ggml)
|
||||
STABLEDIFFUSION_GGML_REPO?=https://github.com/leejet/stable-diffusion.cpp
|
||||
STABLEDIFFUSION_GGML_VERSION?=be65ac7511b30379b003626c15224798929e33d4
|
||||
STABLEDIFFUSION_GGML_VERSION?=7948df8ac1070f5f6881b8d34675821893eb97d6
|
||||
|
||||
CMAKE_ARGS+=-DGGML_MAX_NAME=128
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
||||
|
||||
# whisper.cpp version
|
||||
WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
|
||||
WHISPER_CPP_VERSION?=fe69461618ffc50ba8afa65c25cc6c6e34d4537f
|
||||
WHISPER_CPP_VERSION?=23ee03506a91ac3d3f0071b40e66a430eebdfa1d
|
||||
SO_TARGET?=libgowhisper.so
|
||||
|
||||
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
--extra-index-url https://download.pytorch.org/whl/cu130
|
||||
torch
|
||||
texterrors==1.1.6
|
||||
nemo_toolkit[asr]
|
||||
|
||||
@@ -1 +1 @@
|
||||
39.86
|
||||
40.0
|
||||
40
core/http/react-ui/e2e/page-render-smoke.spec.js
Normal file
@@ -0,0 +1,40 @@
import { test, expect } from './coverage-fixtures.js'

// Render-smoke coverage. Each page is lazy-loaded and runs its full render +
// initial effects on mount, so a bare visit captures the bulk of a page's
// lines — cheap, real coverage for pages that have no dedicated spec yet.
//
// This is the project's preferred way to keep the UI coverage gate green:
// raise the floor by covering more, rather than loosening the gate's
// tolerance (see CONTRIBUTING.md → "React UI coverage"). Auth is disabled in
// the test server, so RequireAdmin/RequireFeature resolve to isAdmin=true and
// every gated route renders without an auth/capability mock.
//
// Asserts the page mounted (its .page-title header is visible) and that it did
// not bounce to a gate redirect (/login or back to /app home).
const PAGES = [
  ['/app/talk', 'Talk'],
  ['/app/usage', 'Usage'],
  ['/app/account', 'Account'],
  ['/app/studio', 'Studio'],
  ['/app/manage', 'Manage'],
  ['/app/backends', 'Backends'],
  ['/app/settings', 'Settings'],
  ['/app/nodes', 'Nodes'],
  ['/app/face', 'Face recognition'],
  ['/app/voice', 'Voice recognition'],
  ['/app/fine-tune', 'Fine-tuning'],
  ['/app/quantize', 'Quantize'],
]

test.describe('Page render smoke', () => {
  for (const [path, label] of PAGES) {
    test(`renders ${label} (${path})`, async ({ page }) => {
      await page.goto(path)
      // .page-title for the normal header; .empty-state-title for pages that
      // render a gated/empty state (e.g. Account when auth is disabled).
      await expect(page.locator('.page-title, .empty-state-title').first()).toBeVisible({ timeout: 15_000 })
      await expect(page).toHaveURL(new RegExp(path.replace(/\//g, '\\/') + '$'))
    })
  }
})
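
The URL assertion in the spec escapes only the forward slashes in `path` and then anchors the pattern with `$`. As a rough sketch of what that pattern accepts and rejects, the snippet below rebuilds it in plain JavaScript. The helper name `urlPatternFor` and the sample URLs (including the `localhost:8080` host) are hypothetical and chosen only for illustration; they are not part of the spec.

```javascript
// Hypothetical helper mirroring the pattern built inline in the spec:
// escape every "/" in the route, then anchor the match at the end of the URL.
function urlPatternFor(path) {
  return new RegExp(path.replace(/\//g, '\\/') + '$')
}

const talk = urlPatternFor('/app/talk')

// The page stayed on the requested route.
console.log(talk.test('http://localhost:8080/app/talk'))        // true
// A bounce to the login gate or back to the app home fails the check.
console.log(talk.test('http://localhost:8080/login'))           // false
console.log(talk.test('http://localhost:8080/app'))             // false
// The "$" anchor also rejects deeper routes and query strings.
console.log(talk.test('http://localhost:8080/app/talk/extra'))  // false
console.log(talk.test('http://localhost:8080/app/talk?x=1'))    // false
```

Because only `/` is escaped, a route containing other regular-expression metacharacters (for example `.` or `?`) would need fuller escaping before being used this way; none of the routes in `PAGES` do.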
@@ -1,10 +1,10 @@
+++
title = "LocalAI"
description = "The free, OpenAI, Anthropic alternative. Your All-in-One Complete AI Stack"
description = "The free, OpenAI and Anthropic alternative. A small, composable AI stack: run any model locally and install only what you use."
type = "home"
+++

**The free, OpenAI, Anthropic alternative. Your All-in-One Complete AI Stack** - Run powerful language models, autonomous agents, and document intelligence **locally** on your hardware.
**The free, OpenAI and Anthropic alternative. A small, composable AI stack.** - Run powerful language models, autonomous agents, and document intelligence **locally** on your hardware. A lean core that pulls model backends on demand, so you install only what you use.

**No cloud, no limits, no compromise.**


@@ -273,7 +273,7 @@ A list of the environment variable that tweaks parallelism is the following:
```
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This actually controls wether a backend can process multiple requests or not.
### This actually controls whether a backend can process multiple requests or not.

### Define the number of parallel LLAMA.cpp workers (Defaults to 1)


@@ -5,6 +5,8 @@ title = "Fine-tuning LLMs for text generation"
weight = 22
+++



{{% notice note %}}
Section under construction
{{% /notice %}}

@@ -4,6 +4,8 @@ description: Configure LocalAI behind a TLS termination reverse proxy (HAProxy,
weight: 100
---



# TLS Reverse Proxy Configuration

When running LocalAI behind a TLS termination reverse proxy, the Web UI may fail to load static assets (CSS, JS) correctly because the application doesn't automatically detect that it's being served over HTTPS. This guide explains how to properly configure your reverse proxy to work with LocalAI.

@@ -5,6 +5,8 @@ weight = 22
url = '/advanced/vram-management'
+++



When running multiple models in LocalAI, especially on systems with limited GPU memory (VRAM), you may encounter situations where loading a new model fails because there isn't enough available VRAM. LocalAI provides several mechanisms to automatically manage model memory allocation and prevent VRAM exhaustion:

1. **Max Active Backends (LRU Eviction)**: Limit the number of loaded models, evicting the least recently used when the limit is reached
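
A minimal sketch of how a max-active-backends limit with LRU eviction can behave, assuming a simple in-memory recency list. The class and method names (`ActiveBackends`, `use`) are hypothetical and this is not LocalAI's implementation; a JavaScript `Map` is used only because it iterates in insertion order, which makes the oldest entry easy to find.

```javascript
// Hypothetical in-memory model of "max active backends" with LRU eviction.
class ActiveBackends {
  constructor(maxActive) {
    this.maxActive = maxActive
    this.loaded = new Map() // insertion order doubles as recency order
  }

  // Mark `model` as used, loading it if needed. Returns the names of any
  // models evicted to make room.
  use(model) {
    const evicted = []
    if (this.loaded.has(model)) {
      this.loaded.delete(model) // re-inserted below to mark it most recent
    } else {
      while (this.loaded.size > 0 && this.loaded.size >= this.maxActive) {
        const oldest = this.loaded.keys().next().value
        this.loaded.delete(oldest)
        evicted.push(oldest)
      }
    }
    this.loaded.set(model, true)
    return evicted
  }
}

const backends = new ActiveBackends(2)
console.log(backends.use('llama'))        // []
console.log(backends.use('whisper'))      // []
console.log(backends.use('llama'))        // [] (llama is now the most recent)
console.log(backends.use('sd'))           // [ 'whisper' ]
console.log([...backends.loaded.keys()])  // [ 'llama', 'sd' ]
```

With a limit of 2, loading a third model evicts whichever loaded model was used longest ago, which is `whisper` here because `llama` was touched again in between.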

@@ -12,6 +12,22 @@ url = "/faq/"
Here are answers to some of the most common questions.


### Do I need to install all the backends?

No. You install only the backends your models use. LocalAI's core is a single binary (or container) that provides the OpenAI-compatible API, request routing, the web UI, and agents. Each inference backend (llama.cpp, vLLM, whisper.cpp, stable-diffusion, MLX, and others) is a separate artifact, installed only when a model needs it.

In practice:

- **You install one backend, not all of them.** Run a model with `local-ai run <model>` and the matching backend is pulled automatically; nothing else is downloaded.
- **Each backend is purpose-built for its engine.** LocalAI builds a dedicated gRPC backend around each engine, so every one stays independently optimized without a single binary trying to support every model architecture at once.
- **You manage backends individually** with `local-ai backends list/install/uninstall` or from the web UI.

The catalog's breadth is optionality: you only ever run what your models use.

### Can I bring my own model or backend?

Yes. You can load any compatible model, not just the ones in the gallery. And because every backend talks to the core over a simple gRPC interface, you can write your own backend in any language and plug it in, exactly how the built-in backends work. Nothing about the core is closed off, which gives you the flexibility to run precisely the stack you want.

### How do I get models?

Most gguf-based models should work, but newer models may require additions to the API. If a model doesn't work, please feel free to open up issues. However, be cautious about downloading models from the internet and directly onto your machine, as there may be security vulnerabilities in lama.cpp or ggml that could be maliciously exploited. Some models can be found on Hugging Face: https://huggingface.co/models?search=gguf, or models from gpt4all are compatible too: https://github.com/nomic-ai/gpt4all.

@@ -5,6 +5,8 @@ weight = 21
url = '/features/agents'
+++



LocalAI includes a built-in agent platform powered by [LocalAGI](https://github.com/mudler/LocalAGI). Agents are autonomous AI entities that can reason, use tools, maintain memory, and interact with external services — all running locally as part of the LocalAI process.

## Overview

@@ -5,6 +5,8 @@ weight = 17
url = "/features/audio-diarization/"
+++



Speaker diarization answers the question **"who spoke when?"** — given an audio clip with multiple speakers, it returns time-stamped segments labelled with a stable speaker ID (`SPEAKER_00`, `SPEAKER_01`, …).

LocalAI exposes this through the `/v1/audio/diarization` endpoint, modelled after `/v1/audio/transcriptions`. Two backends are supported today:

@@ -187,6 +187,22 @@ curl http://localhost:8080/v1/audio/transcriptions \

For real-time use, load a cache-aware streaming model (e.g. `realtime_eou_120m-v1-*.gguf`) and pass `-F stream=true`. Deltas are emitted as the audio is decoded, with end-of-utterance events closing each segment.

### Dynamic batching

The backend can coalesce concurrent transcription requests into a single batched engine call, which improves throughput on GPU when many requests arrive at once. Batching is **off by default** (`batch_max_size:1`, one request at a time); raise it to opt in. Two `options:` knobs control it:

```yaml
name: parakeet-110m
backend: parakeet-cpp
parameters:
  model: tdt_ctc-110m-f16.gguf
options:
- batch_max_size:8      # max requests coalesced into one batch (default 1 = off)
- batch_max_wait_ms:15  # how long to wait to fill a batch, in ms (default 15)
```

By default each request runs on its own. Raise `batch_max_size` (for example 4 to 16) to enable batching; it pays off on GPU under concurrent load, where coalescing the per-step decode GEMMs across requests is a large throughput win. Leave it at 1 on CPU and for low-concurrency setups, where batching only adds latency. Batching only affects concurrent unary requests; streaming sessions always run on their own.
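
The text above does not say how the backend coalesces requests internally. As a sketch of the general size-or-timeout policy that the two options suggest, the snippet below models a tiny coalescing queue driven by an explicit clock. It assumes the wait budget starts when the first request of a batch arrives; that detail, and every name here (`Batcher`, `add`, `tick`, `flush`), is hypothetical rather than taken from the backend.

```javascript
// Illustrative only: a coalescing queue driven by an explicit clock.
// `maxSize` and `maxWaitMs` play the roles of batch_max_size and
// batch_max_wait_ms; every other name here is hypothetical.
class Batcher {
  constructor(maxSize, maxWaitMs) {
    this.maxSize = maxSize
    this.maxWaitMs = maxWaitMs
    this.pending = []
    this.firstArrivalMs = null
  }

  // Queue a request that arrived at `nowMs`. Returns a full batch as soon as
  // `maxSize` requests are pending, otherwise null.
  add(request, nowMs) {
    if (this.pending.length === 0) this.firstArrivalMs = nowMs
    this.pending.push(request)
    return this.pending.length >= this.maxSize ? this.flush() : null
  }

  // Called as time passes. Returns a partial batch once the oldest pending
  // request has waited `maxWaitMs`, otherwise null.
  tick(nowMs) {
    const waited = this.pending.length > 0 && nowMs - this.firstArrivalMs >= this.maxWaitMs
    return waited ? this.flush() : null
  }

  // Hand back everything pending and reset the queue.
  flush() {
    const batch = this.pending
    this.pending = []
    this.firstArrivalMs = null
    return batch
  }
}

// batch_max_size:1 (the default): every request is its own batch.
const off = new Batcher(1, 15)
console.log(off.add('a', 0))  // [ 'a' ]

// batch_max_size:3, batch_max_wait_ms:15
const on = new Batcher(3, 15)
console.log(on.add('a', 0))   // null (waiting for more requests)
console.log(on.add('b', 5))   // null
console.log(on.add('c', 9))   // [ 'a', 'b', 'c' ] (batch is full)
console.log(on.add('d', 20))  // null
console.log(on.tick(30))      // null (only 10 ms waited)
console.log(on.tick(35))      // [ 'd' ] (wait budget spent)
```

The last two calls show where the added latency at low concurrency comes from: a lone request is held until the wait budget runs out before it is dispatched.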

## See also

- [Audio Transform]({{< relref "audio-transform.md" >}}) — clean up the audio (echo cancellation, noise suppression, dereverberation) before passing it to a transcription model.

@@ -5,6 +5,8 @@ weight = 17
url = "/features/audio-transform/"
+++



The audio-transform endpoints take **audio in** and emit **audio out**, optionally
conditioned on a second reference audio signal. The category is generic by
design — concrete operations include joint **acoustic echo cancellation +

@@ -7,6 +7,8 @@ tags = ["Proxy", "Cloud", "Routing", "Advanced"]
categories = ["Features"]
+++



LocalAI can forward chat-completion and Anthropic Messages requests to an
external provider instead of running them through the local gRPC backend
pipeline. Configure a model with `backend: cloud-proxy` and a `proxy.upstream_url`,

@@ -13,28 +13,7 @@ Distributed mode requires authentication enabled with a **PostgreSQL** database

## Architecture Overview

```
              ┌─────────────────┐
              │  Load Balancer  │
              └────────┬────────┘
                       │
        ┌──────────────┼──────────────┐
        │              │              │
┌───────▼──────┐  ┌────▼─────┐  ┌─────▼──────┐
│ Frontend #1  │  │ Frontend │  │ Frontend #N│
│  (LocalAI)   │  │    #2    │  │  (LocalAI) │
└──────┬───────┘  └────┬─────┘  └─────┬──────┘
       │               │              │
┌───────▼──────────────▼──────────────▼───────┐
│              PostgreSQL + NATS              │
│     (node registry, jobs, coordination)     │
└───────┬──────────────┬──────────────┬───────┘
        │              │              │
 ┌──────▼──────┐  ┌────▼─────┐  ┌─────▼──────┐
 │  Worker #1  │  │  Worker  │  │ Worker #N  │
 │  (generic)  │  │    #2    │  │ (generic)  │
 └─────────────┘  └──────────┘  └────────────┘
```


**Frontends** are stateless LocalAI instances that receive API requests and route them to worker nodes via the **SmartRouter**. All frontends share state through PostgreSQL and coordinate via NATS.

@@ -42,6 +21,8 @@ Distributed mode requires authentication enabled with a **PostgreSQL** database

### Scheduling Algorithm



The SmartRouter uses **idle-first** scheduling with **preemptive eviction**:
1. If the model is already loaded on a node → use it (per-model gRPC address)
2. If no node has the model → prefer nodes with enough free VRAM
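
Only the first two steps of the scheduling list are visible in this hunk, so the sketch below covers just those. The node shape (`name`, `loadedModels`, `freeVram`), the function name `pickNode`, and the tie-break on most free VRAM are all assumptions made for illustration; this is not the SmartRouter's code.

```javascript
// Sketch of the first two scheduling steps only. The node shape
// ({ name, loadedModels, freeVram }) and pickNode are hypothetical.
function pickNode(nodes, model, requiredVram) {
  // Step 1: a node that already has the model loaded wins outright.
  const loaded = nodes.find((n) => n.loadedModels.includes(model))
  if (loaded) return loaded

  // Step 2: otherwise prefer a node with enough free VRAM for the model.
  const roomy = nodes.filter((n) => n.freeVram >= requiredVram)
  if (roomy.length > 0) {
    // Tie-break on most free VRAM; this ordering is an assumption.
    return roomy.reduce((best, n) => (n.freeVram > best.freeVram ? n : best))
  }

  // Any later steps are outside this sketch.
  return null
}

const nodes = [
  { name: 'gpu-a', loadedModels: ['whisper'], freeVram: 4 },
  { name: 'gpu-b', loadedModels: ['llama'], freeVram: 2 },
  { name: 'gpu-c', loadedModels: [], freeVram: 16 },
]

console.log(pickNode(nodes, 'llama', 8).name)  // gpu-b (already loaded there)
console.log(pickNode(nodes, 'phi', 8).name)    // gpu-c (enough free VRAM)
console.log(pickNode(nodes, 'phi', 32))        // null (left to later steps)
```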
@@ -432,6 +413,8 @@ This is **not** routed through the SmartRouter: it is a model-internal split, co

### Topology



ds4 uses a **coordinator/worker** split:

- The **coordinator** owns tokenization, sampling, the prompt, and a low layer range (e.g. `0:19`). It is LocalAI's ds4 backend and **listens** on a host/port. Workers dial into it.

@@ -5,6 +5,8 @@ weight = 15
url = "/features/distribute/"
+++



{{% notice tip %}}
Looking for production-grade horizontal scaling with PostgreSQL and NATS? See [Distributed Mode]({{% relref "features/distributed-mode" %}}).
{{% /notice %}}

@@ -5,6 +5,8 @@ weight = 14
url = "/features/face-recognition/"
+++



LocalAI supports face recognition through the `insightface` backend:
face verification (1:1), face identification (1:N) against a built-in
vector store, face embedding, face detection, demographic analysis

@@ -5,6 +5,8 @@ weight = 18
url = '/features/fine-tuning/'
+++



LocalAI supports fine-tuning LLMs directly through the API and Web UI. Fine-tuning is powered by pluggable backends that implement a generic gRPC interface, allowing support for different training frameworks and model types.

## Supported Backends

@@ -199,7 +199,7 @@ Pipelines types available:

##### Advanced: Additional parameters

Additional arbitrarly parameters can be specified in the option field in key/value separated by `:`:
Additional arbitrary parameters can be specified in the option field in key/value separated by `:`:

```yaml
name: animagine-xl
@@ -207,7 +207,7 @@ options:
- "cfg_scale:6"
```

**Note**: There is no complete parameter list. Any parameter can be passed arbitrarly and is passed to the model directly as argument to the pipeline. Different pipelines/implementations support different parameters.
**Note**: There is no complete parameter list. Any parameter can be passed arbitrarily and is passed to the model directly as argument to the pipeline. Different pipelines/implementations support different parameters.

The example above, will result in the following python code when generating images:

@@ -342,4 +342,4 @@ diffusers:
```bash
(echo -n '{"prompt": "spiderman surfing","size": "512x512","model":"txt2vid"}') |
curl -H "Content-Type: application/json" -X POST -d @- http://localhost:8080/v1/images/generations
```
```

@@ -7,6 +7,8 @@ tags = ["MCP", "Agents", "Tools", "Advanced"]
categories = ["Features"]
+++




LocalAI now supports the **Model Context Protocol (MCP)**, enabling powerful agentic capabilities by connecting AI models to external tools and services. This feature allows your LocalAI models to interact with various MCP servers, providing access to real-time data, APIs, and specialized tools.


@@ -7,6 +7,8 @@ tags = ["Routing", "Privacy", "PII", "Middleware", "Advanced"]
categories = ["Features"]
+++



LocalAI ships a request-middleware layer that sits between the HTTP API and
the backend dispatcher. Two subsystems share that layer because they share
the same lifecycle hook: **PII filtering** scans the request body before it

@@ -7,6 +7,8 @@ tags = ["Proxy", "MITM", "Privacy", "Routing", "Advanced"]
categories = ["Features"]
+++



LocalAI can act as a local HTTPS proxy that **redacts PII from your Claude
Code, OpenAI Codex CLI, or any HTTPS client** without holding their API keys.
The proxy intercepts only the LLM API endpoints you allowlist (default:

@@ -5,6 +5,8 @@ weight = 18
url = '/features/mlx-distributed/'
+++



MLX distributed inference allows you to split large language models across multiple Apple Silicon Macs (or other devices) for joint inference. Unlike federation (which distributes whole requests), MLX distributed splits a single model's layers across machines so they all participate in every forward pass.

## How It Works

@@ -6,6 +6,8 @@ weight = 17
url = "/features/openai-functions/"
+++



LocalAI supports running the OpenAI [functions and tools API](https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools) across multiple backends. The OpenAI request shape is the same regardless of which backend runs your model — LocalAI is responsible for extracting structured tool calls from the model's output before returning the response.



@@ -4,6 +4,8 @@ title: "Realtime API"
weight: 60
---



LocalAI supports the [OpenAI Realtime API](https://platform.openai.com/docs/guides/realtime) which enables low-latency, multi-modal conversations (voice and text) over WebSocket.

To use the Realtime API, you need to configure a pipeline model that defines the components for Voice Activity Detection (VAD), Transcription (STT), Language Model (LLM), and Text-to-Speech (TTS).

@@ -5,6 +5,8 @@ weight = 19
url = '/features/quantization/'
+++



LocalAI supports model quantization directly through the API and Web UI. Quantization converts HuggingFace models to GGUF format and compresses them to smaller sizes for efficient inference with llama.cpp.

{{% notice note %}}

@@ -6,6 +6,8 @@ weight = 11
url = "/features/reranker/"
+++



A **reranking** model, often referred to as a cross-encoder, is a core component in the two-stage retrieval systems used in information retrieval and natural language processing tasks.
Given a query and a set of documents, it will output similarity scores.


@@ -516,7 +516,7 @@ The `llama.cpp` backend supports additional configuration options that can be sp
| `cache_idle_slots` or `idle_slots_cache` | boolean | On a new task, save the previous slot's KV state into the prompt cache (and clear the slot) so a later request with the same prefix can warm-load it. Default: `true`. Auto-disabled by the server if `kv_unified=false` or `cache_ram=0`. | `cache_idle_slots:false` |
| `n_ctx_checkpoints` or `ctx_checkpoints` | integer | Maximum number of context checkpoints per slot (used for partial-prefix recovery, e.g. SWA). Default: `32`. | `ctx_checkpoints:16` |
| `checkpoint_min_step` or `checkpoint_min_spacing` (aliases: `checkpoint_every_nt`, `checkpoint_every_n_tokens`) | integer | Minimum spacing in tokens between context checkpoints. `0` disables the minimum-spacing gate. Default: `256`. (Renamed upstream from `checkpoint_every_nt`; semantics shifted from a fixed cadence to a minimum spacing.) | `checkpoint_min_step:1024` |
| `split_mode` or `sm` | string | How to split the model across multiple GPUs: `none` (single GPU only), `layer` (default — split layers and KV across GPUs), `row` (split rows across GPUs), `tensor` (experimental tensor parallelism — requires `flash_attention: true`, no KV-cache quantization, manually set `context_size`, and a llama.cpp build that includes [#19378](https://github.com/ggml-org/llama.cpp/pull/19378)). | `split_mode:tensor` |
| `split_mode` or `sm` | string | How to split the model across multiple GPUs: `none` (single GPU only), `layer` (default — split layers and KV across GPUs), `row` (split rows across GPUs), `tensor` (experimental tensor parallelism, requires `flash_attention: true`, manually set `context_size`, and a llama.cpp build that includes [#19378](https://github.com/ggml-org/llama.cpp/pull/19378); it historically also required KV-cache quantization to be disabled, but [#23792](https://github.com/ggml-org/llama.cpp/pull/23792) lifts that restriction so `cache_type_k`/`cache_type_v` quantization can be combined with tensor parallelism on builds that include it). | `split_mode:tensor` |

**Example configuration with options:**

@@ -897,7 +897,7 @@ The backend will automatically download the required files in order to run the m
- `OVModelForCausalLM` requires OpenVINO IR [Text Generation](https://huggingface.co/models?library=openvino&pipeline_tag=text-generation) models from Hugging face
- `OVModelForFeatureExtraction` works with any Safetensors Transformer [Feature Extraction](https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers,safetensors) model from Huggingface (Embedding Model)

Please note that streaming is currently not implemente in `AutoModelForCausalLM` for Intel GPU.
Please note that streaming is currently not implemented in `AutoModelForCausalLM` for Intel GPU.
AMD GPU support is not implemented.
Although AMD CPU is not officially supported by OpenVINO there are reports that it works: YMMV.

@@ -1008,4 +1008,4 @@ template:

  completion: |
    {{.Input}}
```
```

@@ -5,6 +5,8 @@ weight = 15
url = "/features/voice-recognition/"
+++



LocalAI supports voice (speaker) recognition through the
`speaker-recognition` backend: speaker verification (1:1), speaker
identification (1:N) against a built-in vector store, speaker

@@ -6,6 +6,8 @@ icon = "hub"
description = "Learn how to install, configure, and manage models in LocalAI"
+++



This section covers everything you need to know about installing and configuring models in LocalAI. You'll learn multiple methods to get models running.

## Prerequisites

@@ -6,6 +6,8 @@ url = '/basics/getting_started/'
icon = "rocket_launch"
+++



**LocalAI** is a free, open-source alternative to OpenAI (Anthropic, etc.), functioning as a drop-in replacement REST API for local inferencing. It allows you to run [LLMs]({{% relref "features/text-generation" %}}), generate images, and produce audio, all locally or on-premises with consumer-grade hardware, supporting multiple model families and architectures.

LocalAI comes with a **built-in web interface** for chatting with models, managing installations, configuring AI agents, and more — no extra tools needed.

@@ -11,7 +11,9 @@ icon = "info"
+++


LocalAI is your complete AI stack for running AI models locally. It's designed to be simple, efficient, and accessible, providing a drop-in replacement for OpenAI's API while keeping your data private and secure.
LocalAI is a composable AI stack for running models locally: a small core that speaks the OpenAI and Anthropic APIs, with each model backend added only when you need it. It's simple, efficient, and private by default, and a drop-in replacement that keeps your data on your own hardware.



## Why LocalAI?

@@ -21,22 +23,26 @@ In today's AI landscape, privacy, control, and flexibility are paramount. LocalA
- **Complete Control**: Run models on your terms, with your hardware
- **Open Source**: MIT licensed and community-driven
- **Flexible Deployment**: From laptops to servers, with or without GPUs
- **Extensible**: Add new models and features as needed
- **Composable by design**: A small core, not a bundle. Backends are separate and installed on demand, so you only run what you use

## What's Included

LocalAI is a single binary (or container) that gives you everything you need:
The LocalAI core is a single small binary (or container). It gives you everything you need to serve models, and pulls each model backend on demand, so you install only what you use:

- **OpenAI-compatible API** — Drop-in replacement for OpenAI, Anthropic, and Open Responses APIs
- **Built-in Web Interface** — Chat, model management, agent creation, image generation, and system monitoring
- **AI Agents** — Create autonomous agents with MCP (Model Context Protocol) tool support, directly from the UI
- **Multiple Model Support** — LLMs, image generation, text-to-speech, speech-to-text, vision, embeddings, and more
- **Any Model, Any Modality**: LLMs, image and video, text-to-speech, speech-to-text, vision, and embeddings, each on its own backend, pulled automatically when you load a model
- **GPU Acceleration** — Automatic detection and support for NVIDIA, AMD, Intel, and Vulkan GPUs
- **Distributed Mode** — Scale horizontally with worker nodes, P2P federation, and model sharding
- **No GPU Required** — Runs on CPU with consumer-grade hardware

LocalAI integrates [LocalAGI](https://github.com/mudler/LocalAGI) (agent platform) and [LocalRecall](https://github.com/mudler/LocalRecall) (semantic memory) as built-in libraries — no separate installation needed.

Each backend is a dedicated gRPC service that LocalAI builds around a best-in-class engine (llama.cpp, vLLM, whisper.cpp, stable-diffusion, MLX, and more), exposing it through the unified API. Backends ship as standard OCI images and run as isolated processes, so each one can be installed, upgraded, or removed without touching the core, can even run on a separate machine, and a fault in one never brings down the rest.

Because the backend contract is a simple gRPC interface, the system is open: bring your own model, or write a custom backend in any language and plug it in, exactly how the built-in backends work. This is what keeps the core small and gives you the flexibility to run precisely the stack you want, instead of compiling every engine into one binary.

## Getting Started

LocalAI can be installed in several ways. **Docker is the recommended installation method** for most users as it provides the easiest setup and works across all platforms.

@@ -9,7 +9,7 @@ LocalAI is an API written in Go that serves as an OpenAI shim, enabling software

LocalAI uses a mixture of backends written in various languages (C++, Golang, Python, ...). You can check [the model compatibility table]({{%relref "reference/compatibility-table" %}}) to learn about all the components of LocalAI.





## Backstory

@@ -105,7 +105,7 @@ It is now possible for single-devices with one GPU to specify `--single-active-b

#### Resources management

Thanks to the continous community efforts (another cool contribution from {{< github "dave-gray101" >}} ) now it's possible to shutdown a backend programmatically via the API.
Thanks to the continuous community efforts (another cool contribution from {{< github "dave-gray101" >}} ) now it's possible to shutdown a backend programmatically via the API.
There is an ongoing effort in the community to better handling of resources. See also the [🔥Roadmap](https://localai.io/#-hot-topics--roadmap).

#### New how-to section

166
docs/static/images/diagrams/agents-loop.html
vendored
Normal file
@@ -0,0 +1,166 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
<style>
:root{
  --paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
  --rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
}
*{box-sizing:border-box;margin:0;padding:0}
html,body{width:1600px;height:900px}
body{
  background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
  position:relative;overflow:hidden;
  background-image:
    linear-gradient(var(--paper2) 1px,transparent 1px),
    linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
  background-size:40px 40px;
}
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
.eyebrow b{color:var(--ink)}
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
h1 em{font-style:normal;color:var(--rust)}
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
.stage{flex:1;margin-top:8px}
svg{width:100%;height:100%;overflow:visible}
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
.note b{color:var(--ink)}
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
.url span{color:var(--ink)}
</style>
</head>
<body>
<div class="frame"></div>
<div class="wrap">
  <header>
    <div>
      <div class="eyebrow">LocalAI <b>·</b> Agents</div>
      <h1>The in-process <em>agent loop</em></h1>
    </div>
    <div class="stamp">
      <div class="k">SELF</div>
      <div class="s">hosted</div>
    </div>
  </header>
  <div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
  <footer>
    <div class="note">Agents call LocalAI's own chat API in a loop; progress streams back over SSE.</div>
    <div class="url">localai.io<span>/features/agents</span></div>
  </footer>
</div>
<script>
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
const svg=document.getElementById("svg");
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
  svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
  svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
}
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
function arrow(x1,y1,x2,y2,color,dash){
  const mx=(x1+x2)/2;
  svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
  const a=7;
  svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
}
function varrow(x1,y1,x2,y2,color,dash){
  const my=(y1+y2)/2;
  svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${x1} ${my}, ${x2} ${my}, ${x2} ${y2-11}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
  const a=7;
  svg.appendChild(el("path",{d:`M ${x2} ${y2-11} l -${a} -${a+4} M ${x2} ${y2-11} l ${a} -${a+4}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
}

// ---------- USER (left) ----------
const UX=20, UW=160, UH=72, UY=66;
shadowRect(UX,UY,UW,UH,PAPER2);
txt(UX+UW/2,UY+45,"User",{f:"Bricolage Grotesque",w:800,sz:26,a:"middle"});

// ---------- AGENT POOL ----------
const APX=20, APW=200, APH=80, APY=240;
shadowRect(APX,APY,APW,APH,HI);
txt(APX+APW/2,APY+38,"AgentPool",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
txt(APX+APW/2,APY+62,"core/services",{w:700,sz:13,a:"middle",fill:SOFT});

// ---------- REASONING LOOP (center) ----------
const LX=300, LW=320, LY=180, LH=230;
shadowRect(LX,LY,LW,LH,PAPER,INK,4);
svg.appendChild(el("rect",{x:LX,y:LY,width:LW,height:54,fill:COLD}));
svg.appendChild(el("line",{x1:LX,y1:LY+54,x2:LX+LW,y2:LY+54,stroke:INK,"stroke-width":4}));
txt(LX+22,LY+36,"Agent reasoning loop",{f:"Bricolage Grotesque",w:800,sz:23,fill:PAPER});
txt(LX+LW/2,LY+92,"think → act → observe",{f:"Bricolage Grotesque",w:700,sz:19,a:"middle",fill:INK});
txt(LX+LW/2,LY+120,"iterate until done",{w:700,sz:15,a:"middle",fill:SOFT});

// loop side-boxes inside the loop card
const sbW=88, sbH=42, sbY=LY+LH-66;
const sbItems=[{n:"Actions"},{n:"RAG"},{n:"MCP tools"}];
const sbGap=(LW-44 - sbW*3)/2; let sbx=LX+22;
sbItems.forEach((it)=>{
  svg.appendChild(el("rect",{x:sbx,y:sbY,width:sbW,height:sbH,fill:HI,stroke:INK,"stroke-width":2.5}));
  txt(sbx+sbW/2,sbY+27,it.n,{f:"Bricolage Grotesque",w:700,sz:it.n.length>6?15:17,a:"middle"});
||||
sbx+=sbW+sbGap;
|
||||
});
|
||||
|
||||
// ---------- CHAT COMPLETIONS (right, rust) ----------
|
||||
const CCX=900, CCW=300, CCY=120, CCH=96;
|
||||
shadowRect(CCX,CCY,CCW,CCH,PAPER,RUST,4);
|
||||
svg.appendChild(el("rect",{x:CCX,y:CCY,width:CCW,height:40,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:CCX,y1:CCY+40,x2:CCX+CCW,y2:CCY+40,stroke:INK,"stroke-width":3.5}));
|
||||
txt(CCX+CCW/2,CCY+27,"LocalAI's own endpoint",{w:700,sz:14,a:"middle",fill:PAPER});
|
||||
txt(CCX+CCW/2,CCY+74,"POST /v1/chat/completions",{f:"Bricolage Grotesque",w:800,sz:19,a:"middle",fill:RUSTD});
|
||||
|
||||
// ---------- MODEL INFERENCE ----------
|
||||
const MIX=900, MIW=300, MIH=88, MIY=320;
|
||||
shadowRect(MIX,MIY,MIW,MIH,"#EFE0BF");
|
||||
txt(MIX+MIW/2,MIY+40,"Model inference",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(MIX+MIW/2,MIY+68,"backend · gRPC",{w:700,sz:15,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- WEB UI (bottom right) ----------
|
||||
const WX=1230, WW=180, WH=80, WY=440;
|
||||
shadowRect(WX,WY,WW,WH,PAPER2);
|
||||
txt(WX+WW/2,WY+38,"Web UI",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(WX+WW/2,WY+62,"live progress",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- SSE box ----------
|
||||
const SSX=900, SSW=170, SSH=58, SSY=451;
|
||||
svg.appendChild(el("rect",{x:SSX,y:SSY,width:SSW,height:SSH,fill:PAPER,stroke:COLD,"stroke-width":3,"stroke-dasharray":"4 7"}));
|
||||
txt(SSX+SSW/2,SSY+27,"GET /sse",{f:"Bricolage Grotesque",w:800,sz:19,a:"middle",fill:COLD});
|
||||
txt(SSX+SSW/2,SSY+47,"event stream",{w:700,sz:12,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- ARROWS ----------
|
||||
// User -> AgentPool (POST chat)
|
||||
arrow(UX+UW, UY+UH/2, APX+APW, APY+APH/2, INK);
|
||||
txt(UX+UW/2+10, 175, "POST /api/agents/:name/chat", {w:700,sz:12.5,a:"middle",fill:RUSTD});
|
||||
|
||||
// AgentPool -> reasoning loop
|
||||
arrow(APX+APW, APY+APH/2, LX, LY+LH/2, INK);
|
||||
|
||||
// reasoning loop -> chat completions (prominent rust, self-call)
|
||||
arrow(LX+LW, LY+44, CCX, CCY+CCH/2, RUST, "2 8");
|
||||
txt((LX+LW+CCX)/2+4, LY+8, "calls back into LocalAI", {w:700,sz:13,a:"middle",fill:RUSTD});
|
||||
|
||||
// chat completions -> model inference
|
||||
varrow(CCX+CCW/2, CCY+CCH, MIX+MIW/2, MIY, RUSTD);
|
||||
|
||||
// model inference -> back to loop (result returns)
|
||||
arrow(MIX, MIY+MIH/2, LX+LW, LY+LH-92, RUST, "2 8");
|
||||
txt((LX+LW+MIX)/2, MIY+MIH/2-12, "result returns", {w:700,sz:13,a:"middle",fill:RUSTD});
|
||||
|
||||
// reasoning loop -> SSE
|
||||
arrow(LX+LW, LY+LH-22, SSX, SSY+SSH/2, COLD, "3 7");
|
||||
txt((LX+LW+SSX)/2, SSY+SSH+22, "emits events", {w:700,sz:13,a:"middle",fill:COLD});
|
||||
|
||||
// SSE -> Web UI
|
||||
arrow(SSX+SSW, SSY+SSH/2, WX, WY+WH/2, COLD, "3 7");
|
||||
</script>
|
||||
</body>
|
||||
</html>
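The diagram above describes a think → act → observe loop whose progress is emitted as SSE events. A minimal sketch of that shape follows. It is synchronous for brevity, and every name in it (`formatSseEvent`, `runLoop`, the `thought`/`observation` event names, the `decision` fields) is a hypothetical illustration rather than LocalAI's actual API or event schema. In the real flow the `think` step would be a `POST /v1/chat/completions` call.

```javascript
// Hypothetical helper: serialises one Server-Sent Events frame.
// Event names and payload fields are assumptions, not LocalAI's schema.
function formatSseEvent(event, payload) {
  // JSON.stringify without indentation yields one line, but split anyway:
  // the SSE format requires one "data:" field per payload line.
  const dataLines = JSON.stringify(payload)
    .split("\n")
    .map((line) => `data: ${line}`);
  // A blank line terminates the frame.
  return [`event: ${event}`, ...dataLines, "", ""].join("\n");
}

// Hypothetical loop: `think` stands in for the chat-completions call,
// `tools` for actions / RAG / MCP tools, `emit` for the SSE writer.
function runLoop({ think, tools, emit, maxSteps = 8 }) {
  const observations = [];
  for (let step = 1; step <= maxSteps; step++) {
    const decision = think(observations); // think
    emit(formatSseEvent("thought", { step, decision }));
    if (decision.done) return decision.answer; // iterate until done

    const tool = tools[decision.tool];
    const result = tool
      ? tool(decision.input) // act
      : `unknown tool: ${decision.tool}`;
    observations.push(result); // observe
    emit(formatSseEvent("observation", { step, result }));
  }
  return null; // step budget exhausted without a final answer
}
```

The `maxSteps` cap is a design choice of this sketch: it keeps a loop that never reports `done` from running forever.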
|
||||
BIN
docs/static/images/diagrams/agents-loop.png
vendored
Normal file
|
After Width: | Height: | Size: 240 KiB |
146
docs/static/images/diagrams/architecture-overview.html
vendored
Normal file
@@ -0,0 +1,146 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Architecture</div>
|
||||
<h1>How LocalAI <em>works</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">ONE API</div>
|
||||
<div class="s">many engines</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Clients speak one API. The core routes each request. <b>Every backend is a separate process, pulled only when a model needs it.</b></div>
|
||||
<div class="url">localai.io<span>/docs/overview</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- CLIENTS (left) ----------
|
||||
txt(24,46,"API CLIENTS",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const clients=[{n:"OpenAI · Anthropic"},{n:"ElevenLabs · Ollama"},{n:"LocalAI API · curl"}];
|
||||
const CLX=24, CLW=210, CLH=66, clY=[80,210,340];
|
||||
clients.forEach((c,i)=>{
|
||||
shadowRect(CLX,clY[i],CLW,CLH,PAPER2);
|
||||
txt(CLX+CLW/2,clY[i]+41,c.n,{f:"Bricolage Grotesque",w:700,sz:18,a:"middle"});
|
||||
});
|
||||
|
||||
// ---------- CORE (center) ----------
|
||||
const COX=300, COW=470, COY=40, COH=490;
|
||||
shadowRect(COX,COY,COW,COH,PAPER,INK,4);
|
||||
// rust title bar
|
||||
svg.appendChild(el("rect",{x:COX,y:COY,width:COW,height:64,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:COX,y1:COY+64,x2:COX+COW,y2:COY+64,stroke:INK,"stroke-width":4}));
|
||||
txt(COX+26,COY+34,"LocalAI core",{f:"Bricolage Grotesque",w:800,sz:30,fill:PAPER});
|
||||
txt(COX+COW-26,COY+40,"one small binary",{w:700,sz:14,ls:".06em",a:"end",fill:"#F1D9C8"});
|
||||
// internal modules
|
||||
const mods=["Drop-in API server","Smart router","Web UI","Agents · LocalAGI","Memory · LocalRecall"];
|
||||
const MX=COX+26, MW=COW-52, MH=58; let my=COY+88;
|
||||
mods.forEach(m=>{
|
||||
svg.appendChild(el("rect",{x:MX,y:my,width:MW,height:MH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(MX+18,my+37,m,{f:"Bricolage Grotesque",w:700,sz:22});
|
||||
my+=MH+14;
|
||||
});
|
||||
|
||||
// ---------- gRPC boundary ----------
|
||||
const GX=836;
|
||||
const gbW=98,gbH=32,gbx=GX-gbW/2,gby=38;
|
||||
svg.appendChild(el("line",{x1:GX,y1:gby+gbH+6,x2:GX,y2:520,stroke:RUSTD,"stroke-width":3,"stroke-dasharray":"3 8"}));
|
||||
svg.appendChild(el("rect",{x:gbx,y:gby,width:gbW,height:gbH,fill:PAPER,stroke:RUSTD,"stroke-width":2.5}));
|
||||
txt(GX,gby+22,"gRPC",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:RUSTD});
|
||||
|
||||
// ---------- BACKENDS (right, 2 x 3) ----------
|
||||
txt(1460,46,"BACKENDS",{w:700,sz:14,ls:".2em",a:"end",fill:SOFT});
|
||||
const beW=270, beH=120, beRows=[70,210,350];
|
||||
const beCols=[895,1180];
|
||||
const backs=[
|
||||
{n:"llama.cpp", s:"LLMs · GGUF"},
|
||||
{n:"vLLM", s:"high-throughput"},
|
||||
{n:"whisper.cpp", s:"speech to text"},
|
||||
{n:"stable-diffusion",s:"image & video"},
|
||||
{n:"MLX", s:"Apple Silicon"},
|
||||
{n:"+ gallery", s:"pulled on demand", more:true},
|
||||
];
|
||||
backs.forEach((b,i)=>{
|
||||
const col=beCols[i%2], row=beRows[Math.floor(i/2)];
|
||||
if(!b.more) shadowRect(col,row,beW,beH,"#EFE0BF");
|
||||
else { svg.appendChild(el("rect",{x:col,y:row,width:beW,height:beH,fill:PAPER,stroke:DIM,"stroke-width":3.5,"stroke-dasharray":"4 7"})); }
|
||||
txt(col+20,row+50,b.n,{f:"Bricolage Grotesque",w:800,sz:25,fill:b.more?SOFT:INK});
|
||||
txt(col+20,row+80,b.s,{w:700,sz:14,fill:b.more?DIM:SOFT});
|
||||
if(!b.more){
|
||||
const tw=132,th=24,tx=col+beW-tw-14,ty=row+beH-th-12;
|
||||
svg.appendChild(el("rect",{x:tx,y:ty,width:tw,height:th,fill:PAPER,stroke:INK,"stroke-width":2}));
|
||||
txt(tx+tw/2,ty+17,"OCI · ON DEMAND",{w:700,sz:11,ls:".08em",a:"middle",fill:RUSTD});
|
||||
}
|
||||
});
|
||||
|
||||
// ---------- ARROWS ----------
|
||||
// clients -> core
|
||||
clients.forEach((c,i)=> arrow(CLX+CLW, clY[i]+CLH/2, COX, clY[i]+CLH/2, INK));
|
||||
// core -> backends (dashed, gRPC), fan from core right-mid
|
||||
const srcY=COY+COH/2;
|
||||
backs.forEach((b,i)=>{
|
||||
const col=beCols[i%2], row=beRows[Math.floor(i/2)];
|
||||
arrow(COX+COW, srcY+(Math.floor(i/2)-1)*40, col, row+beH/2, b.more?DIM:RUSTD, "2 8");
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
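The footer note says the core routes each request and that a backend is pulled only when a model needs it. The sketch below illustrates that lazy-start idea only. `createLazyRouter`, `startBackend`, and the model names are hypothetical, and nothing here reflects how LocalAI actually resolves models or launches gRPC backends.

```javascript
// Hypothetical sketch of "pulled on demand" routing.
// modelToBackend: plain object mapping a model name to a backend name.
// startBackend:   callback that starts (or pulls) a backend and returns a handle.
function createLazyRouter(modelToBackend, startBackend) {
  const running = new Map(); // backend name -> handle, filled on first use

  return function route(model) {
    const backend = modelToBackend[model];
    if (backend === undefined) {
      throw new Error(`no backend configured for model: ${model}`);
    }
    // Start the backend only the first time a model needs it.
    if (!running.has(backend)) {
      running.set(backend, startBackend(backend));
    }
    return running.get(backend);
  };
}
```

Two models that map to the same backend share one handle, so `startBackend` runs at most once per backend name.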
|
||||
BIN
docs/static/images/diagrams/architecture-overview.png
vendored
Normal file
|
After Width: | Height: | Size: 285 KiB |
172
docs/static/images/diagrams/audio-transform-io.html
vendored
Normal file
@@ -0,0 +1,172 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Audio Transform</div>
|
||||
<h1>Near + far in, <em>clean out</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">STEREO</div>
|
||||
<div class="s">wire</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Two inputs (mic + reference) transform to one cleaned output; <b>interleaved-stereo on the wire.</b></div>
|
||||
<div class="url">localai.io<span>/features/audio-transform</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ============================================================
|
||||
// TOP: block flow [audio] + [reference] -> [backend] -> [out]
|
||||
// ============================================================
|
||||
txt(20,30,"BLOCK FLOW",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
|
||||
// input boxes (left)
|
||||
const INW=250, INH=72;
|
||||
const INX=24;
|
||||
const audioY=58, refY=158;
|
||||
// primary audio (rust emphasis)
|
||||
shadowRect(INX,audioY,INW,INH,PAPER,RUST,4);
|
||||
txt(INX+22,audioY+34,"audio",{f:"Bricolage Grotesque",w:800,sz:26,fill:RUST});
|
||||
txt(INX+22,audioY+58,"primary · mic",{w:700,sz:14,fill:SOFT});
|
||||
// reference (cold, optional)
|
||||
svg.appendChild(el("rect",{x:INX+7,y:refY+7,width:INW,height:INH,fill:INK}));
|
||||
svg.appendChild(el("rect",{x:INX,y:refY,width:INW,height:INH,fill:PAPER,stroke:COLD,"stroke-width":4,"stroke-dasharray":"4 7"}));
|
||||
txt(INX+22,refY+34,"reference",{f:"Bricolage Grotesque",w:800,sz:26,fill:COLD});
|
||||
txt(INX+22,refY+58,"optional · far",{w:700,sz:14,fill:SOFT});
|
||||
|
||||
// backend (center)
|
||||
const BEW=270, BEH=140, BEX=560, BEY=66;
|
||||
shadowRect(BEX,BEY,BEW,BEH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:BEX,y:BEY,width:BEW,height:50,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:BEX,y1:BEY+50,x2:BEX+BEW,y2:BEY+50,stroke:INK,"stroke-width":3.5}));
|
||||
txt(BEX+22,BEY+34,"backend",{f:"Bricolage Grotesque",w:800,sz:26,fill:PAPER});
|
||||
txt(BEX+22,BEY+86,"transform",{f:"Bricolage Grotesque",w:700,sz:21});
|
||||
txt(BEX+22,BEY+114,"denoise · enhance",{w:700,sz:14,fill:SOFT});
|
||||
|
||||
// output (right)
|
||||
const OUTX=1010, OUTY=82, OUTW=250, OUTH=108;
|
||||
shadowRect(OUTX,OUTY,OUTW,OUTH,HI,INK,4);
|
||||
txt(OUTX+22,OUTY+42,"audio out",{f:"Bricolage Grotesque",w:800,sz:26});
|
||||
txt(OUTX+22,OUTY+72,"one cleaned",{w:700,sz:15,fill:SOFT});
|
||||
txt(OUTX+22,OUTY+92,"mono signal",{w:700,sz:15,fill:SOFT});
|
||||
|
||||
// arrows TOP
|
||||
arrow(INX+INW, audioY+INH/2, BEX, BEY+58, RUST);
|
||||
arrow(INX+INW, refY+INH/2, BEX, BEY+92, COLD, "2 8");
|
||||
arrow(BEX+BEW, BEY+BEH/2, OUTX, OUTY+OUTH/2, RUST);
|
||||
|
||||
// ============================================================
|
||||
// BOTTOM: streaming inset - the wire format
|
||||
// ============================================================
|
||||
const insY=290, insH=250;
|
||||
svg.appendChild(el("rect",{x:7,y:insY+7,width:1480,height:insH,fill:INK}));
|
||||
svg.appendChild(el("rect",{x:0,y:insY,width:1480,height:insH,fill:PAPER2,stroke:INK,"stroke-width":3.5}));
|
||||
txt(24,insY+34,"ON THE WIRE",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
txt(1456,insY+34,"interleaved stereo in → mono PCM out",{w:700,sz:15,a:"end",fill:RUSTD});
|
||||
|
||||
// --- INPUT: interleaved stereo frames ---
|
||||
const frameY=insY+70, frameH=46, sampW=58, gap=4;
|
||||
let fx=40;
|
||||
// channel legend (left)
|
||||
txt(fx,frameY-14,"INPUT · stereo PCM",{f:"Bricolage Grotesque",w:800,sz:18});
|
||||
// draw 8 interleaved samples: ch0 mic, ch1 ref, ...
|
||||
const seq=[0,1,0,1,0,1,0,1];
|
||||
seq.forEach((ch,i)=>{
|
||||
const x=fx+i*(sampW+gap);
|
||||
const fill = ch===0 ? "#E7C8C0" : "#C7D9DB";
|
||||
const stroke = ch===0 ? RUST : COLD;
|
||||
svg.appendChild(el("rect",{x:x+5,y:frameY+5,width:sampW,height:frameH,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y:frameY,width:sampW,height:frameH,fill,stroke,"stroke-width":3}));
|
||||
txt(x+sampW/2,frameY+30,ch===0?"L":"R",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:ch===0?RUST:COLD});
|
||||
});
|
||||
const seqW=seq.length*(sampW+gap)-gap;
|
||||
// channel mapping labels under the frame strip
|
||||
txt(fx, frameY+frameH+38, "L = channel 0",{f:"Bricolage Grotesque",w:800,sz:18,fill:RUST});
|
||||
txt(fx, frameY+frameH+62, "the mic / near signal",{w:700,sz:14,fill:SOFT});
|
||||
txt(fx+260, frameY+frameH+38, "R = channel 1",{f:"Bricolage Grotesque",w:800,sz:18,fill:COLD});
|
||||
txt(fx+260, frameY+frameH+62, "the reference / far signal",{w:700,sz:14,fill:SOFT});
|
||||
|
||||
// --- backend pill in the middle ---
|
||||
const pillX=fx+seqW+70, pillY=frameY-4, pillW=200, pillH=frameH+8;
|
||||
shadowRect(pillX,pillY,pillW,pillH,RUST,INK,3.5);
|
||||
txt(pillX+pillW/2,pillY+34,"backend",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:PAPER});
|
||||
|
||||
// arrow into pill (the arrow out of the pill is drawn with the output strip below)
|
||||
arrow(fx+seqW+8, frameY+frameH/2, pillX, frameY+frameH/2, INK);
|
||||
|
||||
// --- OUTPUT: mono PCM strip ---
|
||||
const outFx=pillX+pillW+50;
|
||||
txt(outFx,frameY-12,"OUTPUT · mono PCM",{f:"Bricolage Grotesque",w:800,sz:18});
|
||||
const outN=8;
|
||||
for(let i=0;i<outN;i++){
|
||||
const x=outFx+i*(sampW+gap);
|
||||
svg.appendChild(el("rect",{x:x+5,y:frameY+5,width:sampW,height:frameH,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y:frameY,width:sampW,height:frameH,fill:HI,stroke:INK,"stroke-width":3}));
|
||||
txt(x+sampW/2,frameY+30,"M",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:INK});
|
||||
}
|
||||
const outW=outN*(sampW+gap)-gap;
|
||||
arrow(pillX+pillW, frameY+frameH/2, outFx, frameY+frameH/2, RUST);
|
||||
txt(outFx, frameY+frameH+38, "single channel",{f:"Bricolage Grotesque",w:800,sz:18});
|
||||
txt(outFx, frameY+frameH+62, "cleaned · enhanced result",{w:700,sz:14,fill:SOFT});
|
||||
</script>
|
||||
</body>
|
||||
</html>
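The wire format drawn above puts the mic signal on channel 0 (L) and the reference on channel 1 (R), interleaved sample by sample. A minimal sketch of building such a buffer follows. The 16-bit sample type, the zero-fill for a missing or shorter reference, and the function names are assumptions made for illustration, not details confirmed by the diagram.

```javascript
// Hypothetical helper: packs mic (channel 0) and reference (channel 1)
// into one interleaved stereo buffer: L R L R ...
// Assumes 16-bit PCM; a missing or shorter reference is padded with silence.
function interleaveStereo(mic, ref = new Int16Array(0)) {
  const out = new Int16Array(mic.length * 2);
  for (let i = 0; i < mic.length; i++) {
    out[2 * i] = mic[i];                           // L = channel 0, near signal
    out[2 * i + 1] = i < ref.length ? ref[i] : 0;  // R = channel 1, far signal
  }
  return out;
}

// Inverse view: extracts one channel (0 or 1) from an interleaved buffer.
function extractChannel(stereo, ch) {
  const out = new Int16Array(stereo.length / 2);
  for (let i = 0; i < out.length; i++) out[i] = stereo[2 * i + ch];
  return out;
}
```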
|
||||
BIN
docs/static/images/diagrams/audio-transform-io.png
vendored
Normal file
|
After Width: | Height: | Size: 226 KiB |
157
docs/static/images/diagrams/cloud-proxy-sequence.html
vendored
Normal file
@@ -0,0 +1,157 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Cloud Proxy</div>
|
||||
<h1>Local API, <em>cloud model</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">PII</div>
|
||||
<div class="s">filtered</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Proxy to a hosted model while <b>PII is redacted on the way out and on the way back.</b></div>
|
||||
<div class="url">localai.io<span>/features/cloud-proxy</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// left-pointing arrowhead variant for return path
|
||||
function arrowL(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2+11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2+11} ${y2} l ${a+4} -${a} M ${x2+11} ${y2} l ${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===== node geometry =====
|
||||
const BW=232, BH=104;
|
||||
// row Y positions (top edge of each row)
|
||||
const rowReq=110; // request row top
|
||||
const rowRet=350; // return row top
|
||||
|
||||
// helper: titled box
|
||||
function nodeBox(x,y,w,h,fill,title,sub,opt){
|
||||
opt=opt||{};
|
||||
if(opt.dash) svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:opt.stroke||INK,"stroke-width":3.5,"stroke-dasharray":opt.dash}));
|
||||
else shadowRect(x,y,w,h,fill,opt.stroke,opt.sw);
|
||||
txt(x+w/2,y+ (sub?44:h/2+9) ,title,{f:"Bricolage Grotesque",w:800,sz:opt.tsz||23,a:"middle",fill:opt.tfill||INK});
|
||||
if(sub) txt(x+w/2,y+72,sub,{w:700,sz:14,a:"middle",fill:opt.sfill||SOFT});
|
||||
}
|
||||
|
||||
// ===== REQUEST ROW (left -> right) =====
|
||||
// 4 boxes: client, auth/routing, PII redact (rust), cloud-proxy backend
|
||||
const reqXs=[24, 296, 590, 884];
|
||||
nodeBox(reqXs[0],rowReq,BW,BH,PAPER2,"Client","app · curl · SDK");
|
||||
nodeBox(reqXs[1],rowReq,BW,BH,HI,"Auth / routing","middleware");
|
||||
// PII redaction - emphasis (rust)
|
||||
shadowRect(reqXs[2],rowReq,BW,BH,RUST,INK,3.5);
|
||||
txt(reqXs[2]+BW/2,rowReq+44,"PII redaction",{f:"Bricolage Grotesque",w:800,sz:23,a:"middle",fill:PAPER});
|
||||
txt(reqXs[2]+BW/2,rowReq+72,"request-side",{w:700,sz:14,a:"middle",fill:"#F1D9C8"});
|
||||
// cloud-proxy gRPC backend
|
||||
nodeBox(reqXs[3],rowReq,BW,BH,"#EFE0BF","cloud-proxy","gRPC backend");
|
||||
|
||||
// request row label
|
||||
txt(24,rowReq-22,"REQUEST",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
|
||||
// request arrows (ink up to the redaction step, deep rust after it)
|
||||
arrow(reqXs[0]+BW,rowReq+BH/2,reqXs[1],rowReq+BH/2,INK);
|
||||
arrow(reqXs[1]+BW,rowReq+BH/2,reqXs[2],rowReq+BH/2,INK);
|
||||
arrow(reqXs[2]+BW,rowReq+BH/2,reqXs[3],rowReq+BH/2,RUSTD);
|
||||
|
||||
// ===== UPSTREAM (top right) =====
|
||||
const upX=1224, upY=rowReq, upW=232, upH=104;
|
||||
nodeBox(upX,upY,upW,upH,PAPER,"Upstream","OpenAI · Anthropic",{stroke:COLD,tfill:COLD,sfill:COLD});
|
||||
// HTTPS network link (dashed) from backend to upstream
|
||||
arrow(reqXs[3]+BW,rowReq+BH/2,upX,upY+upH/2,RUSTD,"2 8");
|
||||
txt((reqXs[3]+BW+upX)/2, rowReq+BH/2-14,"HTTPS",{w:700,sz:12,ls:".08em",a:"middle",fill:RUSTD});
|
||||
|
||||
// ===== RETURN ROW (right -> left) =====
|
||||
// upstream -> SSE stream -> streaming PII filter -> client
|
||||
const retXs=[1224, 884, 590]; // SSE, filter, (third entry unused); the client end reuses reqXs[0]
|
||||
// SSE stream node (under upstream)
|
||||
nodeBox(retXs[0],rowRet,upW,BH,PAPER,"SSE stream","tokens",{stroke:COLD,tfill:COLD,sfill:COLD});
|
||||
// streaming PII filter (rust emphasis)
|
||||
shadowRect(retXs[1],rowRet,BW,BH,RUST,INK,3.5);
|
||||
txt(retXs[1]+BW/2,rowRet+44,"PII filter",{f:"Bricolage Grotesque",w:800,sz:23,a:"middle",fill:PAPER});
|
||||
txt(retXs[1]+BW/2,rowRet+72,"streaming",{w:700,sz:14,a:"middle",fill:"#F1D9C8"});
|
||||
|
||||
// return row label
|
||||
txt(1456,rowRet-22,"RETURN",{w:700,sz:14,ls:".2em",a:"end",fill:SOFT});
|
||||
|
||||
// vertical link upstream(req) -> SSE(return)
|
||||
arrow(upX+upW/2,upY+upH,upX+upW/2,rowRet,COLD,"2 8");
|
||||
txt(upX+upW/2+14,(upY+upH+rowRet)/2+5,"stream",{w:700,sz:12,a:"start",fill:COLD});
|
||||
|
||||
// return arrows (right -> left, cold teal secondary direction, rust at filter)
|
||||
arrowL(retXs[0],rowRet+BH/2,retXs[1]+BW,rowRet+BH/2,COLD);
|
||||
arrowL(retXs[1],rowRet+BH/2,reqXs[0]+BW,rowRet+BH/2,RUSTD);
|
||||
|
||||
// client gets the cleaned response (re-label client zone on return)
|
||||
txt(reqXs[0]+BW/2,rowRet+BH/2+5,"to Client",{f:"Bricolage Grotesque",w:800,sz:20,a:"middle",fill:SOFT});
|
||||
|
||||
// ===== dimmed bypass note =====
|
||||
const nY=502;
|
||||
svg.appendChild(el("line",{x1:24,y1:nY-26,x2:1456,y2:nY-26,stroke:DIM,"stroke-width":2,"stroke-dasharray":"4 8"}));
|
||||
txt(24,nY+4,"BYPASSED",{w:700,sz:13,ls:".18em",fill:DIM});
|
||||
txt(180,nY+4,"templating · MCP · model-loader are bypassed in proxy mode",{w:600,sz:18,fill:DIM});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/cloud-proxy-sequence.png
vendored
Normal file
|
Size: 202 KiB
149
docs/static/images/diagrams/composable-core.html
vendored
Normal file
@@ -0,0 +1,149 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2;
|
||||
--paper2:#ECDFC2;
|
||||
--ink:#211C14;
|
||||
--ink-soft:#5A5142;
|
||||
--rust:#B43A2C;
|
||||
--rust-deep:#8F2C20;
|
||||
--cold:#3F6E73;
|
||||
--hi:#E7D6AE;
|
||||
--dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);
|
||||
color:var(--ink);
|
||||
font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:34px 60px 30px;display:flex;flex-direction:column}
|
||||
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:18px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:54px;line-height:.98;letter-spacing:-.015em;margin-top:8px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:11px 17px 9px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:12px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
|
||||
.stage{flex:1;margin-top:6px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:19px;color:var(--ink-soft);line-height:1.3;max-width:1050px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:23px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Architecture</div>
|
||||
<h1>One small core.<br>Backends you <em>plug in</em>.</h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">ONLY WHAT</div>
|
||||
<div class="s">you actually run</div>
|
||||
</div>
|
||||
</header>
|
||||
|
||||
<div class="stage"><svg viewBox="0 0 1480 540" id="svg"></svg></div>
|
||||
|
||||
<footer>
|
||||
<div class="note">Run a model and the right engine is <b>pulled automatically</b>.<br>Each backend is its own image, optimized for one job. <b>Install nothing you don't use.</b></div>
|
||||
<div class="url">localai.io</div>
|
||||
</footer>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(tag, attrs, txt){
|
||||
const e = document.createElementNS("http://www.w3.org/2000/svg", tag);
|
||||
for(const k in attrs) e.setAttribute(k, attrs[k]);
|
||||
if(txt!=null) e.textContent = txt;
|
||||
return e;
|
||||
}
|
||||
const svg = document.getElementById("svg");
|
||||
|
||||
// ---- geometry ----
|
||||
const CORE = {x:560, y:150, w:360, h:200};
|
||||
const coreCx = CORE.x + CORE.w/2, coreCy = CORE.y + CORE.h/2;
|
||||
const TW=320, TH=92;
|
||||
const LX=40, RX=1120;
|
||||
const rows=[4, 136, 268, 400];
|
||||
|
||||
const left = [
|
||||
{n:"llama.cpp", s:"LLMs · GGUF"},
|
||||
{n:"vLLM", s:"high-throughput"},
|
||||
{n:"MLX", s:"Apple Silicon"},
|
||||
{n:"whisper.cpp", s:"speech to text"},
|
||||
];
|
||||
const right = [
|
||||
{n:"stable-diffusion", s:"image & video"},
|
||||
{n:"kokoro", s:"text to speech"},
|
||||
{n:"parakeet.cpp", s:"fast ASR"},
|
||||
{n:"+ 30 more", s:"in the gallery", more:true},
|
||||
];
|
||||
|
||||
// ---- connectors (drawn first, under cards) ----
|
||||
function socket(x,y){ svg.appendChild(el("rect",{x:x-6,y:y-6,width:12,height:12,fill:INK})); }
|
||||
function connector(x1,y1,x2,y2,more){
|
||||
svg.appendChild(el("line",{x1,y1,x2,y2,stroke:more?DIM:INK,"stroke-width":4,"stroke-dasharray":"6 5","stroke-linecap":"round"}));
|
||||
socket(x1,y1); socket(x2,y2);
|
||||
}
|
||||
left.forEach((t,i)=>{
|
||||
const ty=rows[i]+TH/2;
|
||||
connector(CORE.x, coreCy + (i-1.5)*42, LX+TW, ty, t.more);
|
||||
});
|
||||
right.forEach((t,i)=>{
|
||||
const ty=rows[i]+TH/2;
|
||||
connector(CORE.x+CORE.w, coreCy + (i-1.5)*42, RX, ty, t.more);
|
||||
});
|
||||
|
||||
// ---- backend tiles ----
|
||||
function tile(x,row,t){
|
||||
const y=row;
|
||||
if(!t.more) svg.appendChild(el("rect",{x:x+6,y:y+6,width:TW,height:TH,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:TW,height:TH,fill:t.more?PAPER:"#EFE0BF",
|
||||
stroke:t.more?DIM:INK,"stroke-width":3.5,"stroke-dasharray":t.more?"4 7":"none"}));
|
||||
svg.appendChild(el("text",{x:x+22,y:y+40,"font-family":"Bricolage Grotesque","font-weight":800,"font-size":26,fill:t.more?SOFT:INK},t.n));
|
||||
svg.appendChild(el("text",{x:x+22,y:y+66,"font-family":"Archivo","font-weight":700,"font-size":15,"letter-spacing":".02em",fill:t.more?DIM:SOFT},t.s));
|
||||
if(!t.more){
|
||||
const bw=134,bh=24,bx=x+TW-bw-16,by=y+TH-bh-14;
|
||||
svg.appendChild(el("rect",{x:bx,y:by,width:bw,height:bh,fill:PAPER,stroke:INK,"stroke-width":2}));
|
||||
svg.appendChild(el("text",{x:bx+bw/2,y:by+17,"text-anchor":"middle","font-family":"Archivo","font-weight":700,"font-size":11,"letter-spacing":".05em",fill:RUSTD},"SEPARATE IMAGE"));
|
||||
}
|
||||
}
|
||||
left.forEach((t,i)=> tile(LX, rows[i], t));
|
||||
right.forEach((t,i)=> tile(RX, rows[i], t));
|
||||
|
||||
// ---- core (drawn last, on top) ----
|
||||
svg.appendChild(el("rect",{x:CORE.x+9,y:CORE.y+9,width:CORE.w,height:CORE.h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x:CORE.x,y:CORE.y,width:CORE.w,height:CORE.h,fill:RUST,stroke:INK,"stroke-width":4}));
|
||||
svg.appendChild(el("text",{x:coreCx,y:CORE.y+52,"text-anchor":"middle","font-family":"Archivo","font-weight":700,"font-size":15,"letter-spacing":".22em",fill:HI},"THE CORE"));
|
||||
svg.appendChild(el("text",{x:coreCx,y:CORE.y+104,"text-anchor":"middle","font-family":"Bricolage Grotesque","font-weight":800,"font-size":48,fill:PAPER},"LocalAI"));
|
||||
svg.appendChild(el("text",{x:coreCx,y:CORE.y+140,"text-anchor":"middle","font-family":"Archivo","font-weight":700,"font-size":17,fill:"#F1D9C8"},"one API · routing · agents · gallery · WebUI"));
|
||||
const tagW=190,tagH=30,tagX=coreCx-tagW/2,tagY=CORE.y+CORE.h-40;
|
||||
svg.appendChild(el("rect",{x:tagX,y:tagY,width:tagW,height:tagH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
svg.appendChild(el("text",{x:coreCx,y:tagY+21,"text-anchor":"middle","font-family":"Bricolage Grotesque","font-weight":800,"font-size":16,"letter-spacing":".04em",fill:INK},"ONE SMALL BINARY"));
|
||||
</script>
|
||||
</body>
|
||||
</html>
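The connector geometry in `composable-core.html` fans four sockets out from each side of the core: socket `i` sits at `coreCy + (i-1.5)*42`, and each connector meets its tile at the tile's vertical center, `rows[i]+TH/2`. A minimal sketch of that arithmetic, using the constants from the script; the helper name `connectorEndpoints` and its `spacing` parameter are hypothetical and not part of the file:

```javascript
// Constants copied from the composable-core diagram script.
const CORE = { x: 560, y: 150, w: 360, h: 200 };
const TH = 92;
const rows = [4, 136, 268, 400];

// Hypothetical helper (not in the original file): for each row, returns the
// y of the socket on the core edge and the y of the tile's vertical center.
function connectorEndpoints(core, rowYs, tileHeight, spacing = 42) {
  const coreCy = core.y + core.h / 2;
  // Center the sockets around coreCy: for 4 rows the offsets run -1.5..+1.5.
  const mid = (rowYs.length - 1) / 2;
  return rowYs.map((rowY, i) => ({
    coreY: coreCy + (i - mid) * spacing,
    tileY: rowY + tileHeight / 2,
  }));
}

const endpoints = connectorEndpoints(CORE, rows, TH);
console.log(endpoints.map((e) => e.coreY)); // sockets at y = 187, 229, 271, 313
console.log(endpoints.map((e) => e.tileY)); // tile centers at y = 50, 182, 314, 446
```

The sockets stay inside the core's 200-unit height (150 to 350) while the tiles span almost the full canvas, which is what produces the fan shape.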
|
||||
BIN
docs/static/images/diagrams/composable-core.png
vendored
Normal file
|
Size: 304 KiB
161
docs/static/images/diagrams/diarization-pipeline.html
vendored
Normal file
@@ -0,0 +1,161 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Diarization</div>
|
||||
<h1>Who spoke <em>when</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">RTTM</div>
|
||||
<div class="s">out</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Segment, embed, and cluster <b>-</b> or a single ASR pass <b>-</b> into speaker-labelled segments.</div>
|
||||
<div class="url">localai.io<span>/features/audio-diarization</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ============ INPUT: audio -> ffmpeg ============
|
||||
// audio
|
||||
shadowRect(20,238,150,84,PAPER2);
|
||||
txt(95,278,"audio",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(95,302,"wav · mp3 · m4a",{w:700,sz:12,a:"middle",fill:SOFT});
|
||||
|
||||
// ffmpeg
|
||||
shadowRect(212,238,178,84,HI);
|
||||
txt(301,275,"ffmpeg",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(301,300,"16 kHz · mono",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
arrow(170,280,212,280,INK);
|
||||
|
||||
// ============ PATH LABELS ============
|
||||
const PAX=470, PAW=560; // path A column band: x origin and width (reference values, not used below)
|
||||
// Path A header band
|
||||
svg.appendChild(el("line",{x1:455,y1:96,x2:1100,y2:96,stroke:RUSTD,"stroke-width":2.5,"stroke-dasharray":"3 7"}));
|
||||
txt(462,86,"PATH A",{w:700,sz:13,ls:".2em",fill:RUSTD});
|
||||
txt(548,86,"sherpa-onnx · segment + embed + cluster",{w:700,sz:14,fill:SOFT});
|
||||
|
||||
svg.appendChild(el("line",{x1:455,y1:476,x2:1100,y2:476,stroke:COLD,"stroke-width":2.5,"stroke-dasharray":"3 7"}));
|
||||
txt(462,498,"PATH B",{w:700,sz:13,ls:".2em",fill:COLD});
|
||||
txt(548,498,"vibevoice · single ASR pass",{w:700,sz:14,fill:SOFT});
|
||||
|
||||
// ============ PATH A (top, rust) ============
|
||||
// four boxes in a row
|
||||
const aBoxW=148, aBoxH=92, aY=120;
|
||||
const aXs=[470,640,810,980];
|
||||
const aBoxes=[
|
||||
{n:"segment", s:"VAD windows"},
|
||||
{n:"embed", s:"speaker vec"},
|
||||
{n:"cluster", s:"group by ID"},
|
||||
{n:"labelled", s:"segments"},
|
||||
];
|
||||
aBoxes.forEach((b,i)=>{
|
||||
const x=aXs[i];
|
||||
shadowRect(x,aY,aBoxW,aBoxH,"#EFE0BF",RUST,3.5);
|
||||
txt(x+aBoxW/2,aY+44,b.n,{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:INK});
|
||||
txt(x+aBoxW/2,aY+68,b.s,{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
});
|
||||
// arrows between A boxes
|
||||
for(let i=0;i<aXs.length-1;i++){
|
||||
arrow(aXs[i]+aBoxW, aY+aBoxH/2, aXs[i+1], aY+aBoxH/2, RUST);
|
||||
}
|
||||
|
||||
// ============ PATH B (bottom, cold) ============
|
||||
const bBoxW=232, bBoxH=92, bY=372;
|
||||
const bXs=[520,820];
|
||||
const bBoxes=[
|
||||
{n:"single ASR pass", s:"transcribe + tag speakers"},
|
||||
{n:"segments + transcript", s:"text per speaker turn"},
|
||||
];
|
||||
bBoxes.forEach((b,i)=>{
|
||||
const x=bXs[i];
|
||||
shadowRect(x,bY,bBoxW,bBoxH,PAPER,COLD,3.5);
|
||||
txt(x+bBoxW/2,bY+44,b.n,{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:INK});
|
||||
txt(x+bBoxW/2,bY+70,b.s,{w:700,sz:13,a:"middle",fill:COLD});
|
||||
});
|
||||
arrow(bXs[0]+bBoxW, bY+bBoxH/2, bXs[1], bY+bBoxH/2, COLD);
|
||||
|
||||
// ============ ffmpeg -> branch into A and B ============
|
||||
// to Path A first box
|
||||
arrow(390,266,aXs[0],aY+aBoxH/2,RUST,"2 8");
|
||||
// to Path B first box
|
||||
arrow(390,294,bXs[0],bY+bBoxH/2,COLD,"2 8");
|
||||
|
||||
// ============ OUTPUT (right) ============
|
||||
const oX=1170, oW=270, oY=216, oH=128;
|
||||
shadowRect(oX,oY,oW,oH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:oX,y:oY,width:oW,height:50,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:oX,y1:oY+50,x2:oX+oW,y2:oY+50,stroke:INK,"stroke-width":4}));
|
||||
txt(oX+oW/2,oY+33,"output",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:PAPER});
|
||||
const fmts=["json","verbose_json","rttm"];
|
||||
fmts.forEach((f,i)=>{
|
||||
const fy=oY+66+i*20;
|
||||
txt(oX+24,fy+7,"·",{w:800,sz:18,fill:RUSTD});
|
||||
txt(oX+40,fy+7,f,{f:"Bricolage Grotesque",w:700,sz:18,fill:INK});
|
||||
});
|
||||
|
||||
// ============ converge A and B -> output ============
|
||||
arrow(aXs[3]+aBoxW, aY+aBoxH/2, oX, oY+oH*0.42, RUST);
|
||||
arrow(bXs[1]+bBoxW, bY+bBoxH/2, oX, oY+oH*0.74, COLD);
|
||||
</script>
|
||||
</body>
|
||||
</html>
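The `arrow()` helper in `diarization-pipeline.html` draws each link as a cubic Bézier whose two control points share the horizontal midpoint of the endpoints, and it ends the curve 11 units before `x2`, where the two arrowhead strokes are drawn. The sketch below isolates that path math as a pure function so it can be checked without a DOM; `arrowPaths` and its `inset` parameter are hypothetical names, not part of the file:

```javascript
// Pure version of the path math in arrow(); no SVG elements are created.
// `arrowPaths` is a hypothetical name introduced for illustration.
function arrowPaths(x1, y1, x2, y2, inset = 11, a = 7) {
  const mx = (x1 + x2) / 2; // both control points share the midpoint x
  const tipX = x2 - inset;  // curve and arrowhead tip stop short of x2
  return {
    // Leaves (x1,y1) horizontally and arrives at (tipX,y2) horizontally.
    shaft: `M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${tipX} ${y2}`,
    // Two short strokes drawn backwards from the tip form the arrowhead.
    head: `M ${tipX} ${y2} l -${a + 4} -${a} M ${tipX} ${y2} l -${a + 4} ${a}`,
  };
}

// The audio -> ffmpeg arrow from the diagram: arrow(170,280,212,280,INK)
const p = arrowPaths(170, 280, 212, 280);
console.log(p.shaft); // M 170 280 C 191 280, 191 280, 201 280
console.log(p.head);  // M 201 280 l -11 -7 M 201 280 l -11 7
```

When `y1` equals `y2` the curve degenerates to a straight line; when they differ (as on the ffmpeg branch into Path A and Path B) the shared midpoint gives a symmetric S-curve.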
|
||||
BIN
docs/static/images/diagrams/diarization-pipeline.png
vendored
Normal file
|
Size: 216 KiB
170
docs/static/images/diagrams/distributed-mode-arch.html
vendored
Normal file
@@ -0,0 +1,170 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Distributed Mode</div>
|
||||
<h1>One control plane, <em>many workers</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">SCALE</div>
|
||||
<div class="s">out</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Stateless frontends, a shared <b>NATS/Postgres</b> plane, and generic workers running per-model backends.</div>
|
||||
<div class="url">localai.io<span>/features/distributed-mode</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===================== columns =====================
|
||||
// 1) LOAD BALANCER (far left)
|
||||
// 2) FRONTENDS (SmartRouter x N)
|
||||
// 3) STATE PLANE (center)
|
||||
// 4) WORKERS (x N)
|
||||
|
||||
// ---------- LOAD BALANCER ----------
|
||||
txt(20,30,"INGRESS",{w:700,sz:13,ls:".2em",fill:SOFT});
|
||||
const LBX=20, LBW=150, LBY=222, LBH=116;
|
||||
shadowRect(LBX,LBY,LBW,LBH,COLD,INK,3.5);
|
||||
txt(LBX+LBW/2,LBY+50,"Load",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:PAPER});
|
||||
txt(LBX+LBW/2,LBY+78,"balancer",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:PAPER});
|
||||
|
||||
// ---------- FRONTENDS ----------
|
||||
txt(238,30,"STATELESS FRONTENDS",{w:700,sz:13,ls:".2em",fill:SOFT});
|
||||
const FX=238, FW=232, FH=132, fY=[58,214,370];
|
||||
fY.forEach((y,i)=>{
|
||||
shadowRect(FX,y,FW,FH,PAPER2,INK,3.5);
|
||||
txt(FX+18,y+44,"SmartRouter",{f:"Bricolage Grotesque",w:800,sz:24});
|
||||
txt(FX+18,y+72,"frontend #"+(i+1),{w:700,sz:15,fill:SOFT});
|
||||
txt(FX+18,y+104,"routing · API · UI",{w:600,sz:14,fill:DIM});
|
||||
});
|
||||
|
||||
// ---------- STATE PLANE ----------
|
||||
txt(560,30,"SHARED STATE PLANE",{w:700,sz:13,ls:".2em",fill:SOFT});
|
||||
const SPX=560, SPW=300, SPY=46, SPH=470;
|
||||
shadowRect(SPX,SPY,SPW,SPH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:SPX,y:SPY,width:SPW,height:58,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:SPX,y1:SPY+58,x2:SPX+SPW,y2:SPY+58,stroke:INK,"stroke-width":4}));
|
||||
txt(SPX+22,SPY+38,"Control plane",{f:"Bricolage Grotesque",w:800,sz:26,fill:PAPER});
|
||||
// chips
|
||||
const chips=[
|
||||
{n:"PostgreSQL", s:"shared config & state"},
|
||||
{n:"NATS", s:"jobs · messaging bus"},
|
||||
{n:"S3 (optional)", s:"model & artifact store"},
|
||||
];
|
||||
const CHX=SPX+24, CHW=SPW-48, CHH=104; let cy=SPY+82;
|
||||
chips.forEach(c=>{
|
||||
svg.appendChild(el("rect",{x:CHX,y:cy,width:CHW,height:CHH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(CHX+18,cy+44,c.n,{f:"Bricolage Grotesque",w:800,sz:24});
|
||||
txt(CHX+18,cy+76,c.s,{w:700,sz:14,fill:SOFT});
|
||||
cy+=CHH+18;
|
||||
});
|
||||
|
||||
// ---------- WORKERS ----------
|
||||
txt(1460,30,"GENERIC WORKERS",{w:700,sz:13,ls:".2em",a:"end",fill:SOFT});
|
||||
const WX=950, WW=290, WH=132, wY=[58,214,370];
|
||||
const workerChips=[
|
||||
["llama.cpp","vLLM","whisper"],
|
||||
["llama.cpp","stable-diff"],
|
||||
["MLX","vLLM","embeddings"],
|
||||
];
|
||||
wY.forEach((y,i)=>{
|
||||
shadowRect(WX,y,WW,WH,"#EFE0BF",INK,3.5);
|
||||
txt(WX+18,y+40,"Worker #"+(i+1),{f:"Bricolage Grotesque",w:800,sz:24});
|
||||
txt(WX+WW-18,y+38,"per-model gRPC",{w:700,sz:12,ls:".04em",a:"end",fill:RUSTD});
|
||||
// process chips
|
||||
const procs=workerChips[i];
|
||||
const pw=(WW-36-(procs.length-1)*10)/procs.length, ph=46, px0=WX+18, py=y+WH-ph-16;
|
||||
procs.forEach((p,j)=>{
|
||||
const px=px0+j*(pw+10);
|
||||
svg.appendChild(el("rect",{x:px,y:py,width:pw,height:ph,fill:PAPER,stroke:INK,"stroke-width":2}));
|
||||
txt(px+pw/2,py+ph/2-2,p,{f:"Bricolage Grotesque",w:700,sz:13.5,a:"middle"});
|
||||
txt(px+pw/2,py+ph/2+14,"gRPC",{w:600,sz:10,ls:".06em",a:"middle",fill:DIM});
|
||||
});
|
||||
});
|
||||
|
||||
// ===================== ARROWS =====================
|
||||
// LB -> each frontend
|
||||
fY.forEach((y)=> arrow(LBX+LBW, LBY+LBH/2, FX, y+FH/2, INK));
|
||||
// frontends -> state plane (solid, control). Target the plane left edge near matching height,
|
||||
// clamped inside the plane body so arrowheads never land on the title bar or below the box.
|
||||
fY.forEach((y)=>{
|
||||
const ty=Math.max(SPY+90, Math.min(SPY+SPH-30, y+FH/2));
|
||||
arrow(FX+FW, y+FH/2, SPX, ty, INK);
|
||||
});
|
||||
|
||||
// NATS messaging bus -> workers (dashed). Workers coordinate via NATS;
|
||||
// PostgreSQL is the frontends' shared state, not something workers connect to.
|
||||
const natsY = SPY+82+CHH+18 + CHH/2; // NATS chip center y
|
||||
wY.forEach((y)=> arrow(SPX+SPW, natsY, WX, y+WH/2, RUSTD, "2 8"));
|
||||
// label the NATS bus arrows
|
||||
const labW=140, labH=26, labX=(SPX+SPW+WX)/2-labW/2, labY=natsY-46;
|
||||
svg.appendChild(el("rect",{x:labX,y:labY,width:labW,height:labH,fill:PAPER,stroke:RUSTD,"stroke-width":2}));
|
||||
txt(labX+labW/2,labY+18,"backend.install",{f:"Bricolage Grotesque",w:700,sz:14,a:"middle",fill:RUSTD});
|
||||
|
||||
// ---- annotated arrow: frontend -> worker : LoadModel (gRPC) ----
|
||||
arrow(FX+FW, fY[2]+FH-24, WX, wY[2]+WH-30, COLD, "4 7");
|
||||
const lab2W=176, lab2H=26, lab2X=(FX+FW+WX)/2-lab2W/2 - 40, lab2Y=fY[2]+FH+24;
|
||||
svg.appendChild(el("rect",{x:lab2X,y:lab2Y,width:lab2W,height:lab2H,fill:PAPER,stroke:COLD,"stroke-width":2}));
|
||||
txt(lab2X+lab2W/2,lab2Y+18,"LoadModel (gRPC)",{f:"Bricolage Grotesque",w:700,sz:14,a:"middle",fill:COLD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
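Two pieces of layout arithmetic in `distributed-mode-arch.html` are easy to check by hand: the clamp that keeps frontend arrows inside the control-plane body, and the `natsY` value that aims the worker arrows at the NATS chip. A small sketch with the constants copied from the script; the `clamp` helper is a hypothetical convenience, since the file inlines `Math.max`/`Math.min`:

```javascript
// Constants copied from the distributed-mode diagram script.
const SPY = 46, SPH = 470;           // state-plane box: top y and height
const FH = 132, fY = [58, 214, 370]; // frontend box height and top y values
const CHH = 104;                     // chip height inside the state plane

// Hypothetical helper: clamp v into [lo, hi].
const clamp = (v, lo, hi) => Math.max(lo, Math.min(hi, v));

// Each frontend arrow leaves at the box's vertical center and lands on the
// plane's left edge at the same height, kept below the title bar (SPY+90)
// and above the bottom edge (SPY+SPH-30).
const targets = fY.map((y) => clamp(y + FH / 2, SPY + 90, SPY + SPH - 30));
console.log(targets); // 136, 280, 436 (the first is pushed down from 124)

// Center y of the second chip (NATS): first chip top, one chip, gap, half chip.
const natsY = SPY + 82 + CHH + 18 + CHH / 2;
console.log(natsY); // 302
```

Only the first frontend is affected by the clamp: its center at y = 124 would otherwise land near the 58-unit title bar, which spans y = 46 to 104.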
|
||||
BIN
docs/static/images/diagrams/distributed-mode-arch.png
vendored
Normal file
|
Size: 321 KiB
164
docs/static/images/diagrams/ds4-layer-split.html
vendored
Normal file
@@ -0,0 +1,164 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> ds4 layer split</div>
|
||||
<h1>Workers dial <em>in</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">LAYER</div>
|
||||
<div class="s">split</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">ds4 workers connect to the coordinator <b>(llama.cpp RPC dials the other direction).</b></div>
|
||||
<div class="url">localai.io<span>/features/distributed-mode</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===================== PANELS =====================
|
||||
// LEFT panel: ds4 (rust-red accent) - workers dial IN to coordinator
|
||||
// RIGHT panel: llama.cpp RPC (cold teal) - main dials OUT to rpc-servers
|
||||
|
||||
// ---- panel frames + headers ----
|
||||
const PY=24, PH=448, PW=692;
|
||||
const LPX=18, RPX=770;
|
||||
|
||||
// LEFT panel frame
|
||||
svg.appendChild(el("rect",{x:LPX,y:PY,width:PW,height:PH,fill:"none",stroke:RUSTD,"stroke-width":2,"stroke-dasharray":"2 7"}));
|
||||
// LEFT header bar
|
||||
shadowRect(LPX+16,PY+22,300,44,RUST,INK,3.5);
|
||||
txt(LPX+34,PY+52,"ds4",{f:"Bricolage Grotesque",w:800,sz:26,fill:PAPER});
|
||||
txt(LPX+316-18,PY+50,"layer split",{w:700,sz:13,ls:".1em",a:"end",fill:"#F1D9C8"});
|
||||
|
||||
// RIGHT panel frame
|
||||
svg.appendChild(el("rect",{x:RPX,y:PY,width:PW,height:PH,fill:"none",stroke:COLD,"stroke-width":2,"stroke-dasharray":"2 7"}));
|
||||
// RIGHT header bar
|
||||
shadowRect(RPX+16,PY+22,372,44,COLD,INK,3.5);
|
||||
txt(RPX+34,PY+52,"llama.cpp RPC",{f:"Bricolage Grotesque",w:800,sz:26,fill:PAPER});
|
||||
txt(RPX+388-18,PY+50,"distributed",{w:700,sz:13,ls:".1em",a:"end",fill:"#DCEBEC"});
|
||||
|
||||
// ============== LEFT: workers -> coordinator ==============
|
||||
// coordinator centered vertically in panel
|
||||
const coW=246, coH=104;
|
||||
const coX=LPX+PW-coW-40, coY=PY+172;
|
||||
shadowRect(coX,coY,coW,coH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:coX,y:coY,width:coW,height:36,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:coX,y1:coY+36,x2:coX+coW,y2:coY+36,stroke:INK,"stroke-width":3}));
|
||||
txt(coX+coW/2,coY+26,"coordinator",{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:PAPER});
|
||||
txt(coX+coW/2,coY+62,"merges slices",{w:700,sz:15,a:"middle"});
|
||||
txt(coX+coW/2,coY+86,"serves the API",{w:600,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// two worker boxes on the left of panel
|
||||
const wW=232, wH=92;
|
||||
const wX=LPX+40;
|
||||
const wY=[PY+96, PY+264];
|
||||
const wData=[
|
||||
{n:"worker A",s:"layers 0:19"},
|
||||
{n:"worker B",s:"layers 20:output"},
|
||||
];
|
||||
wData.forEach((d,i)=>{
|
||||
shadowRect(wX,wY[i],wW,wH,"#EFE0BF",INK,3.5);
|
||||
txt(wX+18,wY[i]+38,d.n,{f:"Bricolage Grotesque",w:800,sz:22});
|
||||
txt(wX+18,wY[i]+66,d.s,{w:700,sz:16,fill:RUSTD});
|
||||
});
|
||||
|
||||
// arrows: workers dial IN toward coordinator (arrowhead at coordinator)
|
||||
arrow(wX+wW, wY[0]+wH/2, coX, coY+38, RUSTD, "none");
|
||||
arrow(wX+wW, wY[1]+wH/2, coX, coY+coH-22, RUSTD, "none");
|
||||
|
||||
// caption under left arrows
|
||||
txt(LPX+PW/2, PY+PH-22, "activations flow through the slices", {w:600,sz:14,a:"middle",fill:SOFT,ls:".02em"});
|
||||
|
||||
// ============== RIGHT: main -> rpc-servers ==============
|
||||
// main server on the LEFT of right panel
|
||||
const msW=246, msH=104;
|
||||
const msX=RPX+40, msY=PY+172;
|
||||
shadowRect(msX,msY,msW,msH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:msX,y:msY,width:msW,height:36,fill:COLD}));
|
||||
svg.appendChild(el("line",{x1:msX,y1:msY+36,x2:msX+msW,y2:msY+36,stroke:INK,"stroke-width":3}));
|
||||
txt(msX+msW/2,msY+26,"main server",{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:PAPER});
|
||||
txt(msX+msW/2,msY+62,"holds the model",{w:700,sz:15,a:"middle"});
|
||||
txt(msX+msW/2,msY+86,"offloads layers",{w:600,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// two rpc-server boxes on the RIGHT of right panel
|
||||
const rW=232, rH=92;
|
||||
const rX=RPX+PW-rW-40;
|
||||
const rY=[PY+96, PY+264];
|
||||
const rData=[
|
||||
{n:"rpc-server",s:"remote GPU/CPU"},
|
||||
{n:"rpc-server",s:"remote GPU/CPU"},
|
||||
];
|
||||
rData.forEach((d,i)=>{
|
||||
shadowRect(rX,rY[i],rW,rH,"#DCE7E7",INK,3.5);
|
||||
txt(rX+18,rY[i]+38,d.n,{f:"Bricolage Grotesque",w:800,sz:22});
|
||||
txt(rX+18,rY[i]+66,d.s,{w:700,sz:16,fill:COLD});
|
||||
});
|
||||
|
||||
// arrows: main dials OUT toward rpc-servers (arrowhead at rpc-servers)
|
||||
arrow(msX+msW, msY+38, rX, rY[0]+rH/2, COLD, "2 8");
|
||||
arrow(msX+msW, msY+msH-22, rX, rY[1]+rH/2, COLD, "2 8");
|
||||
|
||||
// caption under right arrows
|
||||
txt(RPX+PW/2, PY+PH-22, "main opens connections to rpc-servers", {w:600,sz:14,a:"middle",fill:SOFT,ls:".02em"});
|
||||
</script>
|
||||
</body>
|
||||
</html>
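The `arrow()` helper in the script above builds each connector from two SVG paths: a cubic Bézier shaft whose control points both sit at the horizontal midpoint, and a two-stroke arrowhead drawn 11px short of the target. A minimal DOM-free sketch of that geometry follows; `arrowPaths` and its return shape are hypothetical names introduced here for illustration, and the constants are copied from the diagram script.

```javascript
// Pure-function sketch of the diagram's arrow(): returns the path strings
// instead of appending <path> elements. arrowPaths is a hypothetical name.
function arrowPaths(x1, y1, x2, y2) {
  const mx = (x1 + x2) / 2; // both control points share the horizontal midpoint
  const tipX = x2 - 11;     // the shaft stops 11px before the target edge
  const a = 7;              // arrowhead half-height; each stroke runs a+4 px back
  return {
    shaft: `M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${tipX} ${y2}`,
    head: `M ${tipX} ${y2} l -${a + 4} -${a} M ${tipX} ${y2} l -${a + 4} ${a}`,
  };
}

// worker A -> coordinator, using the layout constants from the script above
const LPX = 18, PY = 24, PW = 692;
const wX = LPX + 40, wW = 232, wH = 92, wY0 = PY + 96;
const coW = 246, coX = LPX + PW - coW - 40, coY = PY + 172;

const workerA = arrowPaths(wX + wW, wY0 + wH / 2, coX, coY + 38);
console.log(workerA.shaft); // M 290 166 C 357 166, 357 234, 413 234
console.log(workerA.head);  // M 413 234 l -11 -7 M 413 234 l -11 7
```

Because the start and end tangents are horizontal, the connector leaves the worker box and enters the coordinator box at a right angle to their vertical edges, whatever the vertical offset between them.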
|
||||
BIN docs/static/images/diagrams/ds4-layer-split.png (vendored, Normal file, size after: 233 KiB)
docs/static/images/diagrams/face-recognition-flow.html (vendored, Normal file, 160 lines added)
@@ -0,0 +1,160 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Face Recognition</div>
|
||||
<h1>Identify, with a <em>liveness gate</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">1:N</div>
|
||||
<div class="s">+ live</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">1:N match against a vector store; <b>anti-spoofing can veto a verification.</b></div>
|
||||
<div class="url">localai.io<span>/features/face-recognition</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// vertical arrow (top -> bottom)
|
||||
function arrowV(x1,y1,x2,y2,color,dash){
|
||||
const my=(y1+y2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${x1} ${my}, ${x2} ${my}, ${x2} ${y2-11}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2} ${y2-11} l -${a} -${a+4} M ${x2} ${y2-11} l ${a} -${a+4}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// small reusable "node" box helper
|
||||
function node(x,y,w,h,title,sub,fill,strokeCol,sw,dash){
|
||||
if(dash){ svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:strokeCol||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash})); }
|
||||
else { shadowRect(x,y,w,h,fill,strokeCol,sw); }
|
||||
if(sub){
|
||||
txt(x+w/2,y+h/2-2,title,{f:"Bricolage Grotesque",w:800,sz:21,a:"middle"});
|
||||
txt(x+w/2,y+h/2+19,sub,{w:700,sz:12.5,a:"middle",fill:SOFT});
|
||||
} else {
|
||||
txt(x+w/2,y+h/2+7,title,{f:"Bricolage Grotesque",w:800,sz:21,a:"middle"});
|
||||
}
|
||||
}
|
||||
|
||||
// ============ VECTOR STORE (shared, top-right anchor) ============
|
||||
const VSx=1110, VSy=40, VSw=320, VSh=120;
|
||||
shadowRect(VSx,VSy,VSw,VSh,HI,INK,4);
|
||||
txt(VSx+22,VSy+44,"Vector store",{f:"Bricolage Grotesque",w:800,sz:27});
|
||||
txt(VSx+22,VSy+74,"face embeddings · index",{w:700,sz:14,fill:SOFT});
|
||||
txt(VSx+22,VSy+98,"shared by register & identify",{w:700,sz:13,fill:RUSTD});
|
||||
|
||||
// ============ REGISTER lane (top) ============
|
||||
txt(20,42,"REGISTER",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const rY=70, rH=66;
|
||||
node(24, rY, 200, rH, "Image", "enroll photo", PAPER2);
|
||||
node(304, rY, 200, rH, "Face embedding", "vectorize", PAPER2);
|
||||
// arrows register
|
||||
arrow(24+200, rY+rH/2, 304, rY+rH/2, COLD);
|
||||
// embedding -> store (up-right into vector store)
|
||||
arrow(304+200, rY+rH/2, VSx, VSy+34, COLD);
|
||||
txt(560, rY-2, "store", {w:700,sz:13,a:"middle",fill:COLD});
|
||||
|
||||
// ============ IDENTIFY lane (middle) ============
|
||||
txt(20,232,"IDENTIFY",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const iY=258, iH=66;
|
||||
node(24, iY, 178, iH, "Probe image", "query face", PAPER2);
|
||||
node(254, iY, 168, iH, "Embedding", "vectorize", PAPER2);
|
||||
node(474, iY, 200, iH, "Top-K cosine", "search store", PAPER2);
|
||||
node(726, iY, 168, iH, "Match", "best candidate", "#EFE0BF");
|
||||
// arrows identify chain
|
||||
arrow(24+178, iY+iH/2, 254, iY+iH/2, INK);
|
||||
arrow(254+168, iY+iH/2, 474, iY+iH/2, INK);
|
||||
arrow(474+200, iY+iH/2, 726, iY+iH/2, INK);
|
||||
// store -> top-K cosine (dashed lookup, from vector store down)
|
||||
arrowV(VSx+VSw/2, VSy+VSh, 574, iY, RUSTD, "3 8");
|
||||
txt(VSx+VSw/2+12, VSy+VSh+34, "lookup", {w:700,sz:13,a:"middle",fill:RUSTD});
|
||||
|
||||
// ============ VERIFY (bottom, highlight) ============
|
||||
txt(20,432,"VERIFY",{w:700,sz:14,ls:".2em",fill:RUSTD});
|
||||
|
||||
// liveness / anti-spoof box (left-lower)
|
||||
const lvX=24, lvY=452, lvW=240, lvH=78;
|
||||
shadowRect(lvX,lvY,lvW,lvH,PAPER,RUST,4);
|
||||
txt(lvX+lvW/2,lvY+34,"Liveness / anti-spoof",{f:"Bricolage Grotesque",w:800,sz:19,a:"middle",fill:RUST});
|
||||
txt(lvX+lvW/2,lvY+58,"can VETO the match",{w:700,sz:13,a:"middle",fill:RUSTD});
|
||||
|
||||
// AND gate
|
||||
const agX=560, agY=458, agW=160, agH=66;
|
||||
shadowRect(agX,agY,agW,agH,RUST,INK,4);
|
||||
txt(agX+agW/2,agY+30,"AND gate",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:PAPER});
|
||||
txt(agX+agW/2,agY+52,"both must pass",{w:700,sz:12.5,a:"middle",fill:"#F1D9C8"});
|
||||
|
||||
// verified box
|
||||
const vbX=820, vbY=452, vbW=200, vbH=78;
|
||||
shadowRect(vbX,vbY,vbW,vbH,HI,INK,4);
|
||||
txt(vbX+vbW/2,vbY+34,"Verified",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(vbX+vbW/2,vbY+58,"identity confirmed",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
// match result -> AND gate (from identify Match box, down)
|
||||
arrowV(726+168/2, iY+iH, agX+50, agY, INK);
|
||||
txt(726+168/2+86, iY+iH+44, "match result", {w:700,sz:13,a:"middle",fill:INK});
|
||||
// liveness -> AND gate (gating input, rust)
|
||||
arrow(lvX+lvW, lvY+lvH/2, agX, agY+agH/2, RUST);
|
||||
// AND gate -> verified
|
||||
arrow(agX+agW, agY+agH/2, vbX, vbY+vbH/2, RUST);
|
||||
|
||||
</script>
|
||||
</body>
|
||||
</html>
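The flow drawn above (embed the probe, run a top-K cosine search against the store, then AND the match with a liveness check) can be sketched in a few lines. This is an illustration of the logic in the diagram only: `cosine`, `topK`, `verify`, the `0.8` threshold, and the shape of the store entries are all assumptions made for this sketch, not LocalAI APIs or defaults. Embeddings are assumed to be non-zero vectors of equal length.

```javascript
// Hypothetical sketch of the diagram's identify + verify flow.
// None of these names or values come from LocalAI.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// 1:N search: score every stored embedding, keep the k best.
function topK(store, probe, k) {
  return store
    .map(entry => ({ id: entry.id, score: cosine(entry.vec, probe) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// AND gate: the best match must clear the threshold AND liveness must pass.
function verify(store, probe, live, threshold = 0.8) {
  const [best] = topK(store, probe, 1);
  const matched = best !== undefined && best.score >= threshold;
  return {
    id: matched ? best.id : null,
    score: best ? best.score : null,
    verified: matched && live,
  };
}

const faces = [
  { id: "alice", vec: [1, 0, 0] },
  { id: "bob", vec: [0, 1, 0] },
];
const probe = [0.9, 0.1, 0];

console.log(verify(faces, probe, true));  // alice matches and liveness passes
console.log(verify(faces, probe, false)); // same match, vetoed by liveness
```

Keeping the liveness result as a separate boolean input mirrors the diagram: the anti-spoof check never changes which candidate is returned, it only decides whether the match counts as verified.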
|
||||
BIN docs/static/images/diagrams/face-recognition-flow.png (vendored, Normal file, size after: 250 KiB)
docs/static/images/diagrams/federated-vs-worker.html (vendored, Normal file, 194 lines added)
@@ -0,0 +1,194 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Distributed</div>
|
||||
<h1>Federated vs <em>worker</em> mode</h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">TWO</div>
|
||||
<div class="s">modes</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Federated routes whole requests to one node; worker shards one model across machines.</div>
|
||||
<div class="url">localai.io<span>/features/distributed_inferencing</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", COLDD="#2E5256", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// vertical arrow (top -> bottom)
|
||||
function vArrow(x1,y1,x2,y2,color,dash){
|
||||
const my=(y1+y2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${x1} ${my}, ${x2} ${my}, ${x2} ${y2-11}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2} ${y2-11} l -${a} -${a+4} M ${x2} ${y2-11} l ${a} -${a+4}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// tag pill
|
||||
function pill(x,y,w,s,col){
|
||||
const h=28;
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill:PAPER,stroke:col,"stroke-width":2.5}));
|
||||
txt(x+w/2,y+19,s,{w:700,sz:12,ls:".1em",a:"middle",fill:col});
|
||||
}
|
||||
|
||||
// ============= LEFT PANEL : FEDERATED (cold teal) =============
|
||||
const LX=20, LW=680, PY=8, PH=544;
|
||||
svg.appendChild(el("rect",{x:LX,y:PY,width:LW,height:PH,fill:"none",stroke:COLD,"stroke-width":2.5,"stroke-dasharray":"2 8"}));
|
||||
txt(LX+18,PY+34,"FEDERATED",{f:"Bricolage Grotesque",w:800,sz:26,fill:COLDD});
|
||||
pill(LX+LW-176,PY+12,158,"WHOLE REQUEST",COLDD);
|
||||
|
||||
// request in
|
||||
const LCEN=LX+LW/2;
|
||||
shadowRect(LCEN-95,PY+58,190,52,PAPER2);
|
||||
txt(LCEN,PY+91,"Request",{f:"Bricolage Grotesque",w:700,sz:22,a:"middle"});
|
||||
|
||||
// load balancer
|
||||
const LBY=PY+150;
|
||||
shadowRect(LCEN-130,LBY,260,60,HI,COLDD,3.5);
|
||||
txt(LCEN,LBY+38,"Load balancer",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:INK});
|
||||
|
||||
// three nodes
|
||||
const fNodeY=PY+300, fNW=180, fNH=150;
|
||||
const fCols=[LX+40, LX+LW/2-fNW/2, LX+LW-fNW-40];
|
||||
const fNodes=[{busy:false},{busy:true},{busy:false}];
|
||||
fNodes.forEach((n,i)=>{
|
||||
const cx=fCols[i], cc=cx+fNW/2;
|
||||
if(n.busy){
|
||||
shadowRect(cx,fNodeY,fNW,fNH,PAPER,COLDD,4);
|
||||
// header bar teal
|
||||
svg.appendChild(el("rect",{x:cx,y:fNodeY,width:fNW,height:40,fill:COLD}));
|
||||
svg.appendChild(el("line",{x1:cx,y1:fNodeY+40,x2:cx+fNW,y2:fNodeY+40,stroke:INK,"stroke-width":3}));
|
||||
txt(cc,fNodeY+27,"Node "+(i+1),{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:PAPER});
|
||||
} else {
|
||||
svg.appendChild(el("rect",{x:cx,y:fNodeY,width:fNW,height:fNH,fill:PAPER2,stroke:DIM,"stroke-width":3.5,"stroke-dasharray":"4 7"}));
|
||||
svg.appendChild(el("rect",{x:cx,y:fNodeY,width:fNW,height:40,fill:"none"}));
|
||||
txt(cc,fNodeY+27,"Node "+(i+1),{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:SOFT});
|
||||
svg.appendChild(el("line",{x1:cx,y1:fNodeY+40,x2:cx+fNW,y2:fNodeY+40,stroke:DIM,"stroke-width":2,"stroke-dasharray":"4 6"}));
|
||||
}
|
||||
// full model block
|
||||
const my=fNodeY+58;
|
||||
svg.appendChild(el("rect",{x:cx+22,y:my,width:fNW-44,height:62,fill:n.busy?HI:PAPER,stroke:n.busy?INK:DIM,"stroke-width":2.5}));
|
||||
txt(cc,my+27,"FULL",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:n.busy?INK:DIM});
|
||||
txt(cc,my+49,"model",{w:700,sz:14,a:"middle",fill:n.busy?SOFT:DIM});
|
||||
if(!n.busy) txt(cc,fNodeY+fNH+22,"idle",{w:700,sz:13,a:"middle",ls:".12em",fill:DIM});
|
||||
else txt(cc,fNodeY+fNH+22,"serves the request",{w:700,sz:13,a:"middle",ls:".04em",fill:COLDD});
|
||||
});
|
||||
|
||||
// arrows: request -> LB
|
||||
vArrow(LCEN,PY+110,LCEN,LBY,INK);
|
||||
// LB -> nodes (chosen one solid teal, others dashed dim)
|
||||
fNodes.forEach((n,i)=>{
|
||||
const cc=fCols[i]+fNW/2;
|
||||
vArrow(LCEN,LBY+60,cc,fNodeY, n.busy?COLDD:DIM, n.busy?"none":"2 8");
|
||||
});
|
||||
// caption
|
||||
txt(LCEN,PH-6,"whole request → one node, full model",{f:"Bricolage Grotesque",w:700,sz:18,a:"middle",fill:COLDD});
|
||||
|
||||
// ============= divider =============
|
||||
svg.appendChild(el("line",{x1:740,y1:PY+8,x2:740,y2:PH-8,stroke:INK,"stroke-width":2.5,"stroke-dasharray":"3 9"}));
|
||||
|
||||
// ============= RIGHT PANEL : WORKER (rust red) =============
|
||||
const RX=760, RW=700;
|
||||
svg.appendChild(el("rect",{x:RX,y:PY,width:RW,height:PH,fill:"none",stroke:RUST,"stroke-width":2.5,"stroke-dasharray":"2 8"}));
|
||||
txt(RX+18,PY+34,"WORKER",{f:"Bricolage Grotesque",w:800,sz:26,fill:RUSTD});
|
||||
pill(RX+RW-176,PY+12,158,"SPLIT REQUEST",RUSTD);
|
||||
|
||||
const RCEN=RX+RW/2;
|
||||
// request in
|
||||
shadowRect(RCEN-95,PY+58,190,52,PAPER2);
|
||||
txt(RCEN,PY+91,"Request",{f:"Bricolage Grotesque",w:700,sz:22,a:"middle"});
|
||||
|
||||
// three worker nodes holding shards proportional to memory
|
||||
const wNodeY=PY+200;
|
||||
// shard heights grow with node memory (widths are fixed at wNW)
|
||||
const wShards=[{lbl:"shard 1",frac:"40%",h:170,mem:"16 GB"},{lbl:"shard 2",frac:"35%",h:150,mem:"12 GB"},{lbl:"shard 3",frac:"25%",h:120,mem:"8 GB"}];
|
||||
const wNW=190, wGap=20;
|
||||
const wTotW=wNW*3+wGap*2;
|
||||
const wStartX=RCEN-wTotW/2;
|
||||
const wCols=[wStartX, wStartX+wNW+wGap, wStartX+2*(wNW+wGap)];
|
||||
const wMaxH=170, wBaseY=wNodeY+wMaxH; // shards bottom-aligned
|
||||
|
||||
wShards.forEach((sd,i)=>{
|
||||
const cx=wCols[i], cc=cx+wNW/2;
|
||||
const top=wBaseY-sd.h;
|
||||
shadowRect(cx,top,wNW,sd.h,PAPER,RUSTD,3.5);
|
||||
// rust header
|
||||
svg.appendChild(el("rect",{x:cx,y:top,width:wNW,height:38,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:cx,y1:top+38,x2:cx+wNW,y2:top+38,stroke:INK,"stroke-width":3}));
|
||||
txt(cc,top+26,"Node "+(i+1),{f:"Bricolage Grotesque",w:800,sz:20,a:"middle",fill:PAPER});
|
||||
// shard fill
|
||||
svg.appendChild(el("rect",{x:cx+18,y:top+52,width:wNW-36,height:sd.h-72,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(cc,top+52+(sd.h-72)/2-4,sd.lbl,{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:INK});
|
||||
txt(cc,top+52+(sd.h-72)/2+18,"weights "+sd.frac,{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
// memory tag under node
|
||||
txt(cc,wBaseY+24,sd.mem+" mem",{w:700,sz:13,a:"middle",ls:".04em",fill:RUSTD});
|
||||
});
|
||||
|
||||
// request is split across the whole sharded fleet (all nodes active)
|
||||
wShards.forEach((sd,i)=>{
|
||||
const cc=wCols[i]+wNW/2;
|
||||
const top=wBaseY-sd.h;
|
||||
vArrow(RCEN,PY+110,cc,top,RUSTD);
|
||||
});
|
||||
|
||||
// caption
|
||||
txt(RCEN,PH-6,"weights sharded across all nodes",{f:"Bricolage Grotesque",w:700,sz:18,a:"middle",fill:RUSTD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
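The worker panel above labels its three nodes with 16 GB, 12 GB and 8 GB of memory and weight shares of 40%, 35% and 25%. As a rough sketch of what a strictly memory-proportional split would look like, the snippet below divides each node's memory by the fleet total. `shardFractions` is a hypothetical helper written for this illustration; it is not how LocalAI or llama.cpp computes the split.

```javascript
// Hypothetical helper: share of the weights per node if the split were
// exactly proportional to each node's memory.
function shardFractions(memGB) {
  const total = memGB.reduce((sum, m) => sum + m, 0);
  return memGB.map(m => m / total);
}

const fractions = shardFractions([16, 12, 8]);
const labels = fractions.map(f => (f * 100).toFixed(1) + "%");
console.log(labels.join(" / ")); // 44.4% / 33.3% / 22.2%
```

The exact ratios differ slightly from the rounded 40% / 35% / 25% labels and the 170 / 150 / 120 bar heights in the diagram, which read as illustrative values rather than computed ones.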
|
||||
BIN docs/static/images/diagrams/federated-vs-worker.png (vendored, Normal file, size after: 272 KiB)
docs/static/images/diagrams/finetune-job-lifecycle.html (vendored, Normal file, 158 lines added)
@@ -0,0 +1,158 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Fine-tuning jobs</div>
|
||||
<h1>The fine-tune <em>job lifecycle</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">SSE</div>
|
||||
<div class="s">progress</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Create, train with live SSE progress, then export to <b>LoRA, merged, or GGUF.</b></div>
|
||||
<div class="url">localai.io<span>/features/fine-tuning</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- SWIMLANE LABELS (left) ----------
|
||||
const LANEX=18;
|
||||
const lanes=[
|
||||
{n:"React UI", y:60},
|
||||
{n:"REST + SSE", y:165},
|
||||
{n:"Go service", y:300},
|
||||
{n:"gRPC backend", y:440},
|
||||
];
|
||||
lanes.forEach(l=>{
|
||||
txt(LANEX,l.y,l.n,{w:700,sz:13,ls:".14em",fill:SOFT});
|
||||
});
|
||||
// thin lane separators
|
||||
[120,250,390].forEach(y=>{
|
||||
svg.appendChild(el("line",{x1:LANEX,y1:y,x2:1462,y2:y,stroke:DIM,"stroke-width":1.5,"stroke-dasharray":"2 9"}));
|
||||
});
|
||||
|
||||
// ---------- MAIN PIPELINE BOXES ----------
|
||||
const PY=190, PH=92;
|
||||
const steps=[
|
||||
{x:40, w:230, t:"create job", s:"POST /v1/fine_tuning", accent:false},
|
||||
{x:330, w:300, t:"train", s:"emits SSE progress / loss", accent:true},
|
||||
{x:690, w:230, t:"checkpoints", s:"saved during run", accent:false},
|
||||
];
|
||||
steps.forEach(st=>{
|
||||
shadowRect(st.x,PY,st.w,PH,st.accent?RUST:PAPER2,INK,4);
|
||||
txt(st.x+st.w/2,PY+44,st.t,{f:"Bricolage Grotesque",w:800,sz:30,a:"middle",fill:st.accent?PAPER:INK});
|
||||
txt(st.x+st.w/2,PY+72,st.s,{w:700,sz:14,a:"middle",fill:st.accent?"#F1D9C8":SOFT});
|
||||
});
|
||||
|
||||
// export node
|
||||
const EX=980, EW=200, EH=92;
|
||||
shadowRect(EX,PY,EW,EH,HI,INK,4);
|
||||
txt(EX+EW/2,PY+44,"export",{f:"Bricolage Grotesque",w:800,sz:30,a:"middle",fill:INK});
|
||||
txt(EX+EW/2,PY+72,"pick a format",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- PIPELINE ARROWS ----------
|
||||
arrow(steps[0].x+steps[0].w, PY+PH/2, steps[1].x, PY+PH/2, INK);
|
||||
arrow(steps[1].x+steps[1].w, PY+PH/2, steps[2].x, PY+PH/2, INK);
|
||||
arrow(steps[2].x+steps[2].w, PY+PH/2, EX, PY+PH/2, INK);
|
||||
|
||||
// SSE feedback loop: train -> back up to REST+SSE lane (dashed cold)
|
||||
const tCx=steps[1].x+steps[1].w/2;
|
||||
svg.appendChild(el("path",{d:`M ${tCx} ${PY} C ${tCx} ${PY-70}, ${tCx} ${PY-70}, ${tCx-180} ${PY-70} L ${tCx-300} ${PY-70}`,fill:"none",stroke:COLD,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":"2 8"}));
|
||||
{const ax=tCx-300, ay=PY-70, a=7;
|
||||
svg.appendChild(el("path",{d:`M ${ax} ${ay} l ${a+4} -${a} M ${ax} ${ay} l ${a+4} ${a}`,fill:"none",stroke:COLD,"stroke-width":3.5,"stroke-linecap":"round"}));}
|
||||
// SSE event chip
|
||||
const sseW=200, sseH=42, sseX=tCx-300-sseW, sseY=PY-70-sseH/2;
|
||||
svg.appendChild(el("rect",{x:sseX,y:sseY,width:sseW,height:sseH,fill:PAPER,stroke:COLD,"stroke-width":2.5}));
|
||||
txt(sseX+sseW/2,sseY+19,"event: progress",{f:"Bricolage Grotesque",w:800,sz:16,a:"middle",fill:COLD});
|
||||
txt(sseX+sseW/2,sseY+35,"step · loss · status",{w:700,sz:11,a:"middle",ls:".04em",fill:SOFT});
|
||||
|
||||
// ---------- FORMAT CHIPS (fan-out, right) ----------
|
||||
const chips=[
|
||||
{t:"lora", s:"adapter only"},
|
||||
{t:"merged_16bit",s:"full fp16"},
|
||||
{t:"merged_4bit", s:"quantized"},
|
||||
{t:"gguf", s:"llama.cpp ready", accent:true},
|
||||
];
|
||||
const chW=240, chH=70, chX=1210, chGap=18;
|
||||
const totalH=chips.length*chH+(chips.length-1)*chGap;
|
||||
let chY=PY+PH/2-totalH/2;
|
||||
const ys=[];
|
||||
chips.forEach((c,i)=>{
|
||||
const y=chY+i*(chH+chGap);
|
||||
ys.push(y);
|
||||
if(c.accent){
|
||||
shadowRect(chX,y,chW,chH,RUST,INK,3.5);
|
||||
}else{
|
||||
shadowRect(chX,y,chW,chH,"#EFE0BF",INK,3.5);
|
||||
}
|
||||
txt(chX+20,y+34,c.t,{f:"Bricolage Grotesque",w:800,sz:24,fill:c.accent?PAPER:INK});
|
||||
txt(chX+20,y+56,c.s,{w:700,sz:13,fill:c.accent?"#F1D9C8":SOFT});
|
||||
});
|
||||
|
||||
// arrows: export -> each chip
|
||||
const exRX=EX+EW, exMidY=PY+PH/2;
|
||||
chips.forEach((c,i)=>{
|
||||
arrow(exRX, exMidY, chX, ys[i]+chH/2, c.accent?RUSTD:SOFT);
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
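The format-chip fan-out in the script above centers a stack of equal-height boxes on the midline of the export node. A minimal sketch of that arithmetic as a pure function follows; the helper name `stackYs` is hypothetical and not part of the diagram source, and the constants are the ones the script declares (`PY`, `PH`, `chH`, `chGap`).

```javascript
// Hypothetical helper (not in the diagram source): top-edge y for each of
// `count` boxes of height `h`, separated by `gap`, centered on `midY`.
function stackYs(count, h, gap, midY) {
  const totalH = count * h + (count - 1) * gap; // full height of the stack
  const top = midY - totalH / 2;                // top edge of the first box
  return Array.from({ length: count }, (_, i) => top + i * (h + gap));
}

// Same numbers as the script: PY=190, PH=92, chH=70, chGap=18, four chips.
const ys = stackYs(4, 70, 18, 190 + 92 / 2);
console.log(ys); // 69, 157, 245, 333
```

Keeping the layout math free of DOM calls makes it possible to check the geometry in Node before rendering the SVG.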
|
||||
BIN docs/static/images/diagrams/finetune-job-lifecycle.png (vendored, normal file, 222 KiB)
docs/static/images/diagrams/finetune-recipe.html (vendored, normal file, 144 lines)
@@ -0,0 +1,144 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Fine-tuning</div>
|
||||
<h1>Train, merge, <em>deploy</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">LoRA</div>
|
||||
<div class="s">to GGUF</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">From dataset to a servable GGUF, via LoRA fine-tune and merge.</div>
|
||||
<div class="url">localai.io<span>/advanced/fine-tuning</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- PIPELINE LAYOUT ----------
|
||||
// Six steps laid out in two rows of three, snaking left->right then right->left.
|
||||
// Each card: number badge, title, subtitle.
|
||||
const steps=[
|
||||
{n:"01", t:"Dataset", s:"JSONL prompts", tool:"your data", col:COLD},
|
||||
{n:"02", t:"Env & deps", s:"axolotl · CUDA", tool:"pip install", col:COLD},
|
||||
{n:"03", t:"Fine-tune", s:"LoRA adapter", tool:"axolotl", col:RUST},
|
||||
{n:"04", t:"Merge LoRA", s:"into base weights", tool:"peft merge", col:RUST},
|
||||
{n:"05", t:"Convert", s:"to GGUF + quantize", tool:"llama.cpp", col:RUST},
|
||||
{n:"06", t:"Load in LocalAI", s:"served via API", tool:"servable", col:RUST},
|
||||
];
|
||||
|
||||
const CW=410, CH=190;
|
||||
// row Y positions
|
||||
const rowY=[60, 320];
|
||||
// column X positions (3 columns) for top row left->right
|
||||
const colX=[40, 535, 1030];
|
||||
|
||||
// map step index -> {x,y, row}
|
||||
function slot(i){
|
||||
const row=Math.floor(i/3);
|
||||
let c=i%3;
|
||||
if(row===1) c=2-c; // snake: bottom row goes right->left
|
||||
return {x:colX[c], y:rowY[row], row, c};
|
||||
}
|
||||
|
||||
// draw connectors first (behind cards)
|
||||
// top row: 0->1->2 (left to right). then 2->3 (down). bottom row 3->4->5 (right to left).
|
||||
function center(i){const p=slot(i);return {cx:p.x+CW/2, cy:p.y+CH/2, x:p.x, y:p.y};}
|
||||
|
||||
// 0 -> 1 (right edge -> left edge)
|
||||
arrow(center(0).x+CW, center(0).cy, center(1).x, center(1).cy, INK);
|
||||
// 1 -> 2
|
||||
arrow(center(1).x+CW, center(1).cy, center(2).x, center(2).cy, INK);
|
||||
// 2 -> 3 (vertical drop, both in rightmost column)
|
||||
(function(){
|
||||
const a=center(2), b=center(3);
|
||||
const x=a.x+CW/2;
|
||||
svg.appendChild(el("path",{d:`M ${x} ${a.y+CH} L ${x} ${b.y-11}`,fill:"none",stroke:RUSTD,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
const k=7;
|
||||
svg.appendChild(el("path",{d:`M ${x} ${b.y-11} l -${k+4} -${k} M ${x} ${b.y-11} l ${k+4} -${k}`,fill:"none",stroke:RUSTD,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
})();
|
||||
// 3 -> 4 (right to left): from left edge of 3 to right edge of 4
|
||||
arrow(center(3).x, center(3).cy, center(4).x+CW+22, center(4).cy, RUST);
|
||||
// 4 -> 5
|
||||
arrow(center(4).x, center(4).cy, center(5).x+CW+22, center(5).cy, RUST);
|
||||
|
||||
// draw cards
|
||||
steps.forEach((st,i)=>{
|
||||
const p=slot(i);
|
||||
const fill = (st.col===RUST) ? "#EFE0BF" : PAPER2;
|
||||
shadowRect(p.x, p.y, CW, CH, fill, INK, 4);
|
||||
// colored title bar
|
||||
svg.appendChild(el("rect",{x:p.x,y:p.y,width:CW,height:62,fill:st.col}));
|
||||
svg.appendChild(el("line",{x1:p.x,y1:p.y+62,x2:p.x+CW,y2:p.y+62,stroke:INK,"stroke-width":4}));
|
||||
// step number on bar
|
||||
txt(p.x+24, p.y+42, st.n, {f:"Bricolage Grotesque",w:800,sz:30,fill:PAPER});
|
||||
txt(p.x+82, p.y+42, st.t, {f:"Bricolage Grotesque",w:800,sz:30,fill:PAPER});
|
||||
// subtitle
|
||||
txt(p.x+26, p.y+118, st.s, {f:"Bricolage Grotesque",w:700,sz:30,fill:INK});
|
||||
// tool tag bottom-right
|
||||
const tw=Math.max(140, st.tool.length*12+34), th=34, tx=p.x+CW-tw-22, ty=p.y+CH-th-20;
|
||||
svg.appendChild(el("rect",{x:tx,y:ty,width:tw,height:th,fill:PAPER,stroke:INK,"stroke-width":2.5}));
|
||||
txt(tx+tw/2, ty+24, st.tool, {w:700,sz:16,ls:".04em",a:"middle",fill:(st.col===RUST)?RUSTD:COLD});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
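The `slot(i)` function in the script above hard-codes a two-row, three-column snake. A possible generalisation, assuming a hypothetical helper named `snakeSlot` (not part of the diagram source), reverses the column index on every odd row so that any grid size snakes the same way:

```javascript
// Hypothetical generalisation of slot(): column and row positions are passed
// in, and every odd row runs right-to-left.
function snakeSlot(i, colX, rowY) {
  const cols = colX.length;
  const row = Math.floor(i / cols);
  const k = i % cols;
  const c = row % 2 === 1 ? cols - 1 - k : k; // mirror the column on odd rows
  return { x: colX[c], y: rowY[row], row, c };
}

// Same coordinates as the diagram.
const colX = [40, 535, 1030];
const rowY = [60, 320];
const order = [0, 1, 2, 3, 4, 5].map(i => snakeSlot(i, colX, rowY).c);
console.log(order); // column per step: 0, 1, 2, 2, 1, 0
```

Steps 2 and 3 land in the same column, which is why the script connects them with a straight vertical drop.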
|
||||
BIN docs/static/images/diagrams/finetune-recipe.png (vendored, normal file, 213 KiB)
docs/static/images/diagrams/mcp-server-vs-client.html (vendored, normal file, 183 lines)
@@ -0,0 +1,183 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> MCP</div>
|
||||
<h1>Server-side vs <em>client-side</em> tools</h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">TWO</div>
|
||||
<div class="s">loops</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">The model's tool loop runs on the server, or in the browser - <b>same chat API.</b></div>
|
||||
<div class="url">localai.io<span>/features/mcp</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", COLDD="#2D5054", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// arrowhead helper that points along an arbitrary direction (angle in radians)
|
||||
function head(x,y,ang,color){
|
||||
const a=8;
|
||||
const dx1=Math.cos(ang+2.6)*(a+5), dy1=Math.sin(ang+2.6)*(a+5);
|
||||
const dx2=Math.cos(ang-2.6)*(a+5), dy2=Math.sin(ang-2.6)*(a+5);
|
||||
svg.appendChild(el("path",{d:`M ${x} ${y} l ${dx1} ${dy1} M ${x} ${y} l ${dx2} ${dy2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// straight connector with arrowhead at (x2,y2)
|
||||
function line2(x1,y1,x2,y2,color,dash){
|
||||
svg.appendChild(el("line",{x1,y1,x2,y2,stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
head(x2,y2,Math.atan2(y2-y1,x2-x1),color);
|
||||
}
|
||||
|
||||
// ===================== PANEL FRAMES =====================
|
||||
const PW=700, PH=540, PY=10;
|
||||
const LX=10, RX=770;
|
||||
// Left panel (rust / server)
|
||||
shadowRect(LX,PY,PW,PH,PAPER,RUSTD,4);
|
||||
svg.appendChild(el("rect",{x:LX,y:PY,width:PW,height:58,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:LX,y1:PY+58,x2:LX+PW,y2:PY+58,stroke:INK,"stroke-width":4}));
|
||||
txt(LX+26,PY+38,"Server-side MCP",{f:"Bricolage Grotesque",w:800,sz:28,fill:PAPER});
|
||||
txt(LX+PW-26,PY+37,"loop on the server",{w:700,sz:13,ls:".06em",a:"end",fill:"#F1D9C8"});
|
||||
|
||||
// Right panel (cold / client)
|
||||
shadowRect(RX,PY,PW,PH,PAPER,COLDD,4);
|
||||
svg.appendChild(el("rect",{x:RX,y:PY,width:PW,height:58,fill:COLD}));
|
||||
svg.appendChild(el("line",{x1:RX,y1:PY+58,x2:RX+PW,y2:PY+58,stroke:INK,"stroke-width":4}));
|
||||
txt(RX+26,PY+38,"Client-side MCP",{f:"Bricolage Grotesque",w:800,sz:28,fill:PAPER});
|
||||
txt(RX+PW-26,PY+37,"loop in the browser",{w:700,sz:13,ls:".06em",a:"end",fill:"#DCEAE9"});
|
||||
|
||||
// ===================== generic node box =====================
|
||||
function nodeBox(x,y,w,h,fill,stroke,title,sub){
|
||||
svg.appendChild(el("rect",{x:x+5,y:y+5,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":3}));
|
||||
txt(x+w/2,y+(sub?h/2-3:h/2+7),title,{f:"Bricolage Grotesque",w:800,sz:20,a:"middle"});
|
||||
if(sub) txt(x+w/2,y+h/2+19,sub,{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
}
|
||||
|
||||
// ===================== LEFT: server-side cycle =====================
|
||||
// inner "runs here" container highlighting where the loop lives
|
||||
const ScX=LX+34, ScY=PY+92, ScW=PW-68, ScH=PH-150;
|
||||
svg.appendChild(el("rect",{x:ScX,y:ScY,width:ScW,height:ScH,fill:"none",stroke:RUSTD,"stroke-width":2.5,"stroke-dasharray":"6 7"}));
|
||||
txt(ScX+14,ScY+24,"RUNS ON THE LocalAI SERVER",{w:700,sz:12,ls:".12em",fill:RUSTD});
|
||||
|
||||
// nodes (a rectangular cycle)
|
||||
const bw=200, bh=64;
|
||||
const modL = {x:LX+PW/2-bw/2, y:ScY+44}; // top: model
|
||||
const toolL= {x:ScX+24, y:ScY+ScH-bh-30}; // bottom-left: tool exec (emphasis)
|
||||
const resL = {x:ScX+ScW-bw-24,y:ScY+ScH-bh-30}; // bottom-right: result
|
||||
nodeBox(modL.x,modL.y,bw,bh,HI,INK,"Model","generates");
|
||||
// emphasized tool box
|
||||
svg.appendChild(el("rect",{x:toolL.x+5,y:toolL.y+5,width:bw,height:bh,fill:INK}));
|
||||
svg.appendChild(el("rect",{x:toolL.x,y:toolL.y,width:bw,height:bh,fill:RUST,stroke:INK,"stroke-width":3}));
|
||||
txt(toolL.x+bw/2,toolL.y+27,"MCP tool runs",{f:"Bricolage Grotesque",w:800,sz:19,a:"middle",fill:PAPER});
|
||||
txt(toolL.x+bw/2,toolL.y+48,"on the server",{w:700,sz:13,a:"middle",fill:"#F1D9C8"});
|
||||
nodeBox(resL.x,resL.y,bw,bh,PAPER2,INK,"Result","fed back");
|
||||
|
||||
// arrows: model -> tool (down-left), tool -> result (right), result -> model (up-left back to model)
|
||||
// model bottom-left to tool top
|
||||
line2(modL.x+20, modL.y+bh, toolL.x+bw/2, toolL.y-2, RUSTD);
|
||||
txt(LX+150, ScY+ScH/2+6, "emits tool call", {w:700,sz:14,a:"middle",fill:RUSTD});
|
||||
// tool -> result
|
||||
line2(toolL.x+bw, toolL.y+bh/2, resL.x-2, resL.y+bh/2, RUSTD);
|
||||
txt((toolL.x+bw+resL.x)/2, toolL.y+bh/2-12, "execute", {w:700,sz:14,a:"middle",fill:RUSTD});
|
||||
// result -> model
|
||||
line2(resL.x+bw-20, resL.y, modL.x+bw, modL.y+bh, RUSTD);
|
||||
txt(LX+PW-150, ScY+ScH/2+6, "result back", {w:700,sz:14,a:"middle",fill:RUSTD});
|
||||
|
||||
// loop badge
|
||||
const lbW=210, lbH=34, lbx=LX+PW/2-lbW/2, lby=ScY+ScH/2-lbH/2;
|
||||
svg.appendChild(el("rect",{x:lbx,y:lby,width:lbW,height:lbH,fill:PAPER,stroke:RUSTD,"stroke-width":2.5}));
|
||||
txt(LX+PW/2,lby+23,"up to max_iterations",{f:"Bricolage Grotesque",w:800,sz:16,a:"middle",fill:RUSTD});
|
||||
|
||||
// ===================== RIGHT: client-side cycle =====================
|
||||
// browser connects banner at top of panel content
|
||||
const cbX=RX+34, cbY=PY+78, cbW=PW-68, cbH=40;
|
||||
svg.appendChild(el("rect",{x:cbX,y:cbY,width:cbW,height:cbH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(RX+PW/2,cbY+26,"Browser connects to MCP server",{f:"Bricolage Grotesque",w:700,sz:18,a:"middle"});
|
||||
|
||||
// inner "runs here" container
|
||||
const TcX=RX+34, TcY=PY+138, TcW=PW-68, TcH=PH-196;
|
||||
svg.appendChild(el("rect",{x:TcX,y:TcY,width:TcW,height:TcH,fill:"none",stroke:COLDD,"stroke-width":2.5,"stroke-dasharray":"6 7"}));
|
||||
txt(TcX+14,TcY+24,"RUNS IN THE BROWSER",{w:700,sz:12,ls:".12em",fill:COLDD});
|
||||
|
||||
const modR = {x:RX+PW/2-bw/2, y:TcY+40};
|
||||
const toolR= {x:TcX+24, y:TcY+TcH-bh-26};
|
||||
const resR = {x:TcX+TcW-bw-24,y:TcY+TcH-bh-26};
|
||||
nodeBox(modR.x,modR.y,bw,bh,HI,INK,"Model","generates");
|
||||
// emphasized browser-exec box (cold)
|
||||
svg.appendChild(el("rect",{x:toolR.x+5,y:toolR.y+5,width:bw,height:bh,fill:INK}));
|
||||
svg.appendChild(el("rect",{x:toolR.x,y:toolR.y,width:bw,height:bh,fill:COLD,stroke:INK,"stroke-width":3}));
|
||||
txt(toolR.x+bw/2,toolR.y+27,"Browser runs tool",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:PAPER});
|
||||
txt(toolR.x+bw/2,toolR.y+48,"via CORS proxy",{w:700,sz:13,a:"middle",fill:"#DCEAE9"});
|
||||
nodeBox(resR.x,resR.y,bw,bh,PAPER2,INK,"Result","fed back");
|
||||
|
||||
line2(modR.x+20, modR.y+bh, toolR.x+bw/2, toolR.y-2, COLDD);
|
||||
txt(RX+150, TcY+TcH/2+6, "emits tool call", {w:700,sz:14,a:"middle",fill:COLDD});
|
||||
line2(toolR.x+bw, toolR.y+bh/2, resR.x-2, resR.y+bh/2, COLDD);
|
||||
txt((toolR.x+bw+resR.x)/2, toolR.y+bh/2-12, "execute", {w:700,sz:14,a:"middle",fill:COLDD});
|
||||
line2(resR.x+bw-20, resR.y, modR.x+bw, modR.y+bh, COLDD);
|
||||
txt(RX+PW-150, TcY+TcH/2+6, "result back", {w:700,sz:14,a:"middle",fill:COLDD});
|
||||
|
||||
const rbW=200, rbH=34, rbx=RX+PW/2-rbW/2, rby=TcY+TcH/2-rbH/2;
|
||||
svg.appendChild(el("rect",{x:rbx,y:rby,width:rbW,height:rbH,fill:PAPER,stroke:COLDD,"stroke-width":2.5}));
|
||||
txt(RX+PW/2,rby+23,"same chat API",{f:"Bricolage Grotesque",w:800,sz:16,a:"middle",fill:COLDD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
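The `head()` helper in the script above builds an arrowhead from two short strokes that trail behind the tip. A minimal sketch of just the vector math follows, with the DOM calls left out; `headWings` is a hypothetical name, and the defaults mirror the script's `a+5` length and `2.6` radian spread.

```javascript
// Hypothetical pure version of head(): each wing is the travel direction
// rotated by +/- `spread` radians and scaled to `len`.
function headWings(ang, len = 13, spread = 2.6) {
  return [ang + spread, ang - spread].map(t => ({
    dx: Math.cos(t) * len,
    dy: Math.sin(t) * len,
  }));
}

// A line travelling in +x: both wings should point back toward -x,
// one above and one below the line.
const [w1, w2] = headWings(Math.atan2(0, 1));
console.log(w1, w2);
```

Because the spread is larger than a quarter turn, the wings always fall behind the tip regardless of the angle returned by `Math.atan2`.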
|
||||
BIN docs/static/images/diagrams/mcp-server-vs-client.png (vendored, normal file, 265 KiB)
docs/static/images/diagrams/middleware-lifecycle.html (vendored, normal file, 159 lines)
@@ -0,0 +1,159 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Middleware</div>
|
||||
<h1>The request <em>lifecycle</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">HOOK</div>
|
||||
<div class="s">chain</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">One shared hook chain: <b>auth, model routing, and PII</b>, with decision and event logs.</div>
|
||||
<div class="url">localai.io<span>/features/middleware</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// vertical arrow (straight down)
|
||||
function arrowDown(x,y1,y2,color,dash){
|
||||
svg.appendChild(el("path",{d:`M ${x} ${y1} L ${x} ${y2-11}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x} ${y2-11} l -${a} -${a+4} M ${x} ${y2-11} l ${a} -${a+4}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===== PIPELINE STAGES =====
|
||||
// label
|
||||
txt(20,40,"REQUEST PIPELINE",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
|
||||
// node geometry: 7 nodes across. client endpoints are paper2; hook chain stages are hi; backend is rust.
|
||||
const ROW_Y=120, NH=120;
|
||||
// columns laid out within 1480 viewBox
|
||||
const nodes=[
|
||||
{x:20, w:150, fill:PAPER2, name:"client", sub:"request", kind:"end"},
|
||||
{x:210, w:178, fill:HI, name:"auth", sub:"API key · access", kind:"hook"},
|
||||
{x:428, w:220, fill:HI, name:"route model", sub:"may rewrite input.Model", kind:"hook"},
|
||||
{x:688, w:200, fill:HI, name:"per-model PII", sub:"redact input", kind:"hook"},
|
||||
{x:928, w:178, fill:RUST, name:"backend", sub:"model runs", kind:"backend"},
|
||||
{x:1146, w:188, fill:HI, name:"streaming PII", sub:"redact output", kind:"hook"},
|
||||
{x:1374, w:86, fill:PAPER2, name:"client", sub:"response", kind:"end2"},
|
||||
];
|
||||
|
||||
nodes.forEach(n=>{
|
||||
const stroke = INK; // every node shares the ink outline; only the backend's stroke width differs
|
||||
shadowRect(n.x,ROW_Y,n.w,NH,n.fill,stroke,n.kind==="backend"?4:3.5);
|
||||
const nameFill = n.kind==="backend" ? PAPER : INK;
|
||||
const subFill = n.kind==="backend" ? "#F1D9C8" : SOFT;
|
||||
// wrap name if needed
|
||||
if(n.name.includes(" ")){
|
||||
const parts=n.name.split(" ");
|
||||
txt(n.x+n.w/2,ROW_Y+50,parts[0],{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:nameFill});
|
||||
txt(n.x+n.w/2,ROW_Y+78,parts.slice(1).join(" "),{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:nameFill});
|
||||
txt(n.x+n.w/2,ROW_Y+104,n.sub,{w:700,sz:13,a:"middle",fill:subFill});
|
||||
} else {
|
||||
txt(n.x+n.w/2,ROW_Y+62,n.name,{f:"Bricolage Grotesque",w:800,sz:25,a:"middle",fill:nameFill});
|
||||
txt(n.x+n.w/2,ROW_Y+92,n.sub,{w:700,sz:13,a:"middle",fill:subFill});
|
||||
}
|
||||
});
|
||||
|
||||
// ===== HOOK CHAIN bracket (spans the four hook stages and the backend between them) =====
|
||||
const chainStart=nodes[1].x, chainEnd=nodes[5].x+nodes[5].w;
|
||||
const braceY=ROW_Y+NH+34;
|
||||
svg.appendChild(el("line",{x1:chainStart,y1:braceY,x2:chainEnd,y2:braceY,stroke:RUSTD,"stroke-width":3,"stroke-dasharray":"3 8"}));
|
||||
svg.appendChild(el("line",{x1:chainStart,y1:braceY-10,x2:chainStart,y2:braceY+10,stroke:RUSTD,"stroke-width":3}));
|
||||
svg.appendChild(el("line",{x1:chainEnd,y1:braceY-10,x2:chainEnd,y2:braceY+10,stroke:RUSTD,"stroke-width":3}));
|
||||
// chain label badge
|
||||
const lbW=210,lbH=32,lbx=(chainStart+chainEnd)/2-lbW/2,lby=braceY-lbH/2;
|
||||
svg.appendChild(el("rect",{x:lbx,y:lby,width:lbW,height:lbH,fill:PAPER,stroke:RUSTD,"stroke-width":2.5}));
|
||||
txt((chainStart+chainEnd)/2,braceY+6,"SHARED HOOK CHAIN",{f:"Bricolage Grotesque",w:800,sz:16,a:"middle",ls:".04em",fill:RUSTD});
|
||||
|
||||
// ===== HORIZONTAL ARROWS between nodes =====
|
||||
const midY=ROW_Y+NH/2;
|
||||
for(let i=0;i<nodes.length-1;i++){
|
||||
const a=nodes[i], b=nodes[i+1];
|
||||
// arrows into and out of the backend (the gRPC boundary) are dashed; all others are solid
|
||||
const intoBackend = b.kind==="backend";
|
||||
const outBackend = a.kind==="backend";
|
||||
const dash = (intoBackend||outBackend) ? "2 8" : "none";
|
||||
const color = (intoBackend||outBackend) ? RUSTD : INK;
|
||||
arrow(a.x+a.w, midY, b.x, midY, color, dash);
|
||||
}
|
||||
|
||||
// ===== SIDE-CHANNEL LOG BOXES (downward) =====
|
||||
const logY=braceY+76, logH=88;
|
||||
const logs=[
|
||||
{name:"decision log", sub:"auth · routing", srcNode:2},
|
||||
{name:"event log", sub:"PII · backend", srcNode:4},
|
||||
];
|
||||
logs.forEach(l=>{
|
||||
const src=nodes[l.srcNode];
|
||||
const cx=src.x+src.w/2;
|
||||
const lw=210, lx=cx-lw/2;
|
||||
shadowRect(lx,logY,lw,logH,PAPER2,COLD,3.5);
|
||||
txt(cx,logY+42,l.name,{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:INK});
|
||||
txt(cx,logY+68,l.sub,{w:700,sz:13,a:"middle",fill:COLD});
|
||||
// arrow from chain brace down to box
|
||||
arrowDown(cx,braceY+14,logY,COLD,"2 8");
|
||||
});
|
||||
|
||||
</script>
|
||||
</body>
|
||||
</html>
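The arrow loop in the script above picks a dash pattern and color per edge depending on whether the edge touches the backend node. A minimal sketch of that rule as a standalone function follows; `edgeStyle` is a hypothetical name, and the node list is reduced to the `kind` field the rule actually reads.

```javascript
// Hypothetical pure version of the loop's styling rule: an edge touching the
// backend node is dashed and drawn in RUSTD; every other edge is solid INK.
const INK = "#211C14", RUSTD = "#8F2C20";
function edgeStyle(a, b) {
  const touchesBackend = a.kind === "backend" || b.kind === "backend";
  return touchesBackend
    ? { color: RUSTD, dash: "2 8" }
    : { color: INK, dash: "none" };
}

// Node kinds in pipeline order, as declared in the diagram.
const kinds = ["end", "hook", "hook", "hook", "backend", "hook", "end2"]
  .map(kind => ({ kind }));
const styles = kinds.slice(0, -1).map((n, i) => edgeStyle(n, kinds[i + 1]));
console.log(styles.map(s => s.dash)); // dash per edge: none, none, none, 2 8, 2 8, none
```

Only the two edges adjacent to the backend come out dashed, which matches the gRPC boundary the diagram is marking.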
|
||||
BIN docs/static/images/diagrams/middleware-lifecycle.png (vendored, normal file, 189 KiB)
docs/static/images/diagrams/mitm-intercept.html (vendored, normal file, 185 lines)
@@ -0,0 +1,185 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> MITM Proxy</div>
|
||||
<h1>Inspect what you allow, <em>tunnel the rest</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">TLS</div>
|
||||
<div class="s">selective</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Allowlisted hosts are decrypted and scanned; <b>everything else is a blind TCP tunnel.</b></div>
|
||||
<div class="url">localai.io<span>/features/mitm-proxy</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// straight-line connector with arrowhead, direction-aware
|
||||
function line(x1,y1,x2,y2,color,dash){
|
||||
svg.appendChild(el("line",{x1,y1,x2,y2,stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const ang=Math.atan2(y2-y1,x2-x1), a=11, sp=0.5;
|
||||
const bx=x2, by=y2; // arrowhead tip sits exactly on the end point of the line
|
||||
svg.appendChild(el("path",{d:`M ${bx} ${by} L ${bx-Math.cos(ang-sp)*a} ${by-Math.sin(ang-sp)*a} M ${bx} ${by} L ${bx-Math.cos(ang+sp)*a} ${by-Math.sin(ang+sp)*a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ============ CLIENT (left) ============
|
||||
const cx=24, cy=232, cw=178, ch=96;
|
||||
shadowRect(cx,cy,cw,ch,PAPER2);
|
||||
txt(cx+cw/2,cy+44,"client",{f:"Bricolage Grotesque",w:800,sz:26,a:"middle"});
|
||||
txt(cx+cw/2,cy+72,"CONNECT host:443",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// ============ DECISION DIAMOND ============
|
||||
const dcx=370, dcy=280, dr=92;
|
||||
// shadow
|
||||
svg.appendChild(el("path",{d:`M ${dcx+7} ${dcy-dr+7} L ${dcx+dr+7} ${dcy+7} L ${dcx+7} ${dcy+dr+7} L ${dcx-dr+7} ${dcy+7} Z`,fill:INK}));
|
||||
svg.appendChild(el("path",{d:`M ${dcx} ${dcy-dr} L ${dcx+dr} ${dcy} L ${dcx} ${dcy+dr} L ${dcx-dr} ${dcy} Z`,fill:HI,stroke:INK,"stroke-width":3.5}));
|
||||
txt(dcx,dcy-6,"host",{f:"Bricolage Grotesque",w:800,sz:21,a:"middle"});
|
||||
txt(dcx,dcy+22,"allowlisted?",{f:"Bricolage Grotesque",w:800,sz:21,a:"middle"});
|
||||
|
||||
// client -> diamond
|
||||
line(cx+cw, cy+ch/2, dcx-dr-2, dcy, INK);
|
||||
|
||||
// YES / NO labels
|
||||
txt(dcx+18,dcy-dr-12,"YES",{f:"Bricolage Grotesque",w:800,sz:17,fill:RUSTD});
|
||||
txt(dcx-46,dcy+dr+34,"NO",{f:"Bricolage Grotesque",w:800,sz:17,fill:COLD});
|
||||
|
||||
// ============ NO BRANCH (cold teal) - down ============
|
||||
const ntx=255, nty=448, ntw=288, nth=78;
|
||||
// connector from diamond bottom down to tunnel box
|
||||
line(dcx, dcy+dr, ntx+ntw/2, nty, COLD);
|
||||
shadowRect(ntx,nty,ntw,nth,PAPER,COLD,3.5);
|
||||
txt(ntx+ntw/2,nty+33,"plain TCP tunnel",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:COLD});
|
||||
txt(ntx+ntw/2,nty+58,"no inspection",{w:700,sz:15,a:"middle",fill:SOFT});
|
||||
|
||||
// ============ YES BRANCH (rust) - horizontal chain across top ============
|
||||
const yY=120, yH=92, yW=196;
|
||||
const steps=[
|
||||
{x:520, t1:"mint", t2:"leaf cert"},
|
||||
{x:520, t1:"terminate", t2:"TLS"},
|
||||
{x:520, t1:"PII scan", t2:""},
|
||||
{x:520, t1:"re-encrypt", t2:"to upstream"},
|
||||
];
|
||||
// lay out 4 steps left->right with gaps
|
||||
const startX=512, gapX=42;
|
||||
steps.forEach((s,i)=>{ s.x = startX + i*(yW+gapX); });
|
||||
|
||||
steps.forEach((s,i)=>{
|
||||
shadowRect(s.x,yY,yW,yH,"#EFE0BF",RUST,3.5);
|
||||
});
|
||||
// labels are written out per step (the t1/t2 fields above are not read); the PII scan box carries a two-line endpoint detail
|
||||
txt(steps[0].x+yW/2,yY+42,"mint",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle"});
|
||||
txt(steps[0].x+yW/2,yY+68,"leaf cert",{w:700,sz:15,a:"middle",fill:SOFT});
|
||||
txt(steps[1].x+yW/2,yY+42,"terminate TLS",{f:"Bricolage Grotesque",w:800,sz:21,a:"middle"});
|
||||
txt(steps[1].x+yW/2,yY+68,"decrypt stream",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
txt(steps[2].x+yW/2,yY+38,"PII scan",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle"});
|
||||
txt(steps[2].x+yW/2,yY+62,"/v1/messages ·",{w:700,sz:12.5,a:"middle",fill:SOFT});
|
||||
txt(steps[2].x+yW/2,yY+79,"/v1/chat/completions",{w:700,sz:12.5,a:"middle",fill:SOFT});
|
||||
txt(steps[3].x+yW/2,yY+42,"re-encrypt",{f:"Bricolage Grotesque",w:800,sz:21,a:"middle"});
|
||||
txt(steps[3].x+yW/2,yY+68,"to upstream",{w:700,sz:15,a:"middle",fill:SOFT});
|
||||
|
||||
// connector diamond top -> first YES step (up then across)
|
||||
line(dcx, dcy-dr, dcx, yY+yH/2, RUST);
|
||||
line(dcx, yY+yH/2, steps[0].x-2, yY+yH/2, RUST);
|
||||
// chain arrows between steps
|
||||
for(let i=0;i<steps.length-1;i++){
|
||||
line(steps[i].x+yW, yY+yH/2, steps[i+1].x-2, yY+yH/2, RUST);
|
||||
}
|
||||
|
||||
// ============ UPSTREAM (right) ============
|
||||
const ux=1276, uy=240, uw=184, uh=92;
|
||||
shadowRect(ux,uy,uw,uh,PAPER2);
|
||||
txt(ux+uw/2,uy+42,"upstream",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(ux+uw/2,uy+70,"OpenAI · API host",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// last YES step -> upstream (down then into top edge)
|
||||
const lastX = steps[3].x+yW/2;
|
||||
const elbowY = uy-22;
|
||||
line(lastX, yY+yH, lastX, elbowY, RUST);
|
||||
line(lastX, elbowY, ux+uw/2, elbowY, RUST);
|
||||
line(ux+uw/2, elbowY, ux+uw/2, uy-2, RUST);
|
||||
|
||||
// NO tunnel -> upstream (cold, dashed) - routed behind the trust-chain box, which is drawn later and paints over this segment
|
||||
const noElbowX = ux+uw/2;
|
||||
line(ntx+ntw, nty+nth/2, noElbowX, nty+nth/2, COLD, "2 9");
|
||||
line(noElbowX, nty+nth/2, noElbowX, uy+uh+2, COLD, "2 9");
|
||||
|
||||
// ============ TRUST CHAIN (bottom-right corner) ============
|
||||
const tcX=980, tcY=372, tcW=372, tcH=150;
|
||||
svg.appendChild(el("rect",{x:tcX,y:tcY,width:tcW,height:tcH,fill:PAPER,stroke:INK,"stroke-width":2.5,"stroke-dasharray":"4 7"}));
|
||||
txt(tcX+18,tcY+30,"TRUST CHAIN",{w:700,sz:13,ls:".2em",fill:SOFT});
|
||||
|
||||
// CA -> leaf
|
||||
const caX=tcX+30, caY=tcY+52, caW=132, caH=46;
|
||||
shadowRect(caX,caY,caW,caH,HI,INK,3);
|
||||
txt(caX+caW/2,caY+30,"local CA",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle"});
|
||||
const lfX=tcX+212, lfY=caY, lfW=132, lfH=46;
|
||||
shadowRect(lfX,lfY,lfW,lfH,"#EFE0BF",RUST,3);
|
||||
txt(lfX+lfW/2,lfY+30,"leaf cert",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:RUSTD});
|
||||
line(caX+caW, caY+caH/2, lfX-2, lfY+lfH/2, RUSTD);
|
||||
txt(tcX+tcW/2, caY+caH/2-12, "signs",{w:700,sz:12,a:"middle",fill:SOFT});
|
||||
|
||||
// note inside trust chain
|
||||
txt(tcX+18,tcY+tcH-26,"the client holds its own",{w:600,sz:14,fill:SOFT});
|
||||
txt(tcX+18,tcY+tcH-9,"upstream credential",{w:700,sz:14,fill:INK});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/mitm-intercept.png
vendored
Normal file
|
After Width: | Height: | Size: 237 KiB |
134
docs/static/images/diagrams/mlx-pipeline.html
vendored
Normal file
@@ -0,0 +1,134 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> MLX Distributed</div>
|
||||
<h1>Pipeline-parallel <em>across ranks</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">RING</div>
|
||||
<div class="s">TCP</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Layers split across ranks; <b>rank 0 coordinates</b>, activations flow down the ring.</div>
|
||||
<div class="url">localai.io<span>/features/mlx-distributed</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- RANK ROW ----------
|
||||
txt(20,40,"RANKS · PIPELINE STAGES",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
|
||||
const RW=380, RH=210, RY=78;
|
||||
const RXS=[40, 550, 1060];
|
||||
const ranks=[
|
||||
{n:"rank 0", role:"LocalAI gRPC · coordinator", slice:"layers 0–9", primary:true},
|
||||
{n:"rank 1", role:"worker", slice:"layers 10–19"},
|
||||
{n:"rank 2", role:"worker", slice:"layers 20–out"},
|
||||
];
|
||||
|
||||
ranks.forEach((r,i)=>{
|
||||
const x=RXS[i];
|
||||
shadowRect(x,RY,RW,RH,PAPER,INK,4);
|
||||
// title bar
|
||||
const barH=58, barFill=r.primary?RUST:COLD;
|
||||
svg.appendChild(el("rect",{x,y:RY,width:RW,height:barH,fill:barFill}));
|
||||
svg.appendChild(el("line",{x1:x,y1:RY+barH,x2:x+RW,y2:RY+barH,stroke:INK,"stroke-width":4}));
|
||||
txt(x+24,RY+39,r.n,{f:"Bricolage Grotesque",w:800,sz:30,fill:PAPER});
|
||||
txt(x+RW-22,RY+37,r.primary?"COORDINATOR":"WORKER",{w:700,sz:13,ls:".08em",a:"end",fill:"#F1D9C8"});
|
||||
// role line
|
||||
txt(x+24,RY+98,r.role,{f:"Bricolage Grotesque",w:700,sz:21,fill:INK});
|
||||
// layer slice chip
|
||||
const cw=200,ch=56,cx=x+24,cy=RY+122;
|
||||
svg.appendChild(el("rect",{x:cx,y:cy,width:cw,height:ch,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(cx+18,cy+25,"layer slice",{w:700,sz:13,ls:".06em",fill:SOFT});
|
||||
txt(cx+18,cy+47,r.slice,{f:"Bricolage Grotesque",w:800,sz:22,fill:r.primary?RUSTD:INK});
|
||||
});
|
||||
|
||||
// ---------- FORWARD ACTIVATION ARROWS (left -> right) ----------
|
||||
const midY=RY+RH/2-8;
|
||||
arrow(RXS[0]+RW, midY, RXS[1], midY, RUST);
|
||||
arrow(RXS[1]+RW, midY, RXS[2], midY, RUST);
|
||||
txt((RXS[0]+RW+RXS[1])/2, midY-14, "activations", {w:700,sz:14,a:"middle",fill:RUSTD});
|
||||
txt((RXS[1]+RW+RXS[2])/2, midY-14, "activations", {w:700,sz:14,a:"middle",fill:RUSTD});
|
||||
|
||||
// ---------- RETURN ARROW (rank 2 -> rank 0, gather output) ----------
|
||||
const rTop=RY+RH; // bottom of rank boxes
|
||||
const ry=rTop+78; // return-path y
|
||||
const x2c=RXS[2]+RW/2; // rank2 center x
|
||||
const x0c=RXS[0]+RW/2; // rank0 center x
|
||||
// down from rank2
|
||||
svg.appendChild(el("line",{x1:x2c,y1:rTop+7,x2:x2c,y2:ry,stroke:COLD,"stroke-width":3.5,"stroke-dasharray":"2 9","stroke-linecap":"round"}));
|
||||
// long horizontal back to rank0
|
||||
svg.appendChild(el("line",{x1:x2c,y1:ry,x2:x0c,y2:ry,stroke:COLD,"stroke-width":3.5,"stroke-dasharray":"2 9","stroke-linecap":"round"}));
|
||||
// up into rank0 with arrowhead
|
||||
svg.appendChild(el("path",{d:`M ${x0c} ${ry} L ${x0c} ${rTop+18}`,fill:"none",stroke:COLD,"stroke-width":3.5,"stroke-dasharray":"2 9","stroke-linecap":"round"}));
|
||||
svg.appendChild(el("path",{d:`M ${x0c} ${rTop+11} l -7 11 M ${x0c} ${rTop+11} l 7 11`,fill:"none",stroke:COLD,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
txt((x2c+x0c)/2, ry-12, "gather output → rank 0", {w:700,sz:15,a:"middle",fill:COLD});
|
||||
|
||||
// ---------- JACCL INSET ----------
|
||||
const jw=470, jh=78, jx=RXS[2]+RW-jw, jy=ry+30;
|
||||
shadowRect(jx,jy,jw,jh,PAPER2,INK,3.5,"4 7");
|
||||
txt(jx+22,jy+30,"JACCL VARIANT",{w:700,sz:13,ls:".12em",fill:RUSTD});
|
||||
txt(jx+22,jy+57,"full layers, sharded weights, coordinator on rank 0",{f:"Bricolage Grotesque",w:700,sz:18,fill:INK});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/mlx-pipeline.png
vendored
Normal file
|
After Width: | Height: | Size: 227 KiB |
148
docs/static/images/diagrams/model-resolution.html
vendored
Normal file
@@ -0,0 +1,148 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Models</div>
|
||||
<h1>Many sources, <em>one load path</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">AUTO</div>
|
||||
<div class="s">detect</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">However you point at a model, it lands on the same <b>resolve → backend → load</b> path.</div>
|
||||
<div class="url">localai.io<span>/getting-started/models</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- INPUT SOURCES (left) ----------
|
||||
txt(20,42,"SOURCES",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const sources=[
|
||||
{n:"gallery name", s:"localai run llama"},
|
||||
{n:"huggingface://", s:"hub repo + file"},
|
||||
{n:"oci:// · ollama://", s:"registry pull"},
|
||||
{n:"manual file / YAML", s:"local model config"},
|
||||
];
|
||||
const SX=24, SW=288, SH=92, sGap=42;
|
||||
const sTop=58;
|
||||
const srcY=[];
|
||||
sources.forEach((c,i)=>{
|
||||
const y=sTop+i*(SH+sGap);
|
||||
srcY.push(y);
|
||||
shadowRect(SX,y,SW,SH,PAPER2,COLD,3.5);
|
||||
txt(SX+20,y+42,c.n,{f:"Bricolage Grotesque",w:800,sz:25,fill:INK});
|
||||
txt(SX+20,y+72,c.s,{w:700,sz:15,fill:SOFT});
|
||||
});
|
||||
|
||||
// ---------- CONVERGENCE POINT ----------
|
||||
const convX=512; // where arrows converge / pipeline begins
|
||||
const convY=280; // vertical center of pipeline
|
||||
|
||||
// ---------- PIPELINE (right, single load path) ----------
|
||||
const stages=[
|
||||
{n:"resolve", s:"locate source"},
|
||||
{n:"auto-detect",s:"match by format"},
|
||||
{n:"load", s:"start process"},
|
||||
{n:"serve", s:"ready · OpenAI API"},
|
||||
];
|
||||
const PW=200, PH=130, pGap=42;
|
||||
const pStart=540;
|
||||
const pY=convY-PH/2;
|
||||
const pX=[];
|
||||
stages.forEach((st,i)=> pX.push(pStart+i*(PW+pGap)) );
|
||||
|
||||
// connector line behind the pipeline boxes
|
||||
svg.appendChild(el("line",{x1:convX,y1:convY,x2:pX[stages.length-1]+PW,y2:convY,stroke:RUSTD,"stroke-width":3.5}));
|
||||
|
||||
// arrows from each source into the convergence point
|
||||
const cw=4;
|
||||
sources.forEach((c,i)=>{
|
||||
arrow(SX+SW, srcY[i]+SH/2, convX, convY, RUST);
|
||||
});
|
||||
|
||||
// convergence node (small junction)
|
||||
svg.appendChild(el("circle",{cx:convX,cy:convY,r:9,fill:RUST,stroke:INK,"stroke-width":3}));
|
||||
|
||||
// pipeline stage boxes; the final stage ("serve") is emphasized in rust
|
||||
stages.forEach((st,i)=>{
|
||||
const x=pX[i], emph=(i===stages.length-1);
|
||||
shadowRect(x,pY,PW,PH,emph?RUST:HI,INK,4);
|
||||
txt(x+PW/2,pY+58,st.n,{f:"Bricolage Grotesque",w:800,sz:emph?27:24,a:"middle",fill:emph?PAPER:INK});
|
||||
txt(x+PW/2,pY+92,st.s,{w:700,sz:15,a:"middle",fill:emph?"#F1D9C8":SOFT});
|
||||
// step number badge
|
||||
const bw=34,bh=26,bx=x+14,by=pY+14;
|
||||
svg.appendChild(el("rect",{x:bx,y:by,width:bw,height:bh,fill:PAPER,stroke:INK,"stroke-width":2}));
|
||||
txt(bx+bw/2,by+19,(i+1),{f:"Bricolage Grotesque",w:800,sz:16,a:"middle",fill:RUSTD});
|
||||
});
|
||||
|
||||
// arrows between pipeline stages
|
||||
for(let i=0;i<stages.length-1;i++){
|
||||
arrow(pX[i]+PW, convY, pX[i+1], convY, RUSTD);
|
||||
}
|
||||
// arrow from convergence node into first stage
|
||||
arrow(convX+9, convY, pX[0], convY, RUSTD);
|
||||
|
||||
// label above pipeline
|
||||
txt(pStart, pY-22, "ONE LOAD PATH", {w:700,sz:14,ls:".2em",fill:RUSTD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/model-resolution.png
vendored
Normal file
|
After Width: | Height: | Size: 228 KiB |
180
docs/static/images/diagrams/quantization-flow.html
vendored
Normal file
@@ -0,0 +1,180 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Quantization</div>
|
||||
<h1>From HF model to <em>quantized GGUF</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">GGUF</div>
|
||||
<div class="s">q4..q8</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Convert first, then quantize - <b>tracked as a job from queued to completed.</b></div>
|
||||
<div class="url">localai.io<span>/features/quantization</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===================== PIPELINE (top) =====================
|
||||
txt(20,40,"PIPELINE",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
|
||||
const PY=70, PH=150;
|
||||
// box geometry
|
||||
const boxes=[
|
||||
{x:20, w:250, fill:PAPER2, title:"HF model", sub:"safetensors · repo", bar:null},
|
||||
{x:360, w:250, fill:HI, title:"f16 GGUF", sub:"converted weights", bar:"CONVERT"},
|
||||
{x:700, w:250, fill:HI, title:"quantize", sub:"reduce precision", bar:"QUANTIZE"},
|
||||
{x:1040, w:420, fill:PAPER, title:"GGUF", sub:"quantized output", bar:"OUTPUT", emph:true},
|
||||
];
|
||||
|
||||
boxes.forEach((b)=>{
|
||||
if(b.emph){
|
||||
shadowRect(b.x,PY,b.w,PH,PAPER,RUST,4.5);
|
||||
} else {
|
||||
shadowRect(b.x,PY,b.w,PH,b.fill);
|
||||
}
|
||||
// tiny tag bar at top-left
|
||||
if(b.bar){
|
||||
const tw=b.bar.length*9.2+22, th=24;
|
||||
svg.appendChild(el("rect",{x:b.x+18,y:PY+18,width:tw,height:th,fill:PAPER,stroke:b.emph?RUSTD:INK,"stroke-width":2}));
|
||||
txt(b.x+18+tw/2,PY+18+17,b.bar,{w:700,sz:11,ls:".08em",a:"middle",fill:RUSTD});
|
||||
}
|
||||
txt(b.x+22,PY+92,b.title,{f:"Bricolage Grotesque",w:800,sz:30,fill:b.emph?RUST:INK});
|
||||
if(b.sub) txt(b.x+22,PY+122,b.sub,{w:700,sz:15,fill:SOFT});
|
||||
});
|
||||
|
||||
// quant chips inside output box (2 x 2 grid so they stay inside the frame)
|
||||
const chips=["q4_k","q5_k","q6_k","q8_0"];
|
||||
const chipW=80, chipH=38, chipGapX=14, chipGapY=12;
|
||||
const chipX0=1040+200, chipY0=PY+30;
|
||||
chips.forEach((c,i)=>{
|
||||
const col=i%2, rowi=Math.floor(i/2);
|
||||
const cx=chipX0+col*(chipW+chipGapX);
|
||||
const cy=chipY0+rowi*(chipH+chipGapY);
|
||||
svg.appendChild(el("rect",{x:cx,y:cy,width:chipW,height:chipH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(cx+chipW/2,cy+25,c,{f:"Bricolage Grotesque",w:800,sz:18,a:"middle"});
|
||||
});
|
||||
|
||||
// pipeline arrows (between boxes). label the download leg.
|
||||
const midY=PY+PH/2;
|
||||
function legArrow(x1,x2,label,dash,color){
|
||||
arrow(x1,midY,x2,midY,color||INK,dash);
|
||||
if(label){
|
||||
const cx=(x1+x2)/2;
|
||||
txt(cx,midY-16,label,{w:700,sz:13,ls:".06em",a:"middle",fill:SOFT});
|
||||
}
|
||||
}
|
||||
legArrow(270,360,"DOWNLOAD","2 8");
|
||||
legArrow(610,700,null);
|
||||
legArrow(950,1040,null,null,RUST);
|
||||
|
||||
// ===================== STATE STRIP (bottom) =====================
|
||||
const SY=360;
|
||||
txt(20,SY-18,"JOB STATUS",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
|
||||
// draw a state pill
|
||||
function pill(x,y,w,h,label,opt){
|
||||
opt=opt||{};
|
||||
svg.appendChild(el("rect",{x:x+5,y:y+5,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill:opt.fill||PAPER2,stroke:opt.stroke||INK,"stroke-width":opt.sw||3,"stroke-dasharray":opt.dash||"none"}));
|
||||
txt(x+w/2,y+h/2+6,label,{f:"Bricolage Grotesque",w:700,sz:16,a:"middle",fill:opt.tcol||INK});
|
||||
}
|
||||
// straight connector with arrowhead
|
||||
function flatArrow(x1,y1,x2,y2,color,dash){
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} L ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=6;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
const states=["queued","downloading","converting","quantizing","completed"];
|
||||
const stH=54, stY=SY+8, gap=44;
|
||||
let stW=[150,190,178,178,168];
|
||||
let sx=20;
|
||||
let centers=[];
|
||||
states.forEach((s,i)=>{
|
||||
const w=stW[i];
|
||||
const isDone=(s==="completed");
|
||||
pill(sx,stY,w,stH,s,{fill:isDone?RUST:PAPER2,tcol:isDone?PAPER:INK,stroke:isDone?RUSTD:INK});
|
||||
centers.push({x:sx,w:w,cx:sx+w/2});
|
||||
sx+=w+gap;
|
||||
});
|
||||
// arrows between states
|
||||
for(let i=0;i<states.length-1;i++){
|
||||
const a=centers[i], b=centers[i+1];
|
||||
flatArrow(a.x+a.w, stY+stH/2, b.x, stY+stH/2, INK);
|
||||
}
|
||||
|
||||
// offshoot states: failed / stopped (cold teal), branching down from "quantizing"
|
||||
const branchFrom = centers[3]; // quantizing
|
||||
const offY = stY+stH+58;
|
||||
const offW=132, offH=44;
|
||||
const failedX = branchFrom.cx-offW-24;
|
||||
const stoppedX = branchFrom.cx+24;
|
||||
pill(failedX,offY,offW,offH,"failed",{fill:PAPER,stroke:COLD,sw:3,tcol:COLD,dash:"4 6"});
|
||||
pill(stoppedX,offY,offW,offH,"stopped",{fill:PAPER,stroke:COLD,sw:3,tcol:COLD,dash:"4 6"});
|
||||
// teal connectors from "quantizing" down to the offshoots (it stands in for every running state)
|
||||
flatArrow(branchFrom.cx-10, stY+stH, failedX+offW-20, offY, COLD, "4 6");
|
||||
flatArrow(branchFrom.cx+10, stY+stH, stoppedX+20, offY, COLD, "4 6");
|
||||
txt(branchFrom.cx, offY+offH+34, "any running state can fail or be stopped",{w:600,sz:14,a:"middle",fill:COLD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN docs/static/images/diagrams/quantization-flow.png (vendored, normal file)
After: Size 209 KiB
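Review note on `quantization-flow.html`: the JOB STATUS strip positions each pill by accumulating widths plus a fixed gap, then hangs the `failed` / `stopped` offshoots off the centre of `quantizing`. The arithmetic can be sketched without a DOM as below, assuming the same constants as the script. `layoutPills` is a hypothetical helper name used for illustration only; the file itself does this inline.

```javascript
// DOM-free sketch of the JOB STATUS geometry. Constants mirror the script;
// layoutPills is a hypothetical name, not a function the diagram defines.
function layoutPills(widths, startX, gap) {
  const out = [];
  let x = startX;
  for (const w of widths) {
    out.push({ x, w, cx: x + w / 2 }); // left edge, width, horizontal centre
    x += w + gap;                      // next pill starts after this one plus the gap
  }
  return out;
}

const states = ["queued", "downloading", "converting", "quantizing", "completed"];
const pills = layoutPills([150, 190, 178, 178, 168], 20, 44);

// Connector i runs from the right edge of pill i to the left edge of pill i+1.
const connectors = pills.slice(0, -1).map((p, i) => ({
  from: p.x + p.w,
  to: pills[i + 1].x,
}));

// Offshoot pills sit 24 units either side of the "quantizing" centre line.
const offW = 132;
const branch = pills[states.indexOf("quantizing")];
const failedX = branch.cx - offW - 24;  // 603
const stoppedX = branch.cx + 24;        // 783
```

With these numbers the last pill ends at x = 1060, inside the 1480-unit-wide viewBox.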
docs/static/images/diagrams/quickstart-journey.html (vendored, normal file, 135 lines added)
@@ -0,0 +1,135 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Quickstart</div>
|
||||
<h1>Install, run, <em>serve</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">QUICK</div>
|
||||
<div class="s">start</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">From install to your first <b>/v1</b> call in four steps.</div>
|
||||
<div class="url">localai.io<span>/basics/getting_started</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ============ MAIN FLOW STRIP (four steps) ============
|
||||
const SW=300, SH=200, SY=70;
|
||||
const cols=[20, 392, 764, 1136];
|
||||
const steps=[
|
||||
{num:"1", title:"Install", rust:false, lines:["Docker image","macOS DMG","or static binary"]},
|
||||
{num:"2", title:"Start LocalAI", rust:false, lines:["run the container","or the binary","core comes up"]},
|
||||
{num:"3", title:"Pick a model", rust:false, alt:true, lines:["Open the Web UI","— OR —","local-ai run <model>"]},
|
||||
{num:"4", title:"Talk to it", rust:true, alt:true, lines:["Chat in the UI","— OR —","curl /v1/chat/…"]},
|
||||
];
|
||||
|
||||
steps.forEach((s,i)=>{
|
||||
const x=cols[i], y=SY;
|
||||
const fill = s.rust ? PAPER : PAPER2;
|
||||
shadowRect(x,y,SW,SH, fill, INK, 4);
|
||||
// header bar
|
||||
const barFill = s.rust ? RUST : INK;
|
||||
svg.appendChild(el("rect",{x:x,y:y,width:SW,height:54,fill:barFill}));
|
||||
svg.appendChild(el("line",{x1:x,y1:y+54,x2:x+SW,y2:y+54,stroke:INK,"stroke-width":3}));
|
||||
// header text: step number (left) and title (right)
|
||||
txt(x+22,y+38,"STEP "+s.num,{w:700,sz:15,ls:".14em",fill:PAPER});
|
||||
txt(x+SW-22,y+37,s.title,{f:"Bricolage Grotesque",w:800,sz:25,a:"end",fill:PAPER});
|
||||
// body lines
|
||||
let ly=y+102;
|
||||
s.lines.forEach(t=>{
|
||||
const sep = t.indexOf("OR")>=0;
|
||||
txt(x+SW/2, ly, t, {f:"Bricolage Grotesque", w:700, sz: sep?17:22, a:"middle", fill: sep?(s.alt?COLD:SOFT):INK});
|
||||
ly += sep?40:46;
|
||||
});
|
||||
});
|
||||
|
||||
// ---- arrows between steps ----
|
||||
for(let i=0;i<3;i++){
|
||||
const c = (i===2) ? RUST : INK;
|
||||
arrow(cols[i]+SW, SY+SH/2, cols[i+1], SY+SH/2, c);
|
||||
}
|
||||
|
||||
// ============ OPTIONAL DOWNWARD BRANCHES (below step 3 & 4) ============
|
||||
const optY=380, optH=110, optW=300;
|
||||
const opts=[
|
||||
{x:cols[2], title:"Agents", sub:"LocalAGI · tools & memory", from:2},
|
||||
{x:cols[3], title:"Distributed mode", sub:"scale across machines", from:3},
|
||||
];
|
||||
|
||||
// "optional" tag
|
||||
txt(cols[2]+SW/2, 345, "OPTIONAL — GO FURTHER", {w:700, sz:14, ls:".18em", a:"middle", fill:DIM});
|
||||
|
||||
opts.forEach(o=>{
|
||||
// dashed connector from parent step bottom down to option box
|
||||
const px = o.x + SW/2;
|
||||
arrow(px, SY+SH, px, optY, COLD, "2 8");
|
||||
// option box (dashed, cold)
|
||||
svg.appendChild(el("rect",{x:o.x+7,y:optY+7,width:optW,height:optH,fill:INK}));
|
||||
svg.appendChild(el("rect",{x:o.x,y:optY,width:optW,height:optH,fill:PAPER,stroke:COLD,"stroke-width":3.5,"stroke-dasharray":"4 7"}));
|
||||
txt(o.x+22, optY+50, o.title, {f:"Bricolage Grotesque", w:800, sz:26, fill:COLD});
|
||||
txt(o.x+22, optY+82, o.sub, {w:700, sz:15, fill:SOFT});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN docs/static/images/diagrams/quickstart-journey.png (vendored, normal file)
After: Size 216 KiB
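Review note on `quickstart-journey.html`: every diagram in this set shares the same `arrow()` helper, which draws a cubic Bézier whose two control points sit at the horizontal midpoint and then adds a two-stroke head. To make its behaviour easier to reason about, here is a sketch of a pure variant that returns the path strings instead of appending elements. `arrowPaths` and its `head` / `inset` parameters are hypothetical names; the defaults 7 and 11 are the literals used in the script.

```javascript
// Hypothetical pure version of the diagrams' arrow() helper: returns the two
// SVG path strings rather than appending <path> elements to the DOM.
function arrowPaths(x1, y1, x2, y2, head = 7, inset = 11) {
  const mx = (x1 + x2) / 2;  // both control points share the midpoint x
  const tipX = x2 - inset;   // the shaft ends `inset` units before x2
  return {
    shaft: `M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${tipX} ${y2}`,
    // two short strokes back from the tip, one up and one down
    head: `M ${tipX} ${y2} l ${-(head + 4)} ${-head} M ${tipX} ${y2} l ${-(head + 4)} ${head}`,
  };
}

// Step 1 -> step 2: cols[0] + SW = 320, cols[1] = 392, y = SY + SH / 2 = 170.
const horizontal = arrowPaths(320, 170, 392, 170);

// Optional branch under step 3: px = cols[2] + SW / 2 = 914, from y = 270 to optY = 380.
const vertical = arrowPaths(914, 270, 914, 380);
```

One observation that follows from the sketch: the head is always drawn pointing right, so on the vertical "optional" connectors the tip lands 11 units to the left of `px` instead of pointing down. That may be intentional, but it seems worth checking against the rendered PNG.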
docs/static/images/diagrams/realtime-pipeline.html (vendored, normal file, 139 lines added)
@@ -0,0 +1,139 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Realtime API</div>
|
||||
<h1>The realtime <em>voice loop</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">WS</div>
|
||||
<div class="s">/ WebRTC</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Voice in, voice out: <b>VAD → STT → LLM → TTS</b>, over WebSocket or WebRTC.</div>
|
||||
<div class="url">localai.io<span>/features/openai-realtime</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ============ PIPELINE ROW ============
|
||||
// Six stages: mic -> VAD -> STT -> LLM -> TTS -> audio out
|
||||
const PY=120, PH=110; // pipeline box top + height
|
||||
const PMID=PY+PH/2;
|
||||
const stages=[
|
||||
{x:24, w:170, n:"Mic audio", s:"caller speaks", fill:PAPER2, edge:false},
|
||||
{x:248, w:170, n:"VAD", s:"speech detect", fill:HI, edge:false},
|
||||
{x:472, w:170, n:"STT", s:"speech to text",fill:HI, edge:false},
|
||||
{x:696, w:170, n:"LLM", s:"reasoning", fill:RUST, edge:true},
|
||||
{x:920, w:170, n:"TTS", s:"text to speech",fill:HI, edge:false},
|
||||
{x:1144, w:170, n:"Audio out", s:"voice reply", fill:PAPER2, edge:false},
|
||||
];
|
||||
txt(24,80,"PIPELINE",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
stages.forEach(st=>{
|
||||
shadowRect(st.x,PY,st.w,PH,st.fill);
|
||||
const tc = st.edge?PAPER:INK;
|
||||
const sc = st.edge?"#F1D9C8":SOFT;
|
||||
txt(st.x+st.w/2,PY+52,st.n,{f:"Bricolage Grotesque",w:800,sz:30,a:"middle",fill:tc});
|
||||
txt(st.x+st.w/2,PY+82,st.s,{w:700,sz:14,a:"middle",fill:sc});
|
||||
});
|
||||
// forward arrows between stages
|
||||
for(let i=0;i<stages.length-1;i++){
|
||||
const a=stages[i], b=stages[i+1];
|
||||
arrow(a.x+a.w, PMID, b.x, PMID, INK);
|
||||
}
|
||||
|
||||
// ============ RETURN LOOP (audio out -> mic, loops back to listener) ============
|
||||
const lastX = stages[5].x+stages[5].w/2;
|
||||
const firstX = stages[0].x+stages[0].w/2;
|
||||
const loopY = PY-58;
|
||||
// path from audio-out top, up, across, down into mic top
|
||||
const lp = `M ${lastX} ${PY} L ${lastX} ${loopY} L ${firstX} ${loopY} L ${firstX} ${PY-11}`;
|
||||
svg.appendChild(el("path",{d:lp,fill:"none",stroke:COLD,"stroke-width":3.5,"stroke-linecap":"round","stroke-linejoin":"round","stroke-dasharray":"2 8"}));
|
||||
// arrowhead pointing down into mic
|
||||
svg.appendChild(el("path",{d:`M ${firstX} ${PY-11} l -7 -11 M ${firstX} ${PY-11} l 7 -11`,fill:"none",stroke:COLD,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
// loop label
|
||||
const llw=216, llx=(lastX+firstX)/2-llw/2;
|
||||
svg.appendChild(el("rect",{x:llx,y:loopY-19,width:llw,height:30,fill:PAPER,stroke:COLD,"stroke-width":2.5}));
|
||||
txt(llx+llw/2,loopY+2,"streamed back to listener",{w:700,sz:14,a:"middle",fill:COLD});
|
||||
|
||||
// ============ TRANSPORT BAND ============
|
||||
const TY=370, TH=120;
|
||||
txt(24,346,"TRANSPORT",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const trans=[
|
||||
{x:248, w:300, n:"WebSocket", s:"raw PCM frames"},
|
||||
{x:592, w:300, n:"WebRTC", s:"Opus · SDP handshake"},
|
||||
];
|
||||
trans.forEach(t=>{
|
||||
shadowRect(t.x,TY,t.w,TH,PAPER,RUSTD,3.5);
|
||||
txt(t.x+24,TY+52,t.n,{f:"Bricolage Grotesque",w:800,sz:30,fill:RUSTD});
|
||||
txt(t.x+24,TY+84,t.s,{w:700,sz:16,fill:SOFT});
|
||||
});
|
||||
// transport feeds the pipeline entry (VAD box, the entry into processing)
|
||||
const entry = stages[1]; // VAD
|
||||
const entryBottomX = entry.x + entry.w/2;
|
||||
trans.forEach(t=>{
|
||||
arrow(t.x+t.w/2, TY, entryBottomX, PY+PH, RUSTD, "2 8");
|
||||
});
|
||||
// label on the feed
|
||||
txt(entryBottomX+18,(TY+PY+PH)/2+6,"audio in",{w:700,sz:14,fill:RUSTD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN docs/static/images/diagrams/realtime-pipeline.png (vendored, normal file)
After: Size 201 KiB
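Review note on `realtime-pipeline.html`: the dashed return loop is a four-point polyline that leaves the top centre of "Audio out", rises above the pipeline row, runs back across, and drops to just above "Mic audio". The sketch below reproduces that path as a pure function under the script's constants. `loopPath`, `rise`, and `inset` are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: builds the return-loop path from the centre-top of the
// last stage, up by `rise`, across, and down to `inset` units above the first stage.
function loopPath(stages, top, rise, inset = 11) {
  const centre = (s) => s.x + s.w / 2;
  const lastX = centre(stages[stages.length - 1]);
  const firstX = centre(stages[0]);
  const loopY = top - rise;
  return {
    d: `M ${lastX} ${top} L ${lastX} ${loopY} L ${firstX} ${loopY} L ${firstX} ${top - inset}`,
    labelCentreX: (lastX + firstX) / 2, // the loop label is centred on the span
  };
}

// Six stages, 170 wide, at the x offsets used in the script; PY = 120, rise = 58.
const stages = [24, 248, 472, 696, 920, 1144].map((x) => ({ x, w: 170 }));
const loop = loopPath(stages, 120, 58);
// loop.d            -> "M 1229 120 L 1229 62 L 109 62 L 109 109"
// loop.labelCentreX -> 669
```

Subtracting half the label width (216 / 2) from `labelCentreX` gives 561, which matches the `llx` the script computes.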
docs/static/images/diagrams/reranker-pipeline.html (vendored, normal file, 171 lines added)
@@ -0,0 +1,171 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Reranker</div>
|
||||
<h1>Two-stage <em>retrieval</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">RE</div>
|
||||
<div class="s">rank</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">A fast retriever finds candidates; the cross-encoder reorders them by true relevance.</div>
|
||||
<div class="url">localai.io<span>/features/reranker</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===== layout columns =====
|
||||
// query | retriever | candidate stack | reranker | results stack | RAG
|
||||
const CY = 280; // vertical center
|
||||
|
||||
// ---------- STAGE LABELS ----------
|
||||
txt(20,30,"STAGE 1 · RECALL",{w:700,sz:14,ls:".2em",fill:COLD});
|
||||
txt(1460,30,"STAGE 2 · PRECISION",{w:700,sz:14,ls:".2em",a:"end",fill:RUSTD});
|
||||
|
||||
// ---------- QUERY ----------
|
||||
const QX=20, QW=140, QH=96, QY=CY-QH/2;
|
||||
shadowRect(QX,QY,QW,QH,PAPER2);
|
||||
txt(QX+QW/2,QY+45,"query",{f:"Bricolage Grotesque",w:800,sz:26,a:"middle"});
|
||||
txt(QX+QW/2,QY+72,"user question",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- RETRIEVER ----------
|
||||
const RX=232, RW=170, RH=130, RY=CY-RH/2;
|
||||
shadowRect(RX,RY,RW,RH,HI,COLD,4);
|
||||
txt(RX+RW/2,RY+48,"retriever",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:COLD});
|
||||
txt(RX+RW/2,RY+78,"embeddings",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
txt(RX+RW/2,RY+100,"vector search",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- CANDIDATE STACK (top-K, unordered) ----------
|
||||
const CSX=474, chW=180, chH=46, chGap=12, nC=5;
|
||||
const stackH = nC*chH + (nC-1)*chGap;
|
||||
let csTop = CY - stackH/2;
|
||||
txt(CSX+chW/2, csTop-16, "top-K candidates", {w:700,sz:14,ls:".04em",a:"middle",fill:SOFT});
|
||||
const candLabels=["doc #14","doc #3","doc #27","doc #8","doc #19"];
|
||||
candLabels.forEach((d,i)=>{
|
||||
const y = csTop + i*(chH+chGap);
|
||||
shadowRect(CSX,y,chW,chH,PAPER,INK,2.5);
|
||||
txt(CSX+16,y+30,d,{f:"Bricolage Grotesque",w:700,sz:18});
|
||||
txt(CSX+chW-14,y+30,"?",{w:700,sz:18,a:"end",fill:DIM});
|
||||
});
|
||||
|
||||
// ---------- CROSS-ENCODER RERANKER ----------
|
||||
const KX=738, KW=200, KH=200, KY=CY-KH/2;
|
||||
shadowRect(KX,KY,KW,KH,RUST,INK,4);
|
||||
txt(KX+KW/2,KY+58,"cross-",{f:"Bricolage Grotesque",w:800,sz:30,a:"middle",fill:PAPER});
|
||||
txt(KX+KW/2,KY+92,"encoder",{f:"Bricolage Grotesque",w:800,sz:30,a:"middle",fill:PAPER});
|
||||
// inner rule
|
||||
svg.appendChild(el("line",{x1:KX+24,y1:KY+118,x2:KX+KW-24,y2:KY+118,stroke:PAPER,"stroke-width":2,"stroke-dasharray":"3 6"}));
|
||||
txt(KX+KW/2,KY+148,"scores each",{w:700,sz:15,a:"middle",fill:"#F1D9C8"});
|
||||
txt(KX+KW/2,KY+170,"query · doc pair",{w:700,sz:15,a:"middle",fill:"#F1D9C8"});
|
||||
|
||||
// ---------- RESULTS STACK (re-ordered, shorter) ----------
|
||||
const OSX=1024, owW=200, owH=58, owGap=14, nO=3;
|
||||
const ostackH = nO*owH + (nO-1)*owGap;
|
||||
let osTop = CY - ostackH/2;
|
||||
txt(OSX+owW/2, osTop-16, "top results", {w:700,sz:14,ls:".04em",a:"middle",fill:RUSTD});
|
||||
const resLabels=[
|
||||
{n:"doc #27", s:"0.98", best:true},
|
||||
{n:"doc #3", s:"0.91", best:false},
|
||||
{n:"doc #14", s:"0.84", best:false},
|
||||
];
|
||||
resLabels.forEach((d,i)=>{
|
||||
const y = osTop + i*(owH+owGap);
|
||||
if(d.best){
|
||||
shadowRect(OSX,y,owW,owH,HI,RUST,4);
|
||||
} else {
|
||||
shadowRect(OSX,y,owW,owH,PAPER,INK,2.5);
|
||||
}
|
||||
txt(OSX+16,y+25,d.n,{f:"Bricolage Grotesque",w:800,sz:20,fill:d.best?RUSTD:INK});
|
||||
txt(OSX+16,y+47,"rank "+(i+1),{w:700,sz:13,fill:SOFT});
|
||||
txt(OSX+owW-14,y+38,d.s,{f:"Bricolage Grotesque",w:800,sz:22,a:"end",fill:d.best?RUST:SOFT});
|
||||
});
|
||||
|
||||
// ---------- RAG / ANSWER ----------
|
||||
const AX=1300, AW=160, AH=110, AY=CY-AH/2;
|
||||
shadowRect(AX,AY,AW,AH,PAPER2,RUST,4);
|
||||
txt(AX+AW/2,AY+48,"into RAG",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:RUSTD});
|
||||
txt(AX+AW/2,AY+76,"grounded answer",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- ARROWS ----------
|
||||
// query -> retriever
|
||||
arrow(QX+QW, CY, RX, CY, INK);
|
||||
// retriever -> candidate stack (fan to each chip)
|
||||
candLabels.forEach((d,i)=>{
|
||||
const y = csTop + i*(chH+chGap) + chH/2;
|
||||
arrow(RX+RW, CY, CSX, y, COLD);
|
||||
});
|
||||
// candidate stack -> reranker (fan in)
|
||||
candLabels.forEach((d,i)=>{
|
||||
const y = csTop + i*(chH+chGap) + chH/2;
|
||||
arrow(CSX+chW, y, KX, CY, RUSTD, "2 8");
|
||||
});
|
||||
// reranker -> results stack (fan out)
|
||||
resLabels.forEach((d,i)=>{
|
||||
const y = osTop + i*(owH+owGap) + owH/2;
|
||||
arrow(KX+KW, CY, OSX, y, RUST);
|
||||
});
|
||||
// results stack -> RAG
|
||||
arrow(OSX+owW, CY, AX, CY, RUST);
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN docs/static/images/diagrams/reranker-pipeline.png (vendored, normal file)
After: Size 234 KiB
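Review note on `reranker-pipeline.html`: both chip stacks are centred vertically on `CY`, and the fan-in / fan-out arrows target the vertical centre of each chip. The script repeats that arithmetic in several loops; a single helper could express it once. The sketch below is one way to do so, assuming the same sizes as the script. `stackCentres` is a hypothetical name, not part of the file.

```javascript
// Hypothetical helper: vertical centres of n chips of height h, separated by
// gap, with the whole stack centred on cy.
function stackCentres(n, h, gap, cy) {
  const total = n * h + (n - 1) * gap; // full stack height
  const top = cy - total / 2;          // y of the first chip's top edge
  return Array.from({ length: n }, (_, i) => top + i * (h + gap) + h / 2);
}

const CY = 280;
const candidateYs = stackCentres(5, 46, 12, CY); // [164, 222, 280, 338, 396]
const resultYs = stackCentres(3, 58, 14, CY);    // [208, 280, 352]
```

Because both stacks hold an odd number of chips, the middle chip's centre equals `CY`, so the middle arrow of each fan is a straight horizontal line.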
docs/static/images/diagrams/reverse-proxy-tls.html (vendored, normal file, 175 lines added)
@@ -0,0 +1,175 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Deployment</div>
|
||||
<h1>TLS at the <em>edge</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">X-FWD</div>
|
||||
<div class="s">headers</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Terminate TLS at the proxy; <b>forwarded headers let LocalAI emit correct https asset URLs.</b></div>
|
||||
<div class="url">localai.io<span>/docs</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ===================================================================
|
||||
// Left-to-right topology: browser -> reverse proxy -> LocalAI
|
||||
// ===================================================================
|
||||
|
||||
// vertical center for the connecting spine
|
||||
const MIDY = 250;
|
||||
|
||||
// ---------- BROWSER (left) ----------
|
||||
const BX=30, BW=300, BH=150, BY=MIDY-BH/2;
|
||||
txt(BX+4,BY-22,"CLIENT",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
shadowRect(BX,BY,BW,BH,PAPER2);
|
||||
// little browser chrome line
|
||||
svg.appendChild(el("line",{x1:BX,y1:BY+44,x2:BX+BW,y2:BY+44,stroke:INK,"stroke-width":2.5}));
|
||||
svg.appendChild(el("circle",{cx:BX+24,cy:BY+24,r:6,fill:COLD}));
|
||||
svg.appendChild(el("circle",{cx:BX+46,cy:BY+24,r:6,fill:DIM}));
|
||||
svg.appendChild(el("circle",{cx:BX+68,cy:BY+24,r:6,fill:DIM}));
|
||||
txt(BX+BW/2,BY+90,"Browser",{f:"Bricolage Grotesque",w:800,sz:30,a:"middle"});
|
||||
txt(BX+BW/2,BY+122,"requests https://host/...",{w:700,sz:15,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- REVERSE PROXY (center) ----------
|
||||
const PX=540, PW=400, PH=300, PY=MIDY-PH/2;
|
||||
txt(PX+4,PY-22,"EDGE",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
shadowRect(PX,PY,PW,PH,PAPER,INK,4);
|
||||
// rust title bar
|
||||
svg.appendChild(el("rect",{x:PX,y:PY,width:PW,height:60,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:PX,y1:PY+60,x2:PX+PW,y2:PY+60,stroke:INK,"stroke-width":4}));
|
||||
txt(PX+24,PY+39,"Reverse proxy",{f:"Bricolage Grotesque",w:800,sz:28,fill:PAPER});
|
||||
txt(PX+PW-24,PY+38,"nginx · caddy · traefik",{w:700,sz:13,ls:".04em",a:"end",fill:"#F1D9C8"});
|
||||
// TLS terminated banner
|
||||
const tlsY=PY+80;
|
||||
svg.appendChild(el("rect",{x:PX+24,y:tlsY,width:PW-48,height:50,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
// lock glyph
|
||||
const lx=PX+44, ly=tlsY+25;
|
||||
svg.appendChild(el("rect",{x:lx-9,y:ly-4,width:18,height:15,fill:RUSTD}));
|
||||
svg.appendChild(el("path",{d:`M ${lx-6} ${ly-4} v -5 a 6 6 0 0 1 12 0 v 5`,fill:"none",stroke:RUSTD,"stroke-width":3}));
|
||||
txt(PX+68,tlsY+33,"TLS terminated here",{f:"Bricolage Grotesque",w:800,sz:21});
|
||||
// injected headers list
|
||||
const hY=tlsY+72;
|
||||
txt(PX+24,hY,"injects forwarded headers:",{w:700,sz:14,fill:SOFT});
|
||||
const hdrs=["X-Forwarded-Proto: https","X-Forwarded-Host","X-Forwarded-Prefix"];
|
||||
hdrs.forEach((h,i)=>{
|
||||
const ry=hY+16+i*40;
|
||||
svg.appendChild(el("rect",{x:PX+24,y:ry,width:PW-48,height:30,fill:PAPER2,stroke:INK,"stroke-width":2}));
|
||||
txt(PX+38,ry+21,h,{f:"Bricolage Grotesque",w:700,sz:17});
|
||||
});
|
||||
|
||||
// ---------- LOCALAI (right) ----------
|
||||
const LX=1150, LW=300, LH=300, LY=MIDY-LH/2;
|
||||
txt(LX+LW,LY-22,"ORIGIN",{w:700,sz:14,ls:".2em",a:"end",fill:SOFT});
|
||||
shadowRect(LX,LY,LW,LH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:LX,y:LY,width:LW,height:60,fill:COLD}));
|
||||
svg.appendChild(el("line",{x1:LX,y1:LY+60,x2:LX+LW,y2:LY+60,stroke:INK,"stroke-width":4}));
|
||||
txt(LX+24,LY+39,"LocalAI",{f:"Bricolage Grotesque",w:800,sz:28,fill:PAPER});
|
||||
// BaseURL middleware box
|
||||
const mwY=LY+80;
|
||||
svg.appendChild(el("rect",{x:LX+24,y:mwY,width:LW-48,height:54,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(LX+40,mwY+24,"BaseURL middleware",{f:"Bricolage Grotesque",w:800,sz:19});
|
||||
txt(LX+40,mwY+44,"reads X-Forwarded-*",{w:700,sz:14,fill:SOFT});
|
||||
// output: https asset URLs
|
||||
const oY=mwY+74;
|
||||
txt(LX+24,oY,"emits asset URLs:",{w:700,sz:14,fill:SOFT});
|
||||
svg.appendChild(el("rect",{x:LX+24,y:oY+14,width:LW-48,height:44,fill:PAPER2,stroke:INK,"stroke-width":2.5}));
|
||||
txt(LX+40,oY+36,"https://host/...",{f:"Bricolage Grotesque",w:800,sz:20,fill:RUSTD});
|
||||
txt(LX+40,oY+54,"correct scheme · host · prefix",{w:700,sz:12,fill:SOFT});
|
||||
txt(LX+24,oY+86,"serves on plain HTTP",{w:700,sz:14,fill:SOFT});
|
||||
|
||||
// ===================================================================
|
||||
// CONNECTORS
|
||||
// ===================================================================
|
||||
// browser -> proxy : HTTPS (solid rust, encrypted leg)
|
||||
arrow(BX+BW, MIDY, PX, MIDY, RUST);
|
||||
// leg label + lock
|
||||
const seg1mid=(BX+BW+PX)/2;
|
||||
svg.appendChild(el("rect",{x:seg1mid-58,y:MIDY-46,width:116,height:34,fill:PAPER,stroke:RUST,"stroke-width":2.5}));
|
||||
// small lock on label
|
||||
const slx=seg1mid-40, sly=MIDY-29;
|
||||
svg.appendChild(el("rect",{x:slx-6,y:sly-2,width:12,height:10,fill:RUST}));
|
||||
svg.appendChild(el("path",{d:`M ${slx-4} ${sly-2} v -3 a 4 4 0 0 1 8 0 v 3`,fill:"none",stroke:RUST,"stroke-width":2.5}));
|
||||
txt(seg1mid+8,MIDY-23,"HTTPS",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:RUSTD});
|
||||
txt(seg1mid,MIDY+34,"encrypted",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
// proxy -> LocalAI : HTTP (dashed cold, plaintext internal leg)
|
||||
arrow(PX+PW, MIDY, LX, MIDY, COLD, "2 8");
|
||||
const seg2mid=(PX+PW+LX)/2;
|
||||
svg.appendChild(el("rect",{x:seg2mid-52,y:MIDY-46,width:104,height:34,fill:PAPER,stroke:COLD,"stroke-width":2.5}));
|
||||
txt(seg2mid,MIDY-23,"HTTP",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",fill:COLD});
|
||||
txt(seg2mid,MIDY+34,"internal · no TLS",{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- TLS boundary marker (vertical, at proxy right edge) ----------
|
||||
// boundary line just right of the proxy where TLS ends
|
||||
const bx=PX+PW+18;
|
||||
svg.appendChild(el("line",{x1:bx,y1:LY-6,x2:bx,y2:LY+LH+6,stroke:RUSTD,"stroke-width":3,"stroke-dasharray":"3 8"}));
|
||||
const lbW=150,lbH=30,lbx=bx-lbW/2,lby=LY+LH-6;
|
||||
svg.appendChild(el("rect",{x:lbx,y:lby,width:lbW,height:lbH,fill:PAPER,stroke:RUSTD,"stroke-width":2.5}));
|
||||
txt(bx,lby+21,"TLS ENDS HERE",{f:"Bricolage Grotesque",w:800,sz:14,a:"middle",ls:".03em",fill:RUSTD});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/reverse-proxy-tls.png
vendored
Normal file
|
After Width: | Height: | Size: 217 KiB |
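The reverse-proxy-tls diagram above shows a BaseURL middleware reading `X-Forwarded-*` headers so that asset URLs carry the correct scheme, host, and prefix. A minimal sketch of that idea follows. `baseURLFromHeaders` and its `fallback` parameter are hypothetical names for illustration, not LocalAI's actual middleware, and the comma-splitting and slash normalisation are assumptions about how a proxy might send the values.

```javascript
// Hypothetical helper, NOT LocalAI's real BaseURL middleware.
// headers:  plain object of request headers (any key casing)
// fallback: { proto, host } used when a forwarded header is absent
function baseURLFromHeaders(headers, fallback) {
  // Header names are case-insensitive, so normalise keys to lowercase.
  const h = {};
  for (const [k, v] of Object.entries(headers)) h[k.toLowerCase()] = String(v);

  // Assumption: chained proxies may send comma-separated values; use the first.
  const first = (name) => (h[name] || "").split(",")[0].trim();

  const proto = first("x-forwarded-proto") || fallback.proto;
  const host = first("x-forwarded-host") || fallback.host;

  // Normalise the prefix to "/segment" form without a trailing slash.
  let prefix = first("x-forwarded-prefix").replace(/\/+$/, "");
  if (prefix && !prefix.startsWith("/")) prefix = "/" + prefix;

  return `${proto}://${host}${prefix}/`;
}

console.log(
  baseURLFromHeaders(
    {
      "X-Forwarded-Proto": "https",
      "X-Forwarded-Host": "ai.example.com",
      "X-Forwarded-Prefix": "/localai/",
    },
    { proto: "http", host: "127.0.0.1:8080" }
  )
); // → https://ai.example.com/localai/
```

Forwarded headers can be set by any client unless the proxy overwrites them, so a deployment would normally honour them only when the request arrives from a trusted proxy.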
171
docs/static/images/diagrams/smartrouter-scheduling.html
vendored
Normal file
@@ -0,0 +1,171 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> SmartRouter</div>
|
||||
<h1>How the router <em>places a request</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">IDLE</div>
|
||||
<div class="s">first</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Idle-first placement with <b>preemptive least-recently-used eviction.</b></div>
|
||||
<div class="url">localai.io<span>/features/distributed-mode</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
// straight arrow with explicit endpoint arrowhead (any direction)
|
||||
function lineArrow(x1,y1,x2,y2,color,dash){
|
||||
svg.appendChild(el("line",{x1,y1,x2,y2,stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const ang=Math.atan2(y2-y1,x2-x1), a=8, sp=0.5;
|
||||
const ax=x2-Math.cos(ang)*2, ay=y2-Math.sin(ang)*2;
|
||||
svg.appendChild(el("path",{d:`M ${ax} ${ay} l ${-(a+5)*Math.cos(ang-sp)} ${-(a+5)*Math.sin(ang-sp)} M ${ax} ${ay} l ${-(a+5)*Math.cos(ang+sp)} ${-(a+5)*Math.sin(ang+sp)}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- DIAMOND (decision) ----------
|
||||
function diamond(cx,cy,hw,hh,lines,o){
|
||||
o=o||{};
|
||||
const pts=`${cx},${cy-hh} ${cx+hw},${cy} ${cx},${cy+hh} ${cx-hw},${cy}`;
|
||||
// hard offset shadow
|
||||
svg.appendChild(el("polygon",{points:`${cx+7},${cy-hh+7} ${cx+hw+7},${cy+7} ${cx+7},${cy+hh+7} ${cx-hw+7},${cy+7}`,fill:INK}));
|
||||
svg.appendChild(el("polygon",{points:pts,fill:o.fill||HI,stroke:INK,"stroke-width":o.sw||3.5}));
|
||||
const n=lines.length, lh=o.lh||21, start=cy-((n-1)*lh)/2+6;
|
||||
lines.forEach((ln,i)=>txt(cx,start+i*lh,ln,{f:"Bricolage Grotesque",w:700,sz:o.sz||17,a:"middle"}));
|
||||
}
|
||||
// ---------- OUTCOME RECT ----------
|
||||
function outcome(cx,cy,w,h,lines,o){
|
||||
o=o||{};
|
||||
shadowRect(cx-w/2,cy-h/2,w,h,o.fill||PAPER2,o.stroke,o.sw);
|
||||
const n=lines.length, lh=o.lh||23, start=cy-((n-1)*lh)/2+7;
|
||||
lines.forEach((ln,i)=>txt(cx,start+i*lh,ln,{f:"Bricolage Grotesque",w:o.w||800,sz:o.sz||19,a:"middle",fill:o.tfill||INK}));
|
||||
}
|
||||
// branch label pill
|
||||
function label(x,y,s,color){
|
||||
const w=s.length*8.5+22, h=24;
|
||||
svg.appendChild(el("rect",{x:x-w/2,y:y-h/2,width:w,height:h,fill:PAPER,stroke:color,"stroke-width":2.5}));
|
||||
txt(x,y+6,s,{w:700,sz:13,ls:".06em",a:"middle",fill:color});
|
||||
}
|
||||
|
||||
// ===== LAYOUT =====
|
||||
// Left column: the decision spine (diamonds). Right side at each level: the YES outcome.
|
||||
const DX=430;          // diamond center x
const OX=1090;         // outcome center x
|
||||
// diamonds (decisions): ink stroke; fill is chosen where the shapes are drawn
|
||||
const decisions=[
|
||||
["model already","loaded on a node?"],
|
||||
["node with","free VRAM?"],
|
||||
["idle node","available?"],
|
||||
["can evict an LRU node","with zero in-flight?"],
|
||||
];
|
||||
const dCX=DX, dHW=176, dHH=52;
|
||||
const dCY=[88,210,332,454];
|
||||
|
||||
// YES outcomes (right)
|
||||
const yesOut=[
|
||||
{l:["route there","(done)"],fill:"#EFE0BF"},
|
||||
{l:["load there"],fill:"#EFE0BF"},
|
||||
{l:["load there"],fill:"#EFE0BF"},
|
||||
{l:["evict + load"],fill:"#EFE0BF"},
|
||||
];
|
||||
const oW=290, oH=60;
|
||||
|
||||
// bottom row geometry (wait-then-evict + terminal action)
|
||||
const botY=556;
|
||||
const waitCX=dCX, waitCY=botY, waitW=330, waitH=58;
|
||||
const taW=360, taH=64, taCX=OX, taCY=botY;
|
||||
|
||||
// ========== 1) CONNECTORS (drawn first, shapes sit on top) ==========
|
||||
// NO spine: diamond i bottom -> diamond i+1 top
|
||||
for(let i=0;i<3;i++) lineArrow(dCX, dCY[i]+dHH, dCX, dCY[i+1]-dHH, INK);
|
||||
// final NO: last diamond bottom -> wait-then-evict box
|
||||
lineArrow(dCX, dCY[3]+dHH, dCX, waitCY-waitH/2, RUST);
|
||||
// YES connectors: diamond right vertex -> yes outcome left edge
|
||||
for(let i=0;i<4;i++) lineArrow(dCX+dHW, dCY[i], OX-oW/2, dCY[i], i<3?COLD:RUST);
|
||||
// funnel the load/evict outcomes (idx 1,2,3) downward into the terminal action
|
||||
[1,2,3].forEach(i=>{
|
||||
lineArrow(OX, dCY[i]+oH/2, OX, (i===3? taCY-taH/2 : dCY[i+1]-oH/2), COLD, "2 7");
|
||||
});
|
||||
// wait-then-evict -> terminal action
|
||||
lineArrow(waitCX+waitW/2, waitCY, taCX-taW/2, taCY, RUST);
|
||||
|
||||
// ========== 2) SHAPES ==========
|
||||
decisions.forEach((d,i)=> diamond(dCX,dCY[i],dHW,dHH,d,{fill:i<3?HI:"#E9D2B0",sz:18}) );
|
||||
yesOut.forEach((o,i)=> outcome(OX,dCY[i],oW,oH,o.l,{fill:o.fill,sz:21}) );
|
||||
outcome(waitCX,waitCY,waitW,waitH,["wait, then evict"],{fill:PAPER,stroke:RUST,sw:3.5,sz:21,tfill:RUSTD});
|
||||
outcome(taCX,taCY,taW,taH,["backend.install","+ LoadModel"],{fill:RUST,sz:22,tfill:PAPER,lh:25});
|
||||
|
||||
// ========== 3) BRANCH LABELS (on top) ==========
|
||||
for(let i=0;i<3;i++) label(dCX, (dCY[i]+dHH+dCY[i+1]-dHH)/2, "NO", RUSTD);
|
||||
label(dCX, (dCY[3]+dHH+waitCY-waitH/2)/2, "NO", RUST);
|
||||
for(let i=0;i<4;i++) label((dCX+dHW+OX-oW/2)/2, dCY[i]-16, "YES", i<3?COLD:RUST);
|
||||
|
||||
// request tag (left of the first diamond, clear of the spine)
|
||||
txt(dCX-dHW-14, dCY[0]+5, "REQUEST", {w:700,sz:13,ls:".16em",a:"end",fill:SOFT});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/smartrouter-scheduling.png
vendored
Normal file
|
After Width: | Height: | Size: 294 KiB |
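The smartrouter-scheduling diagram above is a four-question decision chain ending in a "wait, then evict" fallback. It can be sketched as a plain function. Everything here is a hypothetical illustration rather than the router's real code: the node shape, the `placeRequest` name, the `neededVRAM` parameter, and in particular the reading of "idle" as "nothing loaded and nothing in flight" are all assumptions.

```javascript
// Hypothetical node shape (not LocalAI's real types):
//   { id, models: string[], freeVRAM: number, inFlight: number, lastUsed: number }
function placeRequest(nodes, model, neededVRAM) {
  // 1) model already loaded on a node? -> route there (done)
  const loaded = nodes.find((n) => n.models.includes(model));
  if (loaded) return { action: "route", node: loaded.id };

  // 2) node with free VRAM? -> load there
  const roomy = nodes.find((n) => n.freeVRAM >= neededVRAM);
  if (roomy) return { action: "load", node: roomy.id };

  // 3) idle node available? -> load there
  //    Assumption: "idle" means nothing loaded and nothing in flight.
  const idle = nodes.find((n) => n.models.length === 0 && n.inFlight === 0);
  if (idle) return { action: "load", node: idle.id };

  // 4) can evict an LRU node with zero in-flight? -> evict + load
  const evictable = nodes
    .filter((n) => n.inFlight === 0)
    .sort((a, b) => a.lastUsed - b.lastUsed); // least recently used first
  if (evictable.length > 0) {
    return { action: "evict+load", node: evictable[0].id };
  }

  // Final NO: every node is busy -> wait, then evict
  return { action: "wait-then-evict", node: null };
}
```

In the diagram, the load and evict outcomes all funnel into the same terminal step (`backend.install` + `LoadModel`), which is why the sketch only returns a decision and leaves the loading itself to the caller.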
142
docs/static/images/diagrams/tool-call-parsers.html
vendored
Normal file
@@ -0,0 +1,142 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Function calling</div>
|
||||
<h1>Same request, <em>any backend</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">TOOLS</div>
|
||||
<div class="s">native</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">One tool-call request shape; <b>each backend's native parser extracts the calls.</b></div>
|
||||
<div class="url">localai.io<span>/features/openai-functions</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// ---------- REQUEST (far left) ----------
|
||||
txt(20,40,"REQUEST",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const RQX=20, RQY=185, RQW=222, RQH=180;
|
||||
shadowRect(RQX,RQY,RQW,RQH,PAPER2);
|
||||
txt(RQX+RQW/2,RQY+44,"OpenAI-shaped",{f:"Bricolage Grotesque",w:800,sz:23,a:"middle"});
|
||||
txt(RQX+RQW/2,RQY+70,"chat request",{f:"Bricolage Grotesque",w:800,sz:23,a:"middle"});
|
||||
// tools chip
|
||||
const tcW=148,tcH=30,tcx=RQX+(RQW-tcW)/2,tcy=RQY+96;
|
||||
svg.appendChild(el("rect",{x:tcx,y:tcy,width:tcW,height:tcH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(tcx+tcW/2,tcy+21,"tools: [ ... ]",{f:"Bricolage Grotesque",w:800,sz:16,a:"middle"});
|
||||
txt(RQX+RQW/2,RQY+154,"tool_choice: auto",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- LocalAI extraction (center) ----------
|
||||
const EXX=312, EXY=175, EXW=250, EXH=200;
|
||||
shadowRect(EXX,EXY,EXW,EXH,PAPER,INK,4);
|
||||
svg.appendChild(el("rect",{x:EXX,y:EXY,width:EXW,height:60,fill:RUST}));
|
||||
svg.appendChild(el("line",{x1:EXX,y1:EXY+60,x2:EXX+EXW,y2:EXY+60,stroke:INK,"stroke-width":4}));
|
||||
txt(EXX+EXW/2,EXY+38,"LocalAI",{f:"Bricolage Grotesque",w:800,sz:28,a:"middle",fill:PAPER});
|
||||
txt(EXX+EXW/2,EXY+98,"tool-call",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(EXX+EXW/2,EXY+126,"extraction",{f:"Bricolage Grotesque",w:800,sz:24,a:"middle"});
|
||||
txt(EXX+EXW/2,EXY+162,"picks the right parser",{w:700,sz:14,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- PARSERS (per backend, 3 stacked) ----------
|
||||
txt(640,40,"NATIVE PARSERS",{w:700,sz:14,ls:".2em",fill:SOFT});
|
||||
const PX=628, PW=290, PH=120, pRows=[70,222,374];
|
||||
const parsers=[
|
||||
{n:"llama.cpp", s:"C++ autoparser"},
|
||||
{n:"vLLM", s:"ToolParserManager"},
|
||||
{n:"MLX", s:"template auto-detect"},
|
||||
];
|
||||
parsers.forEach((p,i)=>{
|
||||
const y=pRows[i];
|
||||
shadowRect(PX,y,PW,PH,"#EFE0BF");
|
||||
txt(PX+22,y+50,p.n,{f:"Bricolage Grotesque",w:800,sz:26});
|
||||
txt(PX+22,y+82,p.s,{w:700,sz:16,fill:SOFT});
|
||||
});
|
||||
|
||||
// ---------- RESPONSE (far right) ----------
|
||||
txt(1460,40,"RESPONSE",{w:700,sz:14,ls:".2em",a:"end",fill:SOFT});
|
||||
const RSX=1058, RSY=185, RSW=222, RSH=180;
|
||||
shadowRect(RSX,RSY,RSW,RSH,PAPER,RUST,4);
|
||||
txt(RSX+RSW/2,RSY+44,"Uniform",{f:"Bricolage Grotesque",w:800,sz:23,a:"middle"});
|
||||
txt(RSX+RSW/2,RSY+70,"response",{f:"Bricolage Grotesque",w:800,sz:23,a:"middle"});
|
||||
const rcW=170,rcH=30,rcx=RSX+(RSW-rcW)/2,rcy=RSY+96;
|
||||
svg.appendChild(el("rect",{x:rcx,y:rcy,width:rcW,height:rcH,fill:HI,stroke:INK,"stroke-width":2.5}));
|
||||
txt(rcx+rcW/2,rcy+21,"tool_calls: [ ... ]",{f:"Bricolage Grotesque",w:800,sz:15,a:"middle"});
|
||||
txt(RSX+RSW/2,RSY+154,"identical for every backend",{w:700,sz:12.5,a:"middle",fill:SOFT});
|
||||
|
||||
// ---------- ARROWS ----------
|
||||
// request -> extraction
|
||||
arrow(RQX+RQW, RQY+RQH/2, EXX, EXY+EXH/2, INK);
|
||||
// extraction -> parsers (fan out)
|
||||
const exMid=EXY+EXH/2;
|
||||
parsers.forEach((p,i)=>{
|
||||
const y=pRows[i]+PH/2;
|
||||
arrow(EXX+EXW, exMid+(i-1)*46, PX, y, RUSTD);
|
||||
});
|
||||
// parsers -> response (converge)
|
||||
const rsMid=RSY+RSH/2;
|
||||
parsers.forEach((p,i)=>{
|
||||
const y=pRows[i]+PH/2;
|
||||
arrow(PX+PW, y, RSX, rsMid+(i-1)*46, RUSTD);
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/static/images/diagrams/tool-call-parsers.png
vendored
Normal file
|
After Width: | Height: | Size: 231 KiB |
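The tool-call-parsers diagram above describes one request shape fanning out to per-backend parsers and converging on a uniform `tool_calls` response. A minimal sketch of that dispatch-and-normalise pattern follows. The backend names `backend-a` and `backend-b`, the two raw output formats, and the `extractToolCalls` helper are all invented for illustration; the real backends rely on their own native parsers, which are not reproduced here.

```javascript
// Toy parsers keyed by a hypothetical backend name. Each returns a list of
// { name, arguments } objects. The raw formats are invented for this sketch.
const parsers = {
  // Assumed toy format: a JSON object {"name": ..., "arguments": {...}}
  "backend-a": (raw) => [JSON.parse(raw)],
  // Assumed toy format: name(json-args), e.g. get_weather({"city":"Rome"})
  "backend-b": (raw) => {
    const m = /^(\w+)\((.*)\)$/s.exec(raw.trim());
    return m ? [{ name: m[1], arguments: JSON.parse(m[2] || "{}") }] : [];
  },
};

// Pick the parser for the backend, then normalise its result into
// OpenAI-style tool_calls entries so callers see one shape.
function extractToolCalls(backend, raw) {
  const parse = parsers[backend];
  if (!parse) throw new Error(`no parser registered for ${backend}`);
  return parse(raw).map((call, i) => ({
    id: `call_${i}`,
    type: "function",
    function: { name: call.name, arguments: JSON.stringify(call.arguments) },
  }));
}
```

Keeping the normalisation in one place means a new backend only needs a parser entry; the response shape seen by clients does not change.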
158
docs/static/images/diagrams/voice-recognition-flow.html
vendored
Normal file
@@ -0,0 +1,158 @@
|
||||
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root{
|
||||
--paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
|
||||
--rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
|
||||
}
|
||||
*{box-sizing:border-box;margin:0;padding:0}
|
||||
html,body{width:1600px;height:900px}
|
||||
body{
|
||||
background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
|
||||
position:relative;overflow:hidden;
|
||||
background-image:
|
||||
linear-gradient(var(--paper2) 1px,transparent 1px),
|
||||
linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
|
||||
background-size:40px 40px;
|
||||
}
|
||||
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
|
||||
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
|
||||
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
|
||||
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
|
||||
.eyebrow b{color:var(--ink)}
|
||||
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
|
||||
h1 em{font-style:normal;color:var(--rust)}
|
||||
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
|
||||
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
|
||||
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
|
||||
.stage{flex:1;margin-top:8px}
|
||||
svg{width:100%;height:100%;overflow:visible}
|
||||
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
|
||||
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
|
||||
.note b{color:var(--ink)}
|
||||
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
|
||||
.url span{color:var(--ink)}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="frame"></div>
|
||||
<div class="wrap">
|
||||
<header>
|
||||
<div>
|
||||
<div class="eyebrow">LocalAI <b>·</b> Voice Recognition</div>
|
||||
<h1>Register, identify, <em>forget</em></h1>
|
||||
</div>
|
||||
<div class="stamp">
|
||||
<div class="k">1:N</div>
|
||||
<div class="s">match</div>
|
||||
</div>
|
||||
</header>
|
||||
<div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
|
||||
<footer>
|
||||
<div class="note">Voiceprints in a vector store: <b>1:1 verify, or 1:N identify.</b></div>
|
||||
<div class="url">localai.io<span>/features/voice-recognition</span></div>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
|
||||
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
|
||||
const svg=document.getElementById("svg");
|
||||
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
|
||||
svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
|
||||
svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
|
||||
}
|
||||
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
|
||||
function arrow(x1,y1,x2,y2,color,dash){
|
||||
const mx=(x1+x2)/2;
|
||||
svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
|
||||
const a=7;
|
||||
svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
|
||||
}
|
||||
|
||||
// Reusable step box helper
|
||||
function stepBox(x,y,w,h,fill,title,sub,titleColor){
|
||||
shadowRect(x,y,w,h,fill);
|
||||
if(sub){
|
||||
txt(x+w/2,y+h/2-2,title,{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:titleColor||INK});
|
||||
txt(x+w/2,y+h/2+20,sub,{w:700,sz:13,a:"middle",fill:SOFT});
|
||||
} else {
|
||||
txt(x+w/2,y+h/2+7,title,{f:"Bricolage Grotesque",w:800,sz:21,a:"middle",fill:titleColor||INK});
|
||||
}
|
||||
}
|
||||
|
||||
// ===================== CENTRAL VECTOR STORE =====================
const VSX=590, VSW=300, VSY=200, VSH=160;
shadowRect(VSX,VSY,VSW,VSH,PAPER,INK,4);
svg.appendChild(el("rect",{x:VSX,y:VSY,width:VSW,height:54,fill:RUST}));
svg.appendChild(el("line",{x1:VSX,y1:VSY+54,x2:VSX+VSW,y2:VSY+54,stroke:INK,"stroke-width":4}));
txt(VSX+VSW/2,VSY+36,"Vector store",{f:"Bricolage Grotesque",w:800,sz:26,a:"middle",fill:PAPER});
// voiceprint rows
const rows=["alice · [0.12, -0.4 …]","bob · [-0.9, 0.3 …]","carol · [0.5, 0.07 …]"];
let ry=VSY+74;
rows.forEach(r=>{
  svg.appendChild(el("rect",{x:VSX+18,y:ry,width:VSW-36,height:26,fill:HI,stroke:INK,"stroke-width":2}));
  txt(VSX+30,ry+18,r,{f:"Bricolage Grotesque",w:700,sz:14,fill:INK});
  ry+=30;
});
txt(VSX+VSW/2,VSY+VSH+24,"voiceprints (embeddings)",{w:700,sz:13,a:"middle",fill:SOFT});
// ===================== REGISTER (top row) =====================
txt(40,46,"REGISTER",{w:700,sz:15,ls:".2em",fill:RUSTD});
const rH=58, rY=64;
stepBox(40,rY,180,rH,PAPER2,"audio","enrollment clip");
stepBox(300,rY,180,rH,HI,"embedding","speaker model",COLD);
arrow(220,rY+rH/2,300,rY+rH/2,INK);
// embedding -> store (into top of vector store, landing left of title)
arrow(480,rY+rH/2,VSX+70,VSY,RUST);
txt(528,rY+rH/2-8,"store",{w:700,sz:14,fill:RUSTD});
// ===================== IDENTIFY (bottom flow, left to right) =====================
txt(40,432,"IDENTIFY",{w:700,sz:15,ls:".2em",fill:RUSTD});
const iH=66, iY=448;
stepBox(40,iY,178,iH,PAPER2,"probe audio","unknown");
stepBox(258,iY,178,iH,HI,"embedding","speaker model",COLD);
stepBox(700,iY,210,iH,PAPER,"top-K cosine","nearest match",RUST);
stepBox(960,iY,178,iH,PAPER2,"speaker","identity + score");
arrow(218,iY+iH/2,258,iY+iH/2,INK);
// embedding up into store
arrow(436+30,iY+iH/2,VSX+90,VSY+VSH,COLD,"2 8");
txt(490,iY+10,"query",{w:700,sz:13,fill:COLD});
// store down to top-K match
arrow(VSX+VSW-50,VSY+VSH,700+10,iY+iH/2,RUST);
txt(640,iY-6,"candidates",{w:700,sz:13,a:"middle",fill:RUSTD});
arrow(910,iY+iH/2,960,iY+iH/2,INK);
// ===================== FORGET (right side) =====================
txt(1238,46,"FORGET",{w:700,sz:15,ls:".2em",a:"start",fill:RUSTD});
const fX=1190, fW=250, fY=64, fH=110;
svg.appendChild(el("rect",{x:fX,y:fY,width:fW,height:fH,fill:PAPER,stroke:DIM,"stroke-width":3.5,"stroke-dasharray":"4 7"}));
txt(fX+fW/2,fY+44,"remove entry",{f:"Bricolage Grotesque",w:800,sz:22,a:"middle",fill:SOFT});
txt(fX+fW/2,fY+72,"delete a voiceprint",{w:700,sz:14,a:"middle",fill:DIM});
txt(fX+fW/2,fY+95,"DELETE /forget",{w:700,sz:13,a:"middle",ls:".06em",fill:RUSTD});
// forget -> store (dashed, removing)
arrow(fX,fY+fH/2,VSX+VSW,VSY+30,DIM,"2 8");
// ===================== LEGEND (corner): verify vs identify =====================
const lgX=1130, lgY=300, lgW=310, lgH=210;
svg.appendChild(el("rect",{x:lgX,y:lgY,width:lgW,height:lgH,fill:PAPER2,stroke:INK,"stroke-width":3}));
txt(lgX+lgW/2,lgY+34,"VERIFY vs IDENTIFY",{f:"Bricolage Grotesque",w:800,sz:18,a:"middle",ls:".04em",fill:INK});
svg.appendChild(el("line",{x1:lgX+16,y1:lgY+48,x2:lgX+lgW-16,y2:lgY+48,stroke:INK,"stroke-width":2}));
// verify (1:1) - cold
svg.appendChild(el("rect",{x:lgX+20,y:lgY+66,width:34,height:34,fill:COLD}));
txt(lgX+37,lgY+89,"1:1",{f:"Bricolage Grotesque",w:800,sz:14,a:"middle",fill:PAPER});
txt(lgX+66,lgY+82,"verify",{f:"Bricolage Grotesque",w:800,sz:18,fill:COLD});
txt(lgX+66,lgY+101,"is this the claimed person?",{w:600,sz:13,fill:SOFT});
// identify (1:N) - rust
svg.appendChild(el("rect",{x:lgX+20,y:lgY+128,width:34,height:34,fill:RUST}));
txt(lgX+37,lgY+151,"1:N",{f:"Bricolage Grotesque",w:800,sz:13,a:"middle",fill:PAPER});
txt(lgX+66,lgY+144,"identify",{f:"Bricolage Grotesque",w:800,sz:18,fill:RUST});
txt(lgX+66,lgY+163,"who is this, out of N?",{w:600,sz:13,fill:SOFT});
txt(lgX+20,lgY+193,"both search the same store.",{w:700,sz:13,fill:INK});
</script>
</body>
</html>
BIN docs/static/images/diagrams/voice-recognition-flow.png (vendored, normal file)
Size: 261 KiB
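Review note on the voice-recognition diagram above: the REGISTER, IDENTIFY, FORGET, and verify flows it draws can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: the in-memory `Map`, the function names, the 3-dimensional vectors, and the `0.8` verify threshold are all hypothetical.

```javascript
// Hypothetical in-memory vector store: speaker name -> embedding (voiceprint).
// Names, vectors, and the threshold are illustrative assumptions only.
const store = new Map();

// REGISTER: store the embedding produced by the speaker model.
function register(name, embedding) {
  store.set(name, embedding);
}

// FORGET: delete a voiceprint; returns true if an entry was removed.
function forget(name) {
  return store.delete(name);
}

// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// IDENTIFY (1:N): score the probe against every entry, keep the top K.
function identify(probe, k = 1) {
  return [...store.entries()]
    .map(([name, vec]) => ({ name, score: cosine(probe, vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// VERIFY (1:1): compare the probe against the claimed speaker only.
function verify(probe, claimed, threshold = 0.8) {
  const vec = store.get(claimed);
  return vec !== undefined && cosine(probe, vec) >= threshold;
}

register("alice", [1, 0, 0]);
register("bob", [0, 1, 0]);
register("carol", [0.6, 0.8, 0]);
const matches = identify([1, 0, 0], 2); // best two candidates, highest score first
```

Both `identify` and `verify` read from the same `store`, which is the point the diagram's legend makes; they differ only in how many entries they compare against.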
docs/static/images/diagrams/vram-eviction.html (vendored, normal file, 197 lines added)
@@ -0,0 +1,197 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Bricolage+Grotesque:opsz,wght@12..96,600;12..96,700;12..96,800&family=Archivo:wght@500;600;700&display=swap" rel="stylesheet">
<style>
:root{
  --paper:#F3E8D2; --paper2:#ECDFC2; --ink:#211C14; --ink-soft:#5A5142;
  --rust:#B43A2C; --rust-deep:#8F2C20; --cold:#3F6E73; --hi:#E7D6AE; --dim:#A99F88;
}
*{box-sizing:border-box;margin:0;padding:0}
html,body{width:1600px;height:900px}
body{
  background:var(--paper);color:var(--ink);font-family:"Archivo",sans-serif;
  position:relative;overflow:hidden;
  background-image:
    linear-gradient(var(--paper2) 1px,transparent 1px),
    linear-gradient(90deg,var(--paper2) 1px,transparent 1px);
  background-size:40px 40px;
}
.frame{position:absolute;inset:26px;border:3px solid var(--ink);}
.wrap{position:absolute;inset:26px;padding:30px 56px 26px;display:flex;flex-direction:column}
header{display:flex;align-items:flex-end;justify-content:space-between;gap:30px}
.eyebrow{font-weight:700;letter-spacing:.22em;text-transform:uppercase;font-size:17px;color:var(--rust-deep)}
.eyebrow b{color:var(--ink)}
h1{font-family:"Bricolage Grotesque",sans-serif;font-weight:800;font-size:50px;line-height:.98;letter-spacing:-.015em;margin-top:6px}
h1 em{font-style:normal;color:var(--rust)}
.stamp{border:3px solid var(--ink);padding:10px 16px 8px;transform:rotate(3deg);text-align:center;background:var(--paper);box-shadow:6px 6px 0 var(--ink);flex:none}
.stamp .k{font-family:"Bricolage Grotesque";font-weight:800;font-size:21px;letter-spacing:.04em;line-height:1.05}
.stamp .s{font-weight:700;font-size:11px;letter-spacing:.18em;text-transform:uppercase;color:var(--ink-soft);margin-top:5px}
.stage{flex:1;margin-top:8px}
svg{width:100%;height:100%;overflow:visible}
footer{display:flex;align-items:center;justify-content:space-between;margin-top:6px;gap:24px}
.note{font-weight:600;font-size:18px;color:var(--ink-soft);line-height:1.3;max-width:1080px}
.note b{color:var(--ink)}
.url{font-family:"Bricolage Grotesque";font-weight:800;font-size:22px;color:var(--rust-deep);letter-spacing:.01em;flex:none}
.url span{color:var(--ink)}
</style>
</head>
<body>
<div class="frame"></div>
<div class="wrap">
  <header>
    <div>
      <div class="eyebrow">LocalAI <b>·</b> VRAM</div>
      <h1>Load, evict, <em>reuse</em></h1>
    </div>
    <div class="stamp">
      <div class="k">LRU</div>
      <div class="s">evict</div>
    </div>
  </header>
  <div class="stage"><svg viewBox="0 0 1480 560" id="svg"></svg></div>
  <footer>
    <div class="note">Least-recently-used eviction keeps the hottest models warm within your VRAM budget.</div>
    <div class="url">localai.io<span>/advanced/vram-management</span></div>
  </footer>
</div>
<script>
const INK="#211C14", PAPER="#F3E8D2", PAPER2="#ECDFC2", HI="#E7D6AE", SOFT="#5A5142", RUST="#B43A2C", RUSTD="#8F2C20", COLD="#3F6E73", DIM="#A99F88";
function el(t,a,x){const e=document.createElementNS("http://www.w3.org/2000/svg",t);for(const k in a)e.setAttribute(k,a[k]);if(x!=null)e.textContent=x;return e;}
const svg=document.getElementById("svg");
function shadowRect(x,y,w,h,fill,stroke,sw,dash){
  svg.appendChild(el("rect",{x:x+7,y:y+7,width:w,height:h,fill:INK}));
  svg.appendChild(el("rect",{x,y,width:w,height:h,fill,stroke:stroke||INK,"stroke-width":sw||3.5,"stroke-dasharray":dash||"none"}));
}
function txt(x,y,s,o){o=o||{};svg.appendChild(el("text",{x,y,"font-family":o.f||"Archivo","font-weight":o.w||700,"font-size":o.sz||15,"letter-spacing":o.ls||"0","text-anchor":o.a||"start",fill:o.fill||INK},s));}
function arrow(x1,y1,x2,y2,color,dash){
  const mx=(x1+x2)/2;
  svg.appendChild(el("path",{d:`M ${x1} ${y1} C ${mx} ${y1}, ${mx} ${y2}, ${x2-11} ${y2}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round","stroke-dasharray":dash||"none"}));
  const a=7;
  svg.appendChild(el("path",{d:`M ${x2-11} ${y2} l -${a+4} -${a} M ${x2-11} ${y2} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
}
// horizontal flat arrow between timeline steps
function flowArrow(x1,x2,y,color){
  svg.appendChild(el("path",{d:`M ${x1} ${y} L ${x2-11} ${y}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
  const a=7;
  svg.appendChild(el("path",{d:`M ${x2-11} ${y} l -${a+4} -${a} M ${x2-11} ${y} l -${a+4} ${a}`,fill:"none",stroke:color,"stroke-width":3.5,"stroke-linecap":"round"}));
}
// A 2-slot VRAM stack drawn at (x,y). slots[i] is a model label (string) or null for a free slot.
// opt.evict / opt.fresh give the index of the slot to highlight in rust / cold.
function vramStack(x,y,slots,opt){
  opt=opt||{};
  const sw=160, slotH=46, gap=10, padTop=8;
  const h=padTop*2 + slotH*2 + gap;
  // outer budget box
  shadowRect(x,y,sw,h,PAPER,INK,3.5);
  for(let i=0;i<2;i++){
    const sy=y+padTop+i*(slotH+gap);
    const filled=slots[i];
    const evicting = opt.evict===i;
    const fresh = opt.fresh===i;
    let fill = filled ? (evicting?RUST:(fresh?COLD:HI)) : PAPER2;
    svg.appendChild(el("rect",{x:x+8,y:sy,width:sw-16,height:slotH,fill,stroke:INK,"stroke-width":2.5,"stroke-dasharray":filled?"none":"3 6"}));
    if(filled){
      const lab=(evicting||fresh)?PAPER:INK;
      txt(x+sw/2,sy+30,slots[i],{f:"Bricolage Grotesque",w:800,sz:24,a:"middle",fill:lab});
    } else {
      txt(x+sw/2,sy+30,"free",{w:700,sz:14,a:"middle",fill:DIM});
    }
  }
  return {w:sw,h:h};
}
// ===================== LEFT PANEL =====================
const LX=20, LY=18, LW=700, LH=524;
shadowRect(LX,LY,LW,LH,PAPER,INK,4);
svg.appendChild(el("rect",{x:LX,y:LY,width:LW,height:58,fill:RUST}));
svg.appendChild(el("line",{x1:LX,y1:LY+58,x2:LX+LW,y2:LY+58,stroke:INK,"stroke-width":4}));
txt(LX+24,LY+38,"LRU eviction",{f:"Bricolage Grotesque",w:800,sz:27,fill:PAPER});
txt(LX+LW-24,LY+37,"max = 2 slots",{w:700,sz:15,ls:".04em",a:"end",fill:"#F1D9C8"});

// four timeline steps, each a VRAM stack + caption
const stackY=LY+130;
const sx=[LX+30, LX+205, LX+380, LX+555];
const steps=[
  {slots:["A",null], cap:["load A","slot 1 filled"]},
  {slots:["A","B"], cap:["load B","both full"]},
  {slots:["C","B"], evict:0, cap:["request C","evict LRU A → C"]},
  {slots:["C","B"], fresh:1, cap:["request B","refresh B"]},
];
steps.forEach((st,i)=>{
  vramStack(sx[i],stackY,st.slots,{evict:st.evict,fresh:st.fresh});
});
// connectors between stacks
for(let i=0;i<3;i++){
  flowArrow(sx[i]+160, sx[i+1], stackY+62, i===2?COLD:INK);
}
// captions under each
steps.forEach((st,i)=>{
  txt(sx[i]+80, stackY+158, st.cap[0], {f:"Bricolage Grotesque",w:800,sz:18,a:"middle"});
  txt(sx[i]+80, stackY+182, st.cap[1], {w:700,sz:13,a:"middle",fill:SOFT});
});
// time axis
txt(LX+30, stackY+232, "TIME",{w:700,sz:12,ls:".2em",fill:SOFT});
svg.appendChild(el("line",{x1:LX+90,y1:stackY+227,x2:LX+LW-30,y2:stackY+227,stroke:DIM,"stroke-width":2.5,"stroke-dasharray":"2 7"}));
// legend
const lgY=stackY+268;
function chip(cx,fill,dash){svg.appendChild(el("rect",{x:cx,y:lgY-13,width:20,height:18,fill,stroke:INK,"stroke-width":2,"stroke-dasharray":dash||"none"}));}
chip(LX+30,HI); txt(LX+58,lgY+2,"resident",{w:700,sz:13,fill:SOFT});
chip(LX+170,RUST); txt(LX+198,lgY+2,"evicted (LRU)",{w:700,sz:13,fill:SOFT});
chip(LX+330,COLD); txt(LX+358,lgY+2,"refreshed",{w:700,sz:13,fill:SOFT});
chip(LX+470,PAPER2,"3 6"); txt(LX+498,lgY+2,"free slot",{w:700,sz:13,fill:SOFT});
// ===================== RIGHT PANEL =====================
const RX=760, RY=18, RW=700, RH=524;
shadowRect(RX,RY,RW,RH,PAPER,INK,4);
svg.appendChild(el("rect",{x:RX,y:RY,width:RW,height:58,fill:COLD}));
svg.appendChild(el("line",{x1:RX,y1:RY+58,x2:RX+RW,y2:RY+58,stroke:INK,"stroke-width":4}));
txt(RX+24,RY+38,"Concurrency group anti-affinity",{f:"Bricolage Grotesque",w:800,sz:25,fill:PAPER});

// two GPU states: before / after
function gpuBox(x,y,w,h,title){
  shadowRect(x,y,w,h,PAPER,INK,3.5);
  svg.appendChild(el("rect",{x:x,y:y,width:w,height:34,fill:HI}));
  svg.appendChild(el("line",{x1:x,y1:y+34,x2:x+w,y2:y+34,stroke:INK,"stroke-width":2.5}));
  txt(x+14,y+24,title,{f:"Bricolage Grotesque",w:800,sz:17});
}
// model slot inside gpu
function modelSlot(x,y,w,name,grp,state){
  // state: keep | evict | new
  let fill = state==="evict"?RUST : state==="new"?COLD : HI;
  let lab = (state==="evict"||state==="new")?PAPER:INK;
  shadowRect(x,y,w,52,fill,INK,2.5);
  txt(x+16,y+25,name,{f:"Bricolage Grotesque",w:800,sz:19,fill:lab});
  txt(x+16,y+44,grp,{w:700,sz:12,fill:(state==="evict"||state==="new")?"#F1D9C8":SOFT});
}
const gpW=300, gpH=300, gpY=RY+118;
const gpBX=RX+30, gpAX=RX+RW-30-gpW;
gpuBox(gpBX,gpY,gpW,gpH,"before");
gpuBox(gpAX,gpY,gpW,gpH,"loading 120b-b");

// before: zed-predict + 120b-a coexist
modelSlot(gpBX+18,gpY+56,gpW-36,"zed-predict","group: tools","keep");
modelSlot(gpBX+18,gpY+128,gpW-36,"120b-a","group: chat","keep");
txt(gpBX+gpW/2,gpY+232,"different groups",{w:700,sz:14,a:"middle",fill:SOFT});
txt(gpBX+gpW/2,gpY+256,"→ both stay resident",{f:"Bricolage Grotesque",w:700,sz:17,a:"middle"});

// after: 120b-b evicts 120b-a, zed-predict stays
modelSlot(gpAX+18,gpY+56,gpW-36,"zed-predict","group: tools · kept","keep");
modelSlot(gpAX+18,gpY+128,gpW-36,"120b-b","group: chat · loaded","new");
txt(gpAX+gpW/2,gpY+232,"same group as 120b-a",{w:700,sz:14,a:"middle",fill:SOFT});
txt(gpAX+gpW/2,gpY+256,"→ 120b-a evicted",{f:"Bricolage Grotesque",w:700,sz:17,a:"middle",fill:RUSTD});

// arrow between the two gpu states
flowArrow(gpBX+gpW, gpAX, gpY+gpH/2, COLD);

// caption strip at bottom of right panel
const csY=RY+RH-58;
svg.appendChild(el("line",{x1:RX+24,y1:csY-18,x2:RX+RW-24,y2:csY-18,stroke:DIM,"stroke-width":2,"stroke-dasharray":"2 7"}));
txt(RX+RW/2, csY+6, "anti-affinity evicts only within the same concurrency group", {w:700,sz:14,a:"middle",fill:SOFT});
</script>
</body>
</html>
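Review note on the left panel of `vram-eviction.html`: the four timeline steps (load A, load B, request C, request B) follow ordinary least-recently-used bookkeeping with a budget of two slots. A minimal sketch of that bookkeeping follows, assuming a hypothetical `LruSlots` class; it is not taken from LocalAI's code and tracks only recency order, not VRAM sizes.

```javascript
// Minimal LRU sketch. `LruSlots` and `request` are hypothetical names used
// for illustration; they are not LocalAI APIs.
class LruSlots {
  constructor(maxSlots) {
    this.maxSlots = maxSlots;
    // A Map iterates in insertion order, so the first key is always the
    // least recently used model.
    this.order = new Map();
  }

  // Request a model; returns the name of the evicted model, or null.
  request(name) {
    let evicted = null;
    if (this.order.has(name)) {
      // Already resident: remove it so re-inserting marks it most recent.
      this.order.delete(name);
    } else if (this.order.size >= this.maxSlots) {
      // Budget is full: evict the least recently used entry.
      evicted = this.order.keys().next().value;
      this.order.delete(evicted);
    }
    this.order.set(name, true);
    return evicted;
  }

  // Resident models ordered from least to most recently used.
  resident() {
    return [...this.order.keys()];
  }
}

// Replay the diagram's timeline with max = 2 slots.
const vram = new LruSlots(2);
const timeline = ["A", "B", "C", "B"].map((model) => ({
  request: model,
  evicted: vram.request(model),
}));
```

Replaying the diagram's sequence, only the request for C evicts anything (A). The final request for B just refreshes its recency, so C becomes the next eviction candidate. Note that `resident()` reports recency order, which is not the same as the slot positions drawn in the diagram.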
BIN docs/static/images/diagrams/vram-eviction.png (vendored, normal file)
Size: 234 KiB
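Review note on the right panel of `vram-eviction.html`: the caption says anti-affinity evicts only within the same concurrency group. The sketch below mirrors just what the diagram depicts (loading `120b-b` displaces `120b-a`, while `zed-predict` in another group stays). The `loadWithAntiAffinity` function and the `{name, group}` record shape are assumptions made for illustration; the real scheduler may weigh other factors that the diagram does not show.

```javascript
// Hypothetical helper: load `incoming` and evict resident models that share
// its concurrency group. Models in other groups are left untouched.
function loadWithAntiAffinity(resident, incoming) {
  const evicted = resident.filter((m) => m.group === incoming.group);
  const kept = resident.filter((m) => m.group !== incoming.group);
  return { resident: [...kept, incoming], evicted };
}

// The "before" state from the diagram: two models in different groups.
const beforeLoad = [
  { name: "zed-predict", group: "tools" },
  { name: "120b-a", group: "chat" },
];

// Loading 120b-b, which is in the same group as 120b-a.
const afterLoad = loadWithAntiAffinity(beforeLoad, { name: "120b-b", group: "chat" });
const residentNames = afterLoad.resident.map((m) => m.name);
const evictedNames = afterLoad.evicted.map((m) => m.name);
```

The helper returns a new state instead of mutating its input, so the before and after states can be compared side by side, as the diagram does.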