Files
LocalAI/backend/go/cloud-proxy/main.go
Richard Palethorpe 6a80e23733 feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)
Add a routing middleware stack and a cloud-proxy backend.

* cloud-proxy: a Go gRPC backend that forwards OpenAI- and
  Anthropic-shaped chat requests to upstream providers, with an
  optional translate mode (OpenAI request -> Anthropic /v1/messages
  -> OpenAI response) and full tool-calling support.

* routing: admission control, content-aware model routing
  (embedding cache + classifier + rerank + Arch-Router score),
  PII detection/redaction (regex + NER) with streaming filter and
  OpenAI/Anthropic adapters, and a per-user/per-key billing recorder
  backed by GORM or in-memory storage.

* middleware: UsageMiddleware records usage via the billing recorder,
  plus admission, route-model, usage-stamp and trace middlewares.

* observability: BackendTrace ring buffer stores full request bodies
  (capped), MITM proxy emits structured trace events, and router
  classifier decisions surface at /api/router/decide.

* gallery: Arch-Router-1.5B (Q4_K_M and Q8_0).

* UI: cloud-proxy model-editor fields, classifier system-prompt and
  score-normalization config, and a Traces page rendering request
  bodies.

Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-05-25 09:28:27 +02:00

40 lines
1.3 KiB
Go
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
package main
// cloud-proxy is a LocalAI backend that forwards request traffic to an
// external HTTP provider (OpenAI, Anthropic, etc.). Two modes:
//
// - passthrough: serves the Forward RPC; the client wire format is
// preserved end-to-end, no translation.
// - translate: serves Predict/PredictStream; the backend converts
// internal proto to the provider's wire format. (Phases 56.)
//
// LoadModel reads UpstreamURL/Mode/Provider/key references from
// ProxyOptions and resolves the API key once at load time.
import (
"flag"
"os"
grpc "github.com/mudler/LocalAI/pkg/grpc"
"github.com/mudler/xlog"
"golang.org/x/term"
)
var addr = flag.String("addr", "localhost:50051", "the address to listen on")
func main() {
// xlog's default handler emits ANSI color codes; that's fine for an
// interactive shell but unreadable when the backend's stdout is
// captured by LocalAI and tee'd to a log file. Force plain text when
// LOCALAI_LOG_FORMAT is unset and stdout isn't a terminal.
format := os.Getenv("LOCALAI_LOG_FORMAT")
if format == "" && !term.IsTerminal(int(os.Stdout.Fd())) {
format = xlog.TextFormat
}
xlog.SetLogger(xlog.NewLogger(xlog.LogLevel(os.Getenv("LOCALAI_LOG_LEVEL")), format))
flag.Parse()
if err := grpc.StartServer(*addr, NewCloudProxy()); err != nil {
panic(err)
}
}