mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-30 11:36:31 -04:00
Add a routing middleware stack and a cloud-proxy backend. * cloud-proxy: a Go gRPC backend that forwards OpenAI- and Anthropic-shaped chat requests to upstream providers, with an optional translate mode (OpenAI request -> Anthropic /v1/messages -> OpenAI response) and full tool-calling support. * routing: admission control, content-aware model routing (embedding cache + classifier + rerank + Arch-Router score), PII detection/redaction (regex + NER) with streaming filter and OpenAI/Anthropic adapters, and a per-user/per-key billing recorder backed by GORM or in-memory storage. * middleware: UsageMiddleware records usage via the billing recorder, plus admission, route-model, usage-stamp and trace middlewares. * observability: BackendTrace ring buffer stores full request bodies (capped), MITM proxy emits structured trace events, and router classifier decisions surface at /api/router/decide. * gallery: Arch-Router-1.5B (Q4_K_M and Q8_0). * UI: cloud-proxy model-editor fields, classifier system-prompt and score-normalization config, and a Traces page rendering request bodies. Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Richard Palethorpe <io@richiejp.com>
160 lines
6.7 KiB
Markdown
160 lines
6.7 KiB
Markdown
+++
|
|
title = "MITM proxy for Claude Code / Codex CLI"
|
|
weight = 29
|
|
toc = true
|
|
description = "Redact PII from cloud-AI traffic without LocalAI holding API keys"
|
|
tags = ["Proxy", "MITM", "Privacy", "Routing", "Advanced"]
|
|
categories = ["Features"]
|
|
+++
|
|
|
|
LocalAI can act as a local HTTPS proxy that **redacts PII from your Claude
|
|
Code, OpenAI Codex CLI, or any HTTPS client** without holding their API keys.
|
|
The proxy intercepts only the LLM API endpoints you allowlist (default:
|
|
`api.anthropic.com`, `api.openai.com`); everything else — OAuth, telemetry,
|
|
package fetches — passes through as a plain TCP tunnel.
|
|
|
|
Use this when:
|
|
|
|
- You want to use **Claude Code with a Claude Pro/Max subscription** but still
|
|
apply the same PII redaction LocalAI applies to API-key traffic.
|
|
- You run Codex CLI on a corporate laptop and need an audit trail of prompts.
|
|
- You want LocalAI to enforce egress policies for AI traffic without
|
|
becoming the API endpoint clients talk to.
|
|
|
|
The proxy is **off by default**. Operators opt in by setting `--mitm-listen`
|
|
and distributing the generated CA cert.
|
|
|
|
## How it works
|
|
|
|
1. The proxy generates a private CA on first start (persisted to disk).
|
|
2. Clients set `HTTPS_PROXY=http://localai:port` and add the CA to their
|
|
trust store (e.g. `NODE_EXTRA_CA_CERTS` for Node-based CLIs like Claude
|
|
Code and Codex).
|
|
3. The CLI sends `CONNECT api.anthropic.com:443` to the proxy.
|
|
4. For allowlisted hosts, the proxy mints a per-host leaf cert signed by
|
|
the CA, terminates TLS, parses the HTTP request, applies the global
|
|
PII redactor on `/v1/messages` or `/v1/chat/completions`, and forwards
|
|
to the real upstream over its own TLS connection.
|
|
5. The streaming SSE response runs through the same `pii.StreamFilter`
|
|
the cloud-proxy backend uses.
|
|
6. For non-allowlisted hosts, the proxy is a plain CONNECT tunnel — no
|
|
TLS termination, no inspection, no CA trust required.
|
|
|
|
The CLI authenticates with its own subscription / API key as it normally
|
|
would. LocalAI never holds the credential — it just observes and rewrites
|
|
the request body.
|
|
|
|
## Quick start
|
|
|
|
Start LocalAI with the MITM listener:
|
|
|
|
```bash
|
|
local-ai run --mitm-listen :8443
|
|
```
|
|
|
|
The first start generates a CA at `<data-path>/mitm-ca/{ca.crt,ca.key}`.
|
|
Restarting reloads the same CA so clients keep trusting it.
|
|
|
|
Download the public CA cert:
|
|
|
|
```bash
|
|
curl -O http://localhost:8080/api/middleware/proxy-ca.crt
|
|
```
|
|
|
|
Configure Claude Code to use the proxy and trust the cert:
|
|
|
|
```bash
|
|
export HTTPS_PROXY=http://localhost:8443
|
|
export NODE_EXTRA_CA_CERTS=$(pwd)/proxy-ca.crt
|
|
claude
|
|
```
|
|
|
|
Now any `claude` chat session that touches `api.anthropic.com/v1/messages`
|
|
gets its prompts and tool inputs scanned by LocalAI's PII filter, and any
|
|
PII the model emits in its streaming response is masked before reaching
|
|
your terminal. Events appear in the LocalAI middleware admin page under
|
|
**Filtering → Recent events**.
|
|
|
|
The same works for Codex CLI — set `HTTPS_PROXY` and `NODE_EXTRA_CA_CERTS`
|
|
and run `codex`.
|
|
|
|
## Configuration
|
|
|
|
| Flag / env | Default | Purpose |
|
|
|---|---|---|
|
|
| `--mitm-listen` / `LOCALAI_MITM_LISTEN` | empty (disabled) | Address to bind the proxy listener on |
|
|
| `--mitm-ca-dir` / `LOCALAI_MITM_CA_DIR` | `<data-path>/mitm-ca` | Where to persist the CA cert + key |
|
|
| `--mitm-intercept-hosts` / `LOCALAI_MITM_INTERCEPT_HOSTS` | `api.anthropic.com,api.openai.com` | Hosts to terminate TLS for; everything else tunnels |
|
|
|
|
Hostnames are case-insensitive. Add custom upstreams (e.g. an
|
|
OpenAI-compatible third-party provider) by extending the allowlist and
|
|
ensuring their endpoint paths match `/v1/chat/completions` or
|
|
`/v1/messages`.
|
|
|
|
## What gets redacted
|
|
|
|
Same patterns the regular request middleware uses:
|
|
|
|
- Email addresses → masked
|
|
- Phone numbers → masked
|
|
- US Social Security Numbers → masked
|
|
- Credit card numbers (Luhn-verified) → masked
|
|
- IPv4 addresses → masked
|
|
- API key prefixes (`sk-`, `pk-`, `ghp_`, `github_pat_`, `xoxb-`) → **blocked**
|
|
|
|
A `block` action returns HTTP 400 with `error.type=pii_blocked` to the
|
|
client. The CLI sees the rejection and shows it as a request error.
|
|
|
|
Events are persisted via the same `pii.EventStore` the rest of LocalAI
|
|
uses, so the `/api/pii/events` endpoint and the middleware admin page
|
|
include MITM events alongside direct-API events.
|
|
|
|
## Security notes
|
|
|
|
- **The CA private key is the master credential.** Anyone with read
|
|
access to `<data-path>/mitm-ca/ca.key` can forge TLS for any host the
|
|
proxy could intercept. The file is mode 0600; keep it that way.
|
|
- The proxy listener accepts plaintext HTTP `CONNECT` requests — bind it
|
|
to localhost (`--mitm-listen 127.0.0.1:8443`) unless you've added auth
|
|
in front of the listener. There is no built-in API-key check on this
|
|
port.
|
|
- The MITM CA is **separate** from any TLS cert LocalAI's main HTTP API
|
|
uses. Installing the MITM CA grants trust only for traffic that flows
|
|
through this proxy.
|
|
- The proxy does not pin upstream certificates; it trusts the system
|
|
certificate store. If your machine's trust store is compromised, the
|
|
proxy is too.
|
|
- TLS termination negotiates HTTP/2 by default (ALPN `h2`) and falls
|
|
back to HTTP/1.1 for clients that don't speak h2. Modern CLIs (Claude
|
|
Code, Codex) and the Anthropic / OpenAI APIs all use h2.
|
|
|
|
## Limitations
|
|
|
|
- **Only `/v1/messages` and `/v1/chat/completions` get redacted.** Other
|
|
paths on the same host (OAuth, model listing) are forwarded verbatim.
|
|
- **No request-shape translation.** The proxy assumes the request body
|
|
matches the host's wire format; cross-shape forwarding is the cloud
|
|
proxy backend's job, not the MITM's.
|
|
- **No CA rotation in the MVP.** To rotate, delete `ca.key` and `ca.crt`
|
|
and re-distribute the new cert to every client.
|
|
- **Cert pinning kills MITM.** Neither Claude Code nor Codex CLI pins
|
|
certificates today, but a future SDK update could. If a CLI starts
|
|
refusing the proxied handshake, that's the signal.
|
|
|
|
## Comparison with the cloud-proxy backend
|
|
|
|
LocalAI ships two cloud-related proxy modes; pick by who holds the credential:
|
|
|
|
| | Cloud-proxy backend (`backend: proxy-*`) | MITM proxy (`--mitm-listen`) |
|
|
|---|---|---|
|
|
| Client config | `localai:8080` as **API endpoint** | `localai:8443` as **HTTPS_PROXY** |
|
|
| Holds API key | LocalAI | Client (CLI's own auth) |
|
|
| Works with subscription auth | No | Yes (CLI uses its own login) |
|
|
| Request rewriting | Yes (handler controls it) | Yes (selective per host+path) |
|
|
| CA cert distribution | Not needed | Required on every client |
|
|
| Routes through LocalAI's auth/usage tracking | Yes | Yes (per-correlation-id events) |
|
|
|
|
For shared deployments where LocalAI owns the API key and clients are
|
|
unsophisticated (curl, simple webapps), use the cloud-proxy backend. For
|
|
"give my Claude Code a privacy filter" use cases, use the MITM proxy.
|