feat(pii): NER tier engine — privacy-filter.cpp backend + NER-centric PII filter (#10360)

Squashed feat/pii-ner-tier-engine rebased onto master (was 45 commits; see
backup/pii-ner-tier-engine-prerebase). Net change:

- privacy-filter.cpp: standalone GGML engine for the openai-privacy-filter
  PII/NER token classifier, wired as a LocalAI gRPC backend (CPU/CUDA/Vulkan).
  TokenClassify moves off the patched llama.cpp path onto this backend.
- PII filter reworked to be NER-centric (encoder/NER detection tier scanning
  whole conversations as one document), with a recreated bounded restricted-
  regex secret-matching pattern detector tier alongside it (per-model
  pii_detection.builtins / .patterns + core/services/routing/piipattern).
- Detection labelled by source (ner vs pattern); backend trace / confidence /
  debug observability; analyze/redact exposed as a synchronous API.
- Instance-wide default detector policy + per-usecase default-on; request
  filtering extended to completions, embeddings, edits & Ollama.
- React UI: NER-centric PII editor, detector-models table, pattern/builtins
  editor, middleware default-policy UI.
- Gallery: privacy-filter-multilingual token-classify model + NER install
  filter; token_classify known_usecase; batch sized to context for NER models.
  privacy-filter backend registered in the backend gallery (cpu/vulkan/cuda-13
  meta + image entries with a capabilities map) matching its CI matrix jobs,
  and an /import-model auto-detect importer (PrivacyFilterImporter, narrow
  privacy-filter GGUF detection) replacing the prior pref-only registration.

Reconciled against master's independent evolution:

- Dropped master's PIIPatternOverrides feature (global-pattern runtime
  overrides + /api/pii/patterns API + runtime_settings.json persistence). The
  per-model NER + pattern-detector design supersedes it; it was built on the
  global redactor pattern set this branch replaced.
- Reverted the llama.cpp Score carry-patch (0006-server-task-type-score):
  removed the patch and restored master's grpc-server.cpp Score RPC (direct
  llama_decode, slot-loop bypass) and LLAMA_VERSION pin, plus master's
  model_config validation forbidding score + chat/completion/embeddings on
  llama-cpp. token_classify is unaffected (it runs on the privacy-filter
  backend, not llama-cpp).

Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
Richard Palethorpe
2026-06-18 11:45:22 +01:00
committed by GitHub
parent c133ca39dc
commit 3fa7b2955c
134 changed files with 6671 additions and 4223 deletions


@@ -12,9 +12,9 @@ categories = ["Features"]
LocalAI ships a request-middleware layer that sits between the HTTP API and
the backend dispatcher. Two subsystems share that layer because they share
the same lifecycle hook: **PII filtering** scans the request body before it
reaches a backend (and the SSE stream on the way out), and the **intelligent
router** rewrites `input.Model` so a single client-facing model name fans
out across multiple downstream targets.
reaches a backend, and the **intelligent router** rewrites `input.Model` so
a single client-facing model name fans out across multiple downstream
targets.
Both are inspected and configured from the same admin page
(`/app/middleware`), backed by the same REST surface (`/api/middleware/*`,
@@ -23,68 +23,136 @@ Both are inspected and configured from the same admin page
## Request lifecycle
```
client ── auth ── route-model ── per-model PII ── backend ── streaming PII ── client
└─── decision log └─── event log
client ── auth ── route-model ── per-model PII ── backend ── client
│ │
└─── event log
└─── decision log
```
The router runs first (it picks the target model so per-model PII has
something to gate on), per-model PII runs next (gated by the resolved
config), the backend executes, and the streaming PII filter rewrites the
SSE response in flight. Each subsystem writes to its own admin-visible
log: `/api/router/decisions` for routing, `/api/pii/events` for redaction
and block actions.
config), and the backend executes. Filtering is **request-side only**:
the request body is scanned and rewritten before forwarding; the response
is not touched (NER over a streamed response is left as a follow-up). Each
subsystem writes to its own admin-visible log: `/api/router/decisions` for
routing, `/api/pii/events` for redaction and block actions.
---
## PII filtering
PII redaction is **per-model and off by default**. The default flips to
**on for any backend whose name starts with `proxy-`** because that traffic
crosses the network to a third-party provider. Explicit `pii.enabled`
in a model's YAML always wins over the backend default.
PII redaction is **NER-based and runs request-side (input)**. It is
**off by default**, flipping to **on for any `cloud-proxy` backend**
because that traffic crosses the network to a third-party provider. Pick a
[default detector](#instance-wide-default-detector) so those models are actually
scanned. Explicit `pii.enabled` in a model's YAML always wins over the
backend default.
### Pattern catalog
Filtering runs on every text-accepting endpoint that has an adapter wired:
`/v1/chat/completions` and `/v1/messages` (chat), `/v1/completions`,
`/v1/embeddings`, `/v1/edits`, and the Ollama `/api/chat`, `/api/generate`
and `/api/embed` endpoints, plus the [MITM proxy]({{< relref "mitm-proxy.md" >}})
request body. Image, audio (TTS/STT), video, rerank, and the realtime
WebSocket are not filtered yet (different prompt-PII semantics; realtime is
not HTTP middleware).
The built-in regex tier ships six patterns. Each has a default action
(`mask`, `block`, or `allow`) and a length cap that prevents
pathological inputs from blowing up scanning time:
A request's messages are scanned **as one document** (joined in order), so
the NER detector keeps conversational context: whether `4421` is a PIN or
`jdoe_42` is a username is usually decided by the question asked in the
*previous* message, and a bidirectional encoder only sees that context when
the messages share a forward pass. Detected spans are mapped back to the
individual message they fall in, so redaction still rewrites each message
field in place and events carry message-local offsets.
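
The join-and-map-back step can be pictured with a minimal Python sketch. It is not LocalAI's implementation: `join_messages`, `to_message_local`, the `Span` type, and the blank-line separator are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass

SEPARATOR = "\n\n"  # assumption: the real joiner may use a different separator


@dataclass
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    group: str  # entity group reported by the detector


def join_messages(messages):
    """Return the joined document and each message's start offset within it."""
    starts, pos = [], 0
    for i, text in enumerate(messages):
        if i > 0:
            pos += len(SEPARATOR)
        starts.append(pos)
        pos += len(text)
    return SEPARATOR.join(messages), starts


def to_message_local(span, messages, starts):
    """Map a document-level span to (message index, message-local span)."""
    for i, (text, base) in enumerate(zip(messages, starts)):
        if base <= span.start and span.end <= base + len(text):
            return i, Span(span.start - base, span.end - base, span.group)
    return None  # the span straddles a separator; this sketch simply drops it


messages = ["What are the last four digits of your card?", "4421"]
document, starts = join_messages(messages)
hit = Span(starts[1], starts[1] + 4, "PIN")  # pretend the detector flagged "4421"
print(to_message_local(hit, messages, starts))
```

Keeping the per-message start offsets alongside the joined text is what lets a single forward pass over the whole conversation still produce rewrites and events that refer to one message field.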
| ID | Description | Default action | Max length |
|---|---|---|---|
| `email` | Email address | `mask` | 254 |
| `phone` | Phone number (international or US) | `mask` | 24 |
| `ssn` | US Social Security Number | `mask` | 11 |
| `credit_card` | Credit card number (Luhn-verified) | `mask` | 19 |
| `ipv4` | IPv4 address | `mask` | 15 |
| `api_key_prefix` | `sk-`, `pk-`, `xoxb-`, `ghp_`, `github_pat_` | **`block`** | 200 |
> The earlier regex pattern tier (`pii.patterns`, the built-in pattern
> catalogue, `--pii-config`, the `/api/pii/patterns|test|decide` endpoints)
> and response/streaming-side redaction have been **removed**. Detection is
> now driven entirely by `token_classify` detector models: neural NER
> models and in-process pattern matchers. Legacy keys are no-ops and log a
> startup warning.
`mask` rewrites the match to `[REDACTED:<id>]` in the request body before
forwarding. `block` returns HTTP 400 with `error.type=pii_blocked` to the
client without forwarding. `allow` detects and logs the match (a PIIEvent
is still recorded) but leaves the text unchanged — use it to downgrade a
pattern's default for a model while keeping it visible in the audit log.
It is also the foundation for surfacing detected-PII labels to the router,
a planned router-model feature.
### Detector models
### Per-model configuration
Add a `pii:` block to a model YAML to opt in (or out, or to override
per-pattern actions):
A **detector** is a `token_classify` model (e.g. an `openai-privacy-filter`
GGUF) that carries the detection *policy* in a top-level `pii_detection:`
block — defined once, on the model itself:
```yaml
# Local model — explicit opt-in so chats with this model get redaction
# applied request-side.
name: qwen-7b-local
backend: llama-cpp
pii:
enabled: true
name: privacy-filter-multilingual
backend: privacy-filter
embeddings: true # TOKEN_CLS pooling
known_usecases:
- token_classify
pii_detection:
min_score: 0.5 # drop detections below this confidence
default_action: mask # applied to any detected group with no entry
entity_actions: # which PII to block vs mask vs allow-log
PASSWORD: block
CREDITCARD: block
EMAIL: mask
```
`mask` rewrites the matched span to `[REDACTED:ner:<GROUP>]` in the request
body before forwarding. `block` returns HTTP 400 (`error.type=pii_blocked`)
without forwarding. `allow` detects and logs (a PIIEvent is still recorded)
but leaves the text unchanged. The entity-group names are whatever the model
emits (the privacy-filter family uses uppercase names like `EMAIL`,
`PASSWORD`, `CREDITCARD`).
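
As a rough illustration of how `min_score`, `default_action`, and `entity_actions` interact, here is a minimal Python sketch. The function `apply_policy`, the `Detection` type, and the `PIIBlocked` exception are hypothetical stand-ins, and the sketch assumes the detections do not overlap.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    group: str    # entity group emitted by the model, e.g. "EMAIL"
    score: float  # model confidence


class PIIBlocked(Exception):
    """Stands in for the HTTP 400 / error.type=pii_blocked response."""


def apply_policy(text, detections, min_score=0.5,
                 default_action="mask", entity_actions=None):
    entity_actions = entity_actions or {}
    # Drop low-confidence detections, then look up each group's action.
    kept = [d for d in detections if d.score >= min_score]
    decided = [(d, entity_actions.get(d.group, default_action)) for d in kept]
    if any(action == "block" for _, action in decided):
        raise PIIBlocked("request rejected before forwarding")
    # Rewrite right to left so earlier offsets stay valid.
    for d, action in sorted(decided, key=lambda p: p[0].start, reverse=True):
        if action == "mask":
            text = text[:d.start] + f"[REDACTED:ner:{d.group}]" + text[d.end:]
        # "allow" leaves the text unchanged; a real filter would still log it.
    return text


print(apply_policy("mail jane@acme.io now", [Detection(5, 17, "EMAIL", 0.9)]))
```

A group with no entry in `entity_actions` falls through to `default_action`, which is why a detector can list only the groups it wants to treat differently.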
### Pattern detector tier
NER is the wrong tool for high-entropy, highly-regular **secrets** — API keys,
tokens, private-key blocks. A trained NER model has no "API key" class, so it
fragments a key into the nearest categories it *does* know and can leave the
secret part exposed. Those secrets are exactly what a regex catches cheaply.
A **pattern detector** is a detector model (`backend: pattern`) that matches
secrets with a **restricted regex subset** compiled to Go's RE2 engine —
linear-time, no backtracking, no ReDoS. It runs entirely in-process: no model
download, no backend, zero VRAM. Install the gallery's **`secret-filter`** for a
ready-made set, or define your own:
```yaml
name: secret-filter
backend: pattern
known_usecases: [token_classify] # so it appears in the detector picker
pii_detection:
default_action: block # a leaked credential shouldn't leave
builtins: # built-in catalogue (enable by name)
- anthropic_api_key
- openai_api_key
- github_token
- aws_access_key
- private_key_block
patterns: # operator-defined, restricted subset
- name: INTERNAL_TOKEN
match: "tok-[A-Za-z0-9]{32,64}"
action: block # optional per-pattern override
min_len: 36 # optional length floor
```
A match is reported under its group (built-in group name, or the pattern
`name`), so `entity_actions` / `default_action` apply exactly as for NER.
**The restricted grammar** (validated at load — an invalid pattern is rejected,
not silently ignored):
- Allowed: literals, character classes `[…]` and `\w \d \s`, alternation,
anchors `^ $ \b`, and quantifiers `? * + {m,n}`.
- Rejected: `.` (any-char), capturing groups, and `{m,n}` bounds over 4096.
- **Required anchor**: every pattern must contain a fixed literal run of at
least 3 characters (e.g. `sk-ant-`, `ghp_`, `AKIA`). This admits real key
shapes but rejects open-ended ones — an email or a bare `\w+` has no such
anchor and belongs to the [NER tier](#detector-models).
Use both tiers together: reference an NER detector *and* a pattern detector in a
model's `pii.detectors` (or as instance defaults); their hits union, and a
`block` from either rejects the request.
### Consuming models
Any model opts in by enabling PII and referencing one or more detectors —
no per-consumer policy:
```yaml
# Cloud-bound model — defaults to enabled because backend is cloud-proxy.
# Tighten api_key_prefix from the global default and downgrade email to
# allow so emails are logged but pass through unchanged.
name: claude-strict
backend: cloud-proxy
proxy:
@@ -93,85 +161,139 @@ proxy:
upstream_url: https://api.anthropic.com/v1/messages
api_key_env: ANTHROPIC_API_KEY
pii:
patterns:
- id: api_key_prefix
action: block # already the default, made explicit for audit
- id: email
action: allow
enabled: true # default-on for cloud-proxy; explicit for audit
detectors:
- privacy-filter-multilingual
```
The regex itself stays global — only the action is settable per-model.
Adding new patterns is a build-time concern (extend `patternRegexps` in
`core/services/routing/pii/patterns.go`).
Multiple detectors **union** their detections; overlapping spans resolve to
the strongest action (`block` > `mask` > `allow`). A configured detector
that can't be loaded **fails the request closed** (HTTP 503,
`error.type=pii_ner_unavailable`) rather than silently skipping the check.
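
One way to picture the union and the strongest-action rule is the Python sketch below. It is an assumption-laden illustration, not the actual resolver: `resolve_overlaps` is a hypothetical name, and merging overlapping spans into a single span is a simplification chosen here.

```python
# Higher number means stronger action: block > mask > allow.
STRENGTH = {"allow": 0, "mask": 1, "block": 2}


def resolve_overlaps(hits):
    """Merge overlapping (start, end, action) hits from all detectors.

    Overlapping spans collapse into one span that carries the strongest
    action; spans that merely touch are kept separate.
    """
    merged = []
    for start, end, action in sorted(hits):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_action = merged[-1]
            stronger = max(prev_action, action, key=STRENGTH.__getitem__)
            merged[-1] = (prev_start, max(prev_end, end), stronger)
        else:
            merged.append((start, end, action))
    return merged


ner_hits = [(0, 10, "mask"), (20, 25, "allow")]
pattern_hits = [(5, 12, "block")]
print(resolve_overlaps(ner_hits + pattern_hits))
```

Because `block` outranks everything else, a single blocking hit from any detector is enough to reject the request, whichever tier produced it.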
The same NER path runs on the [MITM proxy]({{< relref "mitm-proxy.md" >}})
request body for intercepted hosts. Response/output redaction is out of
scope for now.
### NER tier (optional)
### Instance-wide default detector
The regex matcher covers high-precision patterns. For natural-language
PII (proper names, addresses, organization names) LocalAI carries an
**encoder NER tier** that runs after the regex pass. It expects a
transformers token-classification model wired through the `TokenClassify`
gRPC primitive (e.g. `dslim/bert-base-NER`). The detector annotates
spans with an entity group (`PER`, `LOC`, `ORG`, `MISC`); per-group
actions are configurable through the same `pii:` block.
The **Detector models** table on the Middleware → Filtering page lists every
`token_classify` detector model (neural NER models and in-process pattern
matchers alike) and exposes a per-row **Default** toggle. Toggling a detector
on adds it to the instance-wide default detector set — one or more models
applied to any PII-enabled model that names none of its own `pii.detectors`.
It is persisted through `POST /api/settings` and read live, so a change takes
effect on the next request without a restart. A default that names a model no
longer loaded still appears (marked *not loaded*) so it can be toggled off.
The NER tier ships as a contract (`NERDetector`, `NERConfig` in
`core/services/routing/pii/ner.go`); an operator-facing knob to load and
attach a detector is not plumbed yet. When no detector is configured the
regex tier still runs.
This is what makes `cloud-proxy` / MITM redaction work out of the box: those
backends default to PII-enabled but ship no detector list, so without a
default detector the filter has nothing to scan with. Set one here and
cloud-proxy traffic is scanned with no per-model config.
### Streaming PII filter
Resolution precedence (the single decision point is `ResolvePIIPolicy`,
shared by the chat middleware and the MITM listener so both agree):
Buffered (`/v1/chat/completions` without `"stream": true`) responses are
forwarded verbatim today — only the request-side scan runs. Streaming
responses run through `pii.StreamFilter` which buffers SSE chunks until
either a full pattern matches or the buffer's max length is reached,
then emits the safe prefix. The streaming filter is what makes the
cloud-proxy backend and the MITM proxy safe to expose to clients that
issue streaming requests.
1. An explicit `pii.enabled` on the model wins — `true` or `false`.
2. Otherwise PII is on if the backend defaults it on (`cloud-proxy`).
3. Detectors are the model's own `pii.detectors`; if it lists none, the
instance-wide default detector(s) are used.
The streaming filter is wired automatically for any model with `pii.enabled`
true — there is no separate streaming toggle.
A model that resolves enabled but ends up with no detector at all (a
cloud-proxy model with no model detectors and no instance default) scans
nothing — set a default detector to close that gap.
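
The precedence above can be sketched in a few lines of Python. This is only an illustration of the three rules, not the real `ResolvePIIPolicy`: the function name `resolve_pii_policy`, the dict-shaped model config, and the `default_on_backends` parameter are assumptions.

```python
def resolve_pii_policy(model, instance_defaults,
                       default_on_backends=frozenset({"cloud-proxy"})):
    """Return (enabled, detectors) for a model config given as a dict."""
    pii = model.get("pii") or {}
    explicit = pii.get("enabled")
    if explicit is not None:
        # Rule 1: an explicit pii.enabled always wins, true or false.
        enabled = bool(explicit)
    else:
        # Rule 2: otherwise fall back to the backend default.
        enabled = model.get("backend") in default_on_backends
    # Rule 3: the model's own detectors, else the instance-wide defaults.
    detectors = list(pii.get("detectors") or []) or list(instance_defaults)
    return enabled, detectors


claude = {"name": "claude-strict", "backend": "cloud-proxy"}
print(resolve_pii_policy(claude, ["privacy-filter-multilingual"]))
print(resolve_pii_policy(claude, []))  # enabled, but nothing to scan with
```

The second call shows the gap described above: the model resolves as enabled but ends up with an empty detector list until an instance default is set.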
### Admin page
The `/app/middleware` page (admin role only) has four tabs — **Filtering**,
**Routing**, **MITM Proxy** (see the [MITM doc]({{< relref "mitm-proxy.md" >}})),
and **Events**. The Filtering tab shows:
and **Events**. The Filtering tab has two tables. The **Detector models** table
lists every `token_classify` filter model, with the per-row Default toggle
described above, an edit link to each detector's config, and an *Add detector
model* button. The per-model table lists only the models PII can actually
apply to: chat / completion / embeddings / edit consumers and cloud-proxy
models, not VAD/STT/image models or the detector models themselves. Each row
reports the **effective** `enabled` state as an inline **toggle**; flipping it
writes an explicit `pii.enabled` to that model's YAML (a server-side
deep-merge that preserves `pii.detectors` and every other field), so a
cloud-proxy model shown on by backend default can be turned off, and vice
versa. The row also shows the resolved detector(s), with a *(default)* marker
when they come from the instance-wide default rather than the model's YAML,
why it is on (`YAML` / `backend default`), and the recent event count.
Detection *policy* (entity→action, min score) is still edited on each
detector model's config (Models → edit → PII), not globally.
- The pattern catalogue with live action dropdowns. Changing an action via
the UI calls `PUT /api/pii/patterns/:id` and updates the live redactor
in-process. Click **Persist** in the action header to write the current
state into `runtime_settings.json` so the next process start re-applies it.
- A per-model resolved-state table — each model row reports `enabled`,
the per-pattern overrides, and which patterns are effectively active.
- A live test panel that posts sample text to `/api/pii/test` and
highlights matches with their resolved actions, without storing the
text in the event log.
### Analyze / redact API
The same detection pipeline is also exposed as a standalone service, so a
client can scan or sanitise a string **without** routing a full chat request
through it (the inline path above). Two endpoints, both requiring a normal API
key (the `pii_filter` feature — not admin):
- `POST /api/pii/analyze` — detect only. Returns the matched entity spans
(`entity_type`, `source` `ner`|`pattern`, `start`/`end`, `score`, `action`)
and a `blocked` flag, **without modifying the text**.
- `POST /api/pii/redact` — apply the configured policy. Returns `redacted_text`
(with masked spans replaced by `[REDACTED:<id>]`) and `masked`; when a `block`
action fires it returns `400` with `type: pii_blocked` and the offending
entities — never a redacted body.
Both take the same request: `text` plus a detector selection — either explicit
detector model names in `detectors`, or a consuming `model` whose **effective**
policy is used: the model's own `pii.detectors`, else the
[instance-wide default detectors](#instance-wide-default-detector), exactly as
the inline filter resolves them. A `model` with PII disabled — or enabled but
with no detector anywhere — is a `400`: the inline filter would scan nothing
for it, and the API says so rather than implying a clean scan. The detection
policy lives on the detector models exactly as for the inline filter. The raw
matched value is never returned (an admin may pass `reveal: true` to include
the audit `hash_prefix`).
`text` is scanned as a single document. To reproduce the inline filter's
conversation-context behaviour for multi-message content, join the messages
with blank lines into one `text` — NER detection quality depends on that
context (a bare `4421` is nothing; after "what are the last four digits of
your card?" it is a PIN).
```bash
# Redact with an explicit pattern/NER detector
curl -sX POST http://localhost:8080/api/pii/redact \
-H "Authorization: Bearer $API_KEY" -H 'Content-Type: application/json' \
-d '{"text":"reach me at jane@acme.io","detectors":["my-ner-model"]}'
# => {"redacted_text":"reach me at [REDACTED:ner:EMAIL]","masked":true,...}
# Analyze using a consuming model's configured detectors
curl -sX POST http://localhost:8080/api/pii/analyze \
-H "Authorization: Bearer $API_KEY" -H 'Content-Type: application/json' \
-d '{"text":"sk-ant-api03-…","model":"gpt-4"}'
# => {"entities":[{"entity_type":"ANTHROPIC_KEY","source":"pattern",...,"action":"block"}],"blocked":true}
```
Calls are audited in the same event log, tagged with an `origin` of
`pii_analyze` / `pii_redact` (the inline filter records `middleware`, the MITM
proxy records `proxy`), so `GET /api/pii/events?origin=pii_redact` shows just
the redact-API rows.
### REST surface
| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | `/api/pii/patterns` | any | Live pattern list with current actions. Used by the UI catalogue. |
| POST | `/api/pii/test` | any | Dry-run the redactor on `{"text":"..."}`. Returns hits and the would-be-rewritten body. Does not write to the event log. |
| GET | `/api/pii/events` | admin | Recent middleware events — PII redactions, MITM connect/traffic, admission denials. Filterable by `correlation_id`, `user_id`, `pattern_id`, `kind`. |
| PUT | `/api/pii/patterns/:id` | admin | Update a pattern in-process. Body accepts `{"action":"mask"\|"block"\|"allow"}` and/or `{"disabled":true\|false}`. Transient — reverts on restart unless persisted. |
| POST | `/api/pii/patterns/persist` | admin | Snapshot the live per-pattern (action, disabled) state into `runtime_settings.json`. |
| GET | `/api/middleware/status` | admin | Aggregated dashboard data: patterns + per-model resolved state + router status + MITM status + admission status. One round-trip for the UI. |
| POST | `/api/pii/analyze` | api key (`pii_filter`) | Detect PII in a string; returns entity spans, no mutation. |
| POST | `/api/pii/redact` | api key (`pii_filter`) | Redact a string per policy; returns `redacted_text` or `400 pii_blocked`. |
| GET | `/api/pii/events` | admin | Recent middleware events — PII redactions, MITM connect/traffic, admission denials. Filterable by `correlation_id`, `user_id`, `pattern_id` (e.g. `ner:EMAIL`), `kind`, `origin`. |
| GET | `/api/middleware/status` | admin | Aggregated dashboard data: per-model PII state + detectors + router status + MITM status + admission status. One round-trip for the UI. |
### MCP tools
The same surface is mirrored through the LocalAI Assistant MCP server so
the in-process and stdio assistants can manage the filter conversationally:
The same surface is mirrored through the LocalAI Assistant MCP server:
| Tool | Read/Write | Purpose |
|---|---|---|
| `list_pii_patterns` | read | Returns the live pattern list. |
| `get_pii_events` | read | Recent redaction / block events with optional filters. |
| `test_pii_redaction` | read | Dry-run sample text without writing to the event log. |
| `get_middleware_status` | read | Aggregator — the same payload as `GET /api/middleware/status`. |
| `set_pii_pattern_action` | write | Update a pattern's action. Admin-only. |
| `persist_pii_patterns` | write | Snapshot live state to `runtime_settings.json`. Admin-only. |
Detection policy is part of a detector model's config, so it is managed
through the model-config tools (`edit_model_config`), not a dedicated PII
tool.
---
@@ -257,10 +379,11 @@ ChatML instruct model works under those constraints, but expect flatter
probability distributions which translate to a higher
`activation_threshold` to keep noise out of the active label set.
On llama-cpp, declare `known_usecases: [score]` on the classifier
model — LocalAI rejects configs that combine `score` with
`chat`/`completion`/`embeddings` there, because the Score RPC races
the `llama_context` against slot-loop traffic.
On llama-cpp, scoring rides the server's task queue alongside
generation and embeddings, so the classifier may share a model config
with `chat`/`completion`/`embeddings` — a dedicated scorer model is no
longer required. Repeated calls with the same prompt also reuse the
prompt's KV cache across candidates.
### The Colbert classifier