* fix(pii): post-merge review fixes + live NER e2e for the privacy-filter tier
Follow-up to the NER tier engine (#10360), already on master. This carries
only the incremental review fixes and tests that postdate that merge — the
feature itself is not re-introduced.
Review fixes:
- openai_completion.go: remove the dead `elem >= 0` conjunct in applyAnyText
(the `elem < 0` guard above already returns).
- application.go: collapse ResolvePIIPolicy's inline re-implementation of
PIIIsEnabled into a single cfg.PIIIsEnabled() call (now the sole source of the
"explicit pii.enabled wins, else cloud-proxy default" rule), and return a
literal true past the !enabled guard, where the value is provably true.
- pattern.go: hoist the triple `appConfig != nil && EnableTracing` check in
patternDetector.Detect into one local.
- grammar.go: MaxQuantifier was 4096, but Go's regexp/syntax rejects repeat
bounds above 1000 at Parse time, so walk()'s {n,m} guard could never fire —
dead code shadowed by the parser. Lower it to 512 so a bound in (512,1000]
is rejected here with an actionable error; >1000 still fails closed via
Parse. Specs pin the relationship so the guard can't silently revert.
- PatternListEditor.jsx: clamp a directly-typed negative min_len to >=0 and
force the DOM value back when clamping (min={0} only constrained the spinner,
so a negative reached saved config and silently disabled the length filter).
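For reviewers, a minimal sketch of the grammar.go relationship described above:
regexp/syntax rejects repeat bounds above 1000 at Parse time, so a local guard
only stays live if its limit is below that. The names maxQuantifier,
checkRepeats and validatePattern are illustrative assumptions, not the
identifiers used in grammar.go.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// maxQuantifier mirrors the 512 limit described above. The constant name and
// the helpers below are hypothetical, not the ones in grammar.go.
const maxQuantifier = 512

// checkRepeats walks the parsed tree and rejects any {n,m} bound above
// maxQuantifier. Max is -1 for an open-ended repeat such as {3,}.
func checkRepeats(re *syntax.Regexp) error {
	if re.Op == syntax.OpRepeat && (re.Min > maxQuantifier || re.Max > maxQuantifier) {
		return fmt.Errorf("repeat bound in %q exceeds %d; use a smaller {n,m}", re.String(), maxQuantifier)
	}
	for _, sub := range re.Sub {
		if err := checkRepeats(sub); err != nil {
			return err
		}
	}
	return nil
}

// validatePattern parses first (bounds above 1000 already fail here), then
// applies the stricter local guard.
func validatePattern(pattern string) error {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return fmt.Errorf("parse: %w", err)
	}
	return checkRepeats(re)
}

func main() {
	for _, p := range []string{`a{512}`, `a{600}`, `a{1001}`} {
		fmt.Printf("%-9s -> %v\n", p, validatePattern(p))
	}
}
```

With a limit of 4096 the checkRepeats branch could never be reached, because
Parse would already have returned an error for any bound that large.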
Tests:
- piipattern_test.go: MaxQuantifier guard specs (must stay live, not dead).
- model-config.spec.js: assert the min_len clamp, and that entity_actions
collapses a duplicate group to a single row (map semantics; regression guard
against emitting an array that drops a row on save).
- tests/e2e-backends: token_classify capability driving the TokenClassify gRPC
RPC against the backend image, asserting byte-correct, UTF-8 rune-aligned
spans (entity.Text == text[start:end]) at threshold 0. Verified on CPU via
`make test-extra-backend-privacy-filter` (3/3 specs).
- Makefile: test-extra-backend-privacy-filter wrapper.
- tests/e2e: e2e_pii_ner_test.go drives /api/pii/analyze + /api/pii/redact
(mask + block) through the full HTTP -> detector -> redactor path; gated on
PII_NER_MODEL_GGUF so the default suite is unaffected.
- .github/workflows/tests-pii-ner-e2e.yml: path-filtered / nightly CI job
running the container harness on CPU.
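A rough sketch of the span assertion the e2e-backends specs describe
(entity.Text == text[start:end], with both offsets on UTF-8 rune boundaries).
The Entity type and checkSpan helper are assumptions for illustration; the
real specs use the backend's own gRPC types.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Entity is a hypothetical stand-in for a TokenClassify result: Start and End
// are byte offsets into the input, Text is the matched surface form.
type Entity struct {
	Label      string
	Text       string
	Start, End int
}

// checkSpan returns nil when the span is in range, starts and ends on UTF-8
// rune boundaries, and reproduces e.Text; otherwise it explains the failure.
func checkSpan(text string, e Entity) error {
	if e.Start < 0 || e.End > len(text) || e.Start >= e.End {
		return fmt.Errorf("span [%d,%d) out of range for %d bytes", e.Start, e.End, len(text))
	}
	if !utf8.RuneStart(text[e.Start]) {
		return fmt.Errorf("start %d splits a rune", e.Start)
	}
	if e.End < len(text) && !utf8.RuneStart(text[e.End]) {
		return fmt.Errorf("end %d splits a rune", e.End)
	}
	if got := text[e.Start:e.End]; got != e.Text {
		return fmt.Errorf("text mismatch: span has %q, entity has %q", got, e.Text)
	}
	return nil
}

func main() {
	text := "Contact José at jose@example.com"
	good := Entity{Label: "NAME", Text: "José", Start: 8, End: 13}
	bad := Entity{Label: "NAME", Text: "Jos", Start: 8, End: 12} // ends inside "é"
	fmt.Println(checkSpan(text, good))
	fmt.Println(checkSpan(text, bad))
}
```

The multi-byte "é" is the interesting case: an offset counted in characters
rather than bytes lands inside the rune and fails the boundary check.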
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(gallery): add privacy-filter-nemotron (f16 + q8)
GGUF conversions of OpenMed/privacy-filter-nemotron — a fine-grained English
PII token-classifier (55 categories / 221 BIOES classes), fine-tuned from
openai/privacy-filter on NVIDIA's Nemotron-PII dataset. Sibling to the existing
privacy-filter-multilingual entry, trading language breadth for category depth.
- privacy-filter-nemotron: F16 reference artifact (~2.8 GB).
- privacy-filter-nemotron-q8: Q8_0 quant (~1.64 GB) for RAM-constrained / edge
use; the description notes the size/speed tradeoff and advises validating on
your own data (a single dropped span is a PII leak).
Both run on the privacy-filter backend with known_usecases [token_classify] and
a default mask policy (min_score 0.5); operators add per-category entity_actions
as needed. sha256s taken from the HF repo's LFS object ids.
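To make the default policy concrete, here is a small sketch of what "mask with
min_score 0.5" could mean for a list of detected spans. The Span type, the
"[LABEL]" placeholder format and the label names are assumptions for
illustration only; they are not taken from the redactor's actual output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Span is a hypothetical detection: byte offsets plus a label and a score.
type Span struct {
	Label      string
	Start, End int
	Score      float64
}

// mask replaces every in-range span scoring at least minScore with "[LABEL]".
// The placeholder format is an assumption; when spans overlap, the one that
// starts first is kept and the later one is skipped.
func mask(text string, spans []Span, minScore float64) string {
	kept := make([]Span, 0, len(spans))
	for _, s := range spans {
		if s.Score >= minScore && s.Start >= 0 && s.Start < s.End && s.End <= len(text) {
			kept = append(kept, s)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool { return kept[i].Start < kept[j].Start })

	var b strings.Builder
	pos := 0
	for _, s := range kept {
		if s.Start < pos {
			continue // overlaps a span that was already masked
		}
		b.WriteString(text[pos:s.Start])
		b.WriteString("[" + s.Label + "]")
		pos = s.End
	}
	b.WriteString(text[pos:])
	return b.String()
}

func main() {
	text := "Call Ana on 555-0100"
	spans := []Span{
		{Label: "first_name", Start: 5, End: 8, Score: 0.93},
		{Label: "phone_number", Start: 12, End: 20, Score: 0.41}, // below 0.5, left as-is
	}
	fmt.Println(mask(text, spans, 0.5))
}
```

The low-scoring phone number survives unmasked in this example, which is the
kind of dropped span the q8 description warns about when comparing quants.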
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Squashed feat/pii-ner-tier-engine rebased onto master (was 45 commits; see
backup/pii-ner-tier-engine-prerebase). Net change:
- privacy-filter.cpp: standalone GGML engine for the openai-privacy-filter
PII/NER token classifier, wired as a LocalAI gRPC backend (CPU/CUDA/Vulkan).
TokenClassify moves off the patched llama.cpp path onto this backend.
- PII filter reworked to be NER-centric (an encoder/NER detection tier that
scans whole conversations as one document), alongside a recreated
pattern-detector tier that matches secrets with bounded, restricted regexes
(per-model pii_detection.builtins / .patterns + core/services/routing/piipattern).
- Detection labelled by source (ner vs pattern); backend trace / confidence /
debug observability; analyze/redact exposed as a synchronous API.
- Instance-wide default detector policy + per-usecase default-on; request
filtering extended to completions, embeddings, edits & Ollama.
- React UI: NER-centric PII editor, detector-models table, pattern/builtins
editor, middleware default-policy UI.
- Gallery: privacy-filter-multilingual token-classify model + NER install
filter; token_classify known_usecase; batch sized to context for NER models.
privacy-filter backend registered in the backend gallery (cpu/vulkan/cuda-13
meta + image entries with a capabilities map) matching its CI matrix jobs,
and an /import-model auto-detect importer (PrivacyFilterImporter, narrow
privacy-filter GGUF detection) replacing the prior pref-only registration.
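One way to picture "scanning whole conversations as one document": join the
messages, remember where each one starts, and map detector spans back to the
message they came from. This is a sketch under assumptions (newline separator,
byte offsets, hypothetical names joined / joinMessages / locate), not the
branch's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// joined is a conversation flattened into one document, with the byte offset
// at which each message starts. All names here are hypothetical.
type joined struct {
	doc    string
	starts []int
}

// joinMessages concatenates messages with a newline separator so a detector
// can scan the conversation in a single pass.
func joinMessages(msgs []string) joined {
	var b strings.Builder
	starts := make([]int, len(msgs))
	for i, m := range msgs {
		if i > 0 {
			b.WriteByte('\n')
		}
		starts[i] = b.Len()
		b.WriteString(m)
	}
	return joined{doc: b.String(), starts: starts}
}

// locate maps a document span back to (message index, local start, local end).
// ok is false when the span is out of range or crosses a message boundary.
func (j joined) locate(msgs []string, start, end int) (idx, ls, le int, ok bool) {
	for i := len(j.starts) - 1; i >= 0; i-- {
		if start >= j.starts[i] {
			ls, le = start-j.starts[i], end-j.starts[i]
			return i, ls, le, ls < le && le <= len(msgs[i])
		}
	}
	return 0, 0, 0, false
}

func main() {
	msgs := []string{"My name is Ana.", "Reach me at ana@example.com"}
	j := joinMessages(msgs)
	start := strings.Index(j.doc, "ana@example.com")
	if idx, ls, le, ok := j.locate(msgs, start, start+len("ana@example.com")); ok {
		fmt.Println(idx, msgs[idx][ls:le])
	}
}
```

Spans that straddle a separator are reported as not ok, so a caller can drop
or split them instead of redacting across two messages.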
Reconciled against master's independent evolution:
- Dropped master's PIIPatternOverrides feature (global-pattern runtime
overrides + /api/pii/patterns API + runtime_settings.json persistence). The
per-model NER + pattern-detector design supersedes it; it was built on the
global redactor pattern set this branch replaced.
- Reverted the llama.cpp Score carry-patch (0006-server-task-type-score):
removed the patch and restored master's grpc-server.cpp Score RPC (direct
llama_decode, slot-loop bypass) and LLAMA_VERSION pin, plus master's
model_config validation forbidding score + chat/completion/embeddings on
llama-cpp. token_classify is unaffected (it runs on the privacy-filter
backend, not llama-cpp).
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>