chore(deps): update numpy requirement in /backend/python/transformers

Updates the requirements on [numpy](https://github.com/numpy/numpy) to permit the latest version.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst)
- [Commits](https://github.com/numpy/numpy/compare/v2.0.0...v2.4.6)

---
updated-dependencies:
- dependency-name: numpy
  dependency-version: 2.4.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
feat: add flake.nix for dockerless setup (#9851)
2026-05-20 06:35:41 -04:00 · 2026-05-19 03:51:23 +00:00 · 2026-05-18 15:23:10 +01:00 · 2026-05-18 08:02:20 +02:00 · 2026-05-18 08:01:30 +02:00 · 2026-05-17 23:20:16 +02:00
86 changed files with 3186 additions and 323 deletions
--- a/.agents/adding-backends.md
+++ b/.agents/adding-backends.md
@@ -112,6 +112,8 @@ Add a YAML anchor definition in the `## metas` section (around line 2-300). Look

 Add image entries at the end of the file, following the pattern of similar backends such as `diffusers` or `chatterbox`. Include both `latest` (production) and `master` (development) tags.

+**Note on integrity:** OCI backends installed from a gallery whose `verification:` block is set are verified against a keyless-cosign policy before extraction; tarball/HTTP backends use the optional `sha256:` field. New backends do not need any extra YAML — the gallery-level `verification:` block covers every entry. See [.agents/backend-signing.md](backend-signing.md) for the producer-side CI step.
+
 ## 4. Update the Makefile

 The Makefile needs to be updated in several places to support building and testing the new backend:
--- a/.agents/backend-signing.md
+++ b/.agents/backend-signing.md
@@ -0,0 +1,120 @@
+# Backend image signing & verification
+
+LocalAI verifies backend OCI images against a per-gallery keyless-cosign
+policy. This page documents the trust model, the producer side
+(`.github/workflows/backend_merge.yml` in this repo), and the consumer
+side (`pkg/oci/cosignverify` plus the gallery YAML).
+
+## Trust model
+
+- **Producer:** `.github/workflows/backend_merge.yml` signs each pushed
+  manifest list with `cosign sign --recursive` in keyless mode after
+  `docker buildx imagetools create`. The signing cert is issued by
+  Fulcio bound to the workflow's OIDC identity. There is no long-lived
+  signing key. `--recursive` signs both the manifest list and every
+  per-arch entry — needed because our consumer resolves a tag to a
+  per-arch manifest before checking signatures.
+- **Storage:** Signatures are written as OCI 1.1 referrers
+  (`--registry-referrers-mode=oci-1-1`) in the new Sigstore bundle format
+  (`--new-bundle-format`). No `:sha256-<hex>.sig` tag clutter.
+- **Consumer:** `pkg/oci/cosignverify` discovers the bundle via the
+  referrers API, hands it to `sigstore-go`, and verifies it against the
+  policy declared in the gallery YAML (`Gallery.Verification`).
+- **Revocation:** Keyless cosign certs are ephemeral (10-minute Fulcio
+  validity), so revocation is policy-side, not CA-side. The gallery's
+  `verification.not_before` (RFC3339) is the kill-switch — advance it to
+  invalidate every signature produced before a known compromise window.
+
+## Producer setup
+
+`backend_merge.yml` is the workflow that joins per-arch digests into the
+multi-arch manifest list users actually pull, so it's also the right place
+to sign. The job needs:
+
+- `permissions: { id-token: write, contents: read }` at the job level so
+  the runner can exchange its GitHub OIDC token for a Fulcio cert.
+- `sigstore/cosign-installer@v3` step (cosign ≥ 2.2 for
+  `--new-bundle-format`).
+- After each `docker buildx imagetools create`, resolve the resulting
+  list digest with `docker buildx imagetools inspect <tag> --format
+  '{{.Manifest.Digest}}'` and sign:
+
+```sh
+cosign sign --yes --recursive \
+  --new-bundle-format \
+  --registry-referrers-mode=oci-1-1 \
+  "${REGISTRY_REPO}@${DIGEST}"
+```
+
+Sign by digest, never by tag — signing by tag binds the signature to
+whatever the tag points at *now*, and a subsequent tag push orphans it.
+
+`backend_build_darwin.yml` builds and pushes single-arch darwin images
+that bypass the manifest-list merge. If/when those entries get a gallery
+`verification:` policy, the equivalent cosign step has to land there
+too.
+
+## Consumer setup (in `mudler/LocalAI` gallery YAML)
+
+Once CI is signing, add a `verification:` block to the backend gallery
+entry (`backend/index.yaml`):
+
+```yaml
+- name: localai
+  url: github:mudler/LocalAI/backend/index.yaml@master
+  verification:
+    issuer: "https://token.actions.githubusercontent.com"
+    identity_regex: "^https://github\\.com/mudler/LocalAI/\\.github/workflows/backend_merge\\.yml@refs/heads/master$"
+    # Optional revocation cutoff; advance during incident response.
+    # not_before: "2026-06-01T00:00:00Z"
+```
+
+Identity matching pins the OIDC subject Fulcio issued the signing cert
+to. Without this, any image signed by *anyone* with a Fulcio cert would
+pass — the regex is what makes a signature mean "produced by our CI".
+
+## Strict mode
+
+Default behaviour: OCI backends without a `verification:` block install
+with a warning (logs include `installing OCI backend without signature
+verification`). Tarball/HTTP backends without a `sha256` field log a
+similar warning.
+
+For production, set `LOCALAI_REQUIRE_BACKEND_INTEGRITY=1` (or pass
+`--require-backend-integrity` to `local-ai run` / `local-ai backends
+install` / `local-ai models install`). The warning becomes a hard error
+and unverifiable backends refuse to install.
+
+## Revocation playbook
+
+If `backend_merge.yml` (or any workflow with `id-token: write`) is
+compromised and we've shipped malicious signed images:
+
+1. **Identify the compromise window.** Find the earliest and latest
+   IntegratedTime among the bad signatures (Rekor search by `subject` filter).
+2. **Set `verification.not_before`** in `backend/index.yaml` to a
+   timestamp just *after* that window's end, so every bad signature predates it.
+3. **Push the YAML.** Deployed LocalAI instances pick it up on next
+   gallery refresh (1-hour cache in `core/gallery/gallery.go`).
+4. **Fix the underlying compromise** in the workflow and re-sign images
+   with the new build, which will have IntegratedTime > `not_before`.
+5. **Optional:** for absolute decisiveness, also rotate to a new
+   workflow path (`backend_merge_v2.yml`) and update `identity_regex`.
+
+## Where the code lives
+
+- `pkg/oci/cosignverify/` — verifier, policy, OCI referrer fetch, NotBefore enforcement.
+- `pkg/downloader/uri.go` — `WithImageVerifier` option threaded through `DownloadFileWithContext`.
+- `core/gallery/backends.go` — `backendDownloadOptions` builds the verifier from the gallery's policy.
+- `core/config/gallery.go` — `Gallery.Verification` YAML schema.
+- `core/cli/run.go`, `core/cli/backends.go`, `core/cli/models.go` — `--require-backend-integrity` flag propagation.
+- `.github/workflows/backend_merge.yml` — producer-side `cosign sign --recursive` after each multi-arch manifest list push.
+
+## Out of scope (follow-ups)
+
+- **Signing the gallery YAML itself.** The index is fetched over HTTPS
+  from GitHub; we trust the host. A cosign blob signature on the YAML
+  would close that gap but adds key-management overhead. Revisit this
+  page if/when added.
+- **Tarball/HTTP backend signing.** Cosign can sign arbitrary blobs, but
+  for now non-OCI backends keep using the `sha256:` field in YAML.
--- a/.agents/llama-cpp-backend.md
+++ b/.agents/llama-cpp-backend.md
@@ -61,6 +61,12 @@ Always check `llama.cpp` for new model configuration options that should be supp
   - `reasoning_format` - Reasoning format options
   - Any new flags or parameters

+### Speculative Decoding Types
+
+The `spec_type` option in `grpc-server.cpp` delegates to upstream's `common_speculative_types_from_names()`, so new speculative types added to the `common_speculative_type_from_name` map in `common/speculative.cpp` are picked up automatically with no code changes - only docs need an entry in `docs/content/advanced/model-configuration.md`. Current values: `none`, `draft-simple`, `draft-eagle3`, `draft-mtp`, `ngram-simple`, `ngram-map-k`, `ngram-map-k4v`, `ngram-mod`, `ngram-cache`.
+
+`draft-mtp` (Multi-Token Prediction, [ggml-org/llama.cpp#22673](https://github.com/ggml-org/llama.cpp/pull/22673)) does not need a separate draft GGUF: when `spec_type` includes `draft-mtp` and `draftmodel` is empty, the upstream server creates an MTP context off the target model itself. LocalAI's gRPC layer needs no changes for this — it works through the existing `params.speculative.types` plumbing and the derived `cparams.n_rs_seq = params.speculative.need_n_rs_seq()` in `common_context_params_to_llama`.
+
 ### Implementation Guidelines

 1. **Feature Parity**: Always aim for feature parity with llama.cpp's implementation
--- a/.github/workflows/backend_merge.yml
+++ b/.github/workflows/backend_merge.yml
@@ -31,6 +31,13 @@ on:
 jobs:
  merge:
    runs-on: ubuntu-latest
+    # id-token: write is required for keyless cosign — the workflow
+    # exchanges the GitHub OIDC token for a short-lived Fulcio cert that
+    # signs each pushed manifest. Without this permission the runner
+    # cannot mint the token, and `cosign sign` fails with "no token".
+    permissions:
+      contents: read
+      id-token: write
    env:
      quay_username: ${{ secrets.quayUsername }}
    steps:
@@ -57,6 +64,15 @@ jobs:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@master

+      # cosign signs each pushed manifest list with --recursive so the
+      # index and every per-arch entry get an attached Sigstore bundle.
+      # 2.2+ is required for --new-bundle-format.
+      - name: Install cosign
+        if: github.event_name != 'pull_request'
+        uses: sigstore/cosign-installer@v3
+        with:
+          cosign-release: 'v2.4.1'
+
      - name: Login to DockerHub
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v4
@@ -120,11 +136,26 @@ jobs:
          ' <<< "$DOCKER_METADATA_OUTPUT_JSON")
          if [ -z "$tags" ]; then
            echo "No quay.io tags from docker/metadata-action; skipping quay merge"
-          else
-            # shellcheck disable=SC2086
-            docker buildx imagetools create $tags \
-              $(printf 'quay.io/go-skynet/ci-cache@sha256:%s ' *)
+            exit 0
          fi
+          # shellcheck disable=SC2086
+          docker buildx imagetools create $tags \
+            $(printf 'quay.io/go-skynet/ci-cache@sha256:%s ' *)
+          # Resolve the manifest-list digest (any tag points at it) so
+          # cosign can sign by digest. Signing by tag would leave the
+          # signature orphaned the next time the tag moves.
+          first_tag=$(jq -cr '
+            .tags | map(select(startswith("quay.io/"))) | .[0]
+          ' <<< "$DOCKER_METADATA_OUTPUT_JSON")
+          digest=$(docker buildx imagetools inspect "$first_tag" --format '{{.Manifest.Digest}}')
+          # --recursive walks the list and signs every per-arch entry
+          # too — clients that resolve a tag to a platform-specific
+          # manifest before checking signatures need the per-arch
+          # signatures, not just the list-level one.
+          cosign sign --yes --recursive \
+            --new-bundle-format \
+            --registry-referrers-mode=oci-1-1 \
+            "quay.io/go-skynet/local-ai-backends@${digest}"

      - name: Create manifest list and push (dockerhub)
        if: github.event_name != 'pull_request'
@@ -139,11 +170,19 @@ jobs:
          ' <<< "$DOCKER_METADATA_OUTPUT_JSON")
          if [ -z "$tags" ]; then
            echo "No dockerhub tags from docker/metadata-action; skipping dockerhub merge"
-          else
-            # shellcheck disable=SC2086
-            docker buildx imagetools create $tags \
-              $(printf 'localai/localai-backends@sha256:%s ' *)
+            exit 0
          fi
+          # shellcheck disable=SC2086
+          docker buildx imagetools create $tags \
+            $(printf 'localai/localai-backends@sha256:%s ' *)
+          first_tag=$(jq -cr '
+            .tags | map(select(startswith("localai/"))) | .[0]
+          ' <<< "$DOCKER_METADATA_OUTPUT_JSON")
+          digest=$(docker buildx imagetools inspect "$first_tag" --format '{{.Manifest.Digest}}')
+          cosign sign --yes --recursive \
+            --new-bundle-format \
+            --registry-referrers-mode=oci-1-1 \
+            "localai/localai-backends@${digest}"

      - name: Inspect manifest
        if: github.event_name != 'pull_request'
--- a/.golangci.yml
+++ b/.golangci.yml
@@ -46,8 +46,52 @@ linters:
          msg: 'LocalAI tests must use Ginkgo/Gomega; use Fail(...) instead of t.Fail. See .agents/coding-style.md.'
        - pattern: '^t\.FailNow$'
          msg: 'LocalAI tests must use Ginkgo/Gomega; use Fail(...) instead of t.FailNow. See .agents/coding-style.md.'
+        # In-process config should flow through ApplicationConfig / kong-bound
+        # CLI flags, not via os.Getenv. The CLI layer is the legitimate
+        # env→struct boundary (kong's `env:"..."` tag); anything deeper that
+        # reads env directly leaks process state into business logic and
+        # makes flags impossible to test or override per-request. Backend
+        # subprocesses, the system/capabilities probe, and a few places that
+        # read non-LocalAI env vars (HOME, PATH, AUTH_TOKEN passed by parent)
+        # are exempt — see linters.exclusions.rules below.
+        - pattern: '^os\.(Getenv|LookupEnv|Environ)$'
+          msg: 'Plumb config through ApplicationConfig (or the relevant CLI struct) instead of reading env directly. CLI entry points (core/cli/) bind env vars via kong''s `env:` tag — that is the only sanctioned env→struct boundary. See .agents/coding-style.md.'
  exclusions:
    paths:
      # Upstream whisper.cpp source tree fetched by the whisper backend Makefile.
      - 'backend/go/whisper/sources'
      - 'docs/'
+    rules:
+      # CLI entry points: kong's `env:"..."` tag is the legitimate env→struct
+      # boundary, and a handful of subcommands legitimately propagate values
+      # to spawned subprocesses (LLAMACPP_GRPC_SERVERS, MLX hostfile, ...).
+      - path: ^core/cli/
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
+      # Backend subprocesses are independent binaries with their own env
+      # surface; they're not "in-process config" of the LocalAI server.
+      - path: ^backend/
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
+      # System capability probe reads HOME, PATH-style vars to discover
+      # GPUs, default paths, etc. — not LocalAI config.
+      - path: ^pkg/system/
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
+      # gRPC server reads AUTH_TOKEN passed in by the parent process at spawn
+      # time; model.Loader sets/inherits env to communicate with subprocesses.
+      - path: ^pkg/grpc/
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
+      - path: ^pkg/model/
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
+      # Top-level main binaries (local-ai, launcher) are entry points.
+      - path: ^cmd/
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
+      # Tests legitimately read $HOME, $TMPDIR, and gating env vars
+      # (LOCALAI_COSIGN_LIVE, etc.) to skip live-network specs.
+      - path: _test\.go$
+        text: 'os\.(Getenv|LookupEnv|Environ)'
+        linters: [forbidigo]
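
The intent of the `forbidigo` rule above can be sketched with the strict-mode flag from this same change. In the sketch below the decision function receives `RequireBackendIntegrity` through a config struct instead of calling `os.Getenv("LOCALAI_REQUIRE_BACKEND_INTEGRITY")` itself, which keeps it testable without touching process state. `AppConfig` and `installDecision` are hypothetical stand-ins, not the real `ApplicationConfig` or gallery code. In the actual CLI layer the field would presumably be filled from a kong `env:` tag.

```go
package main

import (
	"errors"
	"fmt"
)

// AppConfig stands in for LocalAI's ApplicationConfig; only the field
// relevant to this sketch is shown.
type AppConfig struct {
	RequireBackendIntegrity bool
}

// installDecision reports whether a backend may be installed. The
// verifiable argument means the backend has a signature policy or a
// sha256 to check against. No environment variable is read here: the
// caller passes configuration in explicitly.
func installDecision(cfg AppConfig, verifiable bool) error {
	if verifiable {
		return nil
	}
	if cfg.RequireBackendIntegrity {
		return errors.New("backend cannot be verified; refusing to install")
	}
	fmt.Println("warning: installing backend without integrity verification")
	return nil
}

func main() {
	strict := AppConfig{RequireBackendIntegrity: true}
	if err := installDecision(strict, false); err != nil {
		fmt.Println("strict mode:", err)
	}
	if err := installDecision(AppConfig{}, false); err == nil {
		fmt.Println("default mode: install proceeds")
	}
}
```

Because the flag arrives as a parameter, a test can exercise both modes in the same process, which is not possible when the function reads the environment directly.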
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -31,6 +31,7 @@ LocalAI follows the Linux kernel project's [guidelines for AI coding assistants]
 | [.agents/debugging-backends.md](.agents/debugging-backends.md) | Debugging runtime backend failures, dependency conflicts, rebuilding backends |
 | [.agents/adding-gallery-models.md](.agents/adding-gallery-models.md) | Adding GGUF models from HuggingFace to the model gallery |
 | [.agents/localai-assistant-mcp.md](.agents/localai-assistant-mcp.md) | LocalAI Assistant chat modality — adding admin tools to the in-process MCP server, editing skill prompts, keeping REST + MCP + skills in sync |
+| [.agents/backend-signing.md](.agents/backend-signing.md) | Backend OCI image signing (keyless cosign + sigstore-go) — producer-side CI setup, consumer-side gallery `verification:` block, strict mode (`LOCALAI_REQUIRE_BACKEND_INTEGRITY`), revocation via `not_before` |

 ## Quick Reference

--- a/backend/cpp/ds4/Makefile
+++ b/backend/cpp/ds4/Makefile
@@ -1,10 +1,10 @@
 # ds4 backend Makefile.
 #
-# Upstream pin lives below as DS4_VERSION?=0cba357ca1bc0e7510421cc26888e420ea942123
+# Upstream pin lives below as DS4_VERSION?=c9dd9499bfa57c1bbfbb4446eff963330ab5329b
 # (.github/bump_deps.sh) can find and update it - matches the
 # llama-cpp / ik-llama-cpp / turboquant convention.

-DS4_VERSION?=0cba357ca1bc0e7510421cc26888e420ea942123
+DS4_VERSION?=c9dd9499bfa57c1bbfbb4446eff963330ab5329b
 DS4_REPO?=https://github.com/antirez/ds4

 CURRENT_MAKEFILE_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))
--- a/backend/cpp/ik-llama-cpp/Makefile
+++ b/backend/cpp/ik-llama-cpp/Makefile
@@ -1,5 +1,5 @@

-IK_LLAMA_VERSION?=949bb8f1d660fc1264c137a6f3dbd619375f6134
+IK_LLAMA_VERSION?=c35189d83c91aad780aba62b89f2830cb2916223
 LLAMA_REPO?=https://github.com/ikawrakow/ik_llama.cpp

 CMAKE_ARGS?=
--- a/backend/cpp/llama-cpp/Makefile
+++ b/backend/cpp/llama-cpp/Makefile
@@ -1,5 +1,5 @@

-LLAMA_VERSION?=a9883db8ee021cf16783016a60996d41820b5195
+LLAMA_VERSION?=87589042cac2c390cec8d68fb2fad64e0a2a252a
 LLAMA_REPO?=https://github.com/ggerganov/llama.cpp

 CMAKE_ARGS?=
--- a/backend/cpp/llama-cpp/grpc-server.cpp
+++ b/backend/cpp/llama-cpp/grpc-server.cpp
@@ -32,6 +32,7 @@
 #include <grpcpp/health_check_service_interface.h>
 #include <grpcpp/security/server_credentials.h>
 #include <regex>
+#include <algorithm>
 #include <atomic>
 #include <cstdlib>
 #include <fstream>
@@ -450,6 +451,8 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
        // vector; the turboquant fork still uses the legacy scalar. The
        // LOCALAI_LEGACY_LLAMA_CPP_SPEC macro is injected by
        // backend/cpp/turboquant/patch-grpc-server.sh for fork builds only.
+        // Upstream renamed COMMON_SPECULATIVE_TYPE_DRAFT -> ..._DRAFT_SIMPLE
+        // in ggml-org/llama.cpp#22964; the fork still uses the old name.
 #ifdef LOCALAI_LEGACY_LLAMA_CPP_SPEC
        if (params.speculative.type == COMMON_SPECULATIVE_TYPE_NONE) {
            params.speculative.type = COMMON_SPECULATIVE_TYPE_DRAFT;
@@ -458,7 +461,7 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
        const bool no_spec_type = params.speculative.types.empty() ||
            (params.speculative.types.size() == 1 && params.speculative.types[0] == COMMON_SPECULATIVE_TYPE_NONE);
        if (no_spec_type) {
-            params.speculative.types = { COMMON_SPECULATIVE_TYPE_DRAFT };
+            params.speculative.types = { COMMON_SPECULATIVE_TYPE_DRAFT_SIMPLE };
        }
 #endif
    }
@@ -685,6 +688,136 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
                    // If conversion fails, keep default value (8)
                }
            }
+
+        // --- physical batch size (upstream -ub / --ubatch-size) ---
+        // Note: line ~482 already aliases n_ubatch to n_batch as a default; this
+        // option lets users decouple the two (useful for embeddings/rerank).
+        } else if (!strcmp(optname, "n_ubatch") || !strcmp(optname, "ubatch")) {
+            if (optval != NULL) {
+                try { params.n_ubatch = std::stoi(optval_str); } catch (...) {}
+            }
+
+        // --- main-model batch threads (upstream -tb / --threads-batch) ---
+        } else if (!strcmp(optname, "threads_batch") || !strcmp(optname, "n_threads_batch")) {
+            if (optval != NULL) {
+                try {
+                    int n = std::stoi(optval_str);
+                    if (n <= 0) n = (int)std::thread::hardware_concurrency();
+                    params.cpuparams_batch.n_threads = n;
+                } catch (...) {}
+            }
+
+        // --- pooling type for embeddings (upstream --pooling) ---
+        } else if (!strcmp(optname, "pooling_type") || !strcmp(optname, "pooling")) {
+            if (optval != NULL) {
+                if      (optval_str == "none") params.pooling_type = LLAMA_POOLING_TYPE_NONE;
+                else if (optval_str == "mean") params.pooling_type = LLAMA_POOLING_TYPE_MEAN;
+                else if (optval_str == "cls")  params.pooling_type = LLAMA_POOLING_TYPE_CLS;
+                else if (optval_str == "last") params.pooling_type = LLAMA_POOLING_TYPE_LAST;
+                else if (optval_str == "rank") params.pooling_type = LLAMA_POOLING_TYPE_RANK;
+                // unknown values silently leave UNSPECIFIED (auto-detect)
+            }
+
+        // --- llama log verbosity threshold (upstream -lv / --verbosity) ---
+        } else if (!strcmp(optname, "verbosity")) {
+            if (optval != NULL) {
+                try { params.verbosity = std::stoi(optval_str); } catch (...) {}
+            }
+
+        // --- O_DIRECT model loading (upstream --direct-io) ---
+        } else if (!strcmp(optname, "direct_io") || !strcmp(optname, "use_direct_io")) {
+            if (optval_str == "true" || optval_str == "1" || optval_str == "yes" || optval_str == "on" || optval_str == "enabled") {
+                params.use_direct_io = true;
+            } else if (optval_str == "false" || optval_str == "0" || optval_str == "no" || optval_str == "off" || optval_str == "disabled") {
+                params.use_direct_io = false;
+            }
+
+        // --- embedding normalization (upstream --embd-normalize) ---
+        // -1 none, 0 max-abs, 1 taxicab, 2 L2 (default), >2 p-norm
+        } else if (!strcmp(optname, "embd_normalize") || !strcmp(optname, "embedding_normalize")) {
+            if (optval != NULL) {
+                try { params.embd_normalize = std::stoi(optval_str); } catch (...) {}
+            }
+
+        // --- reasoning parser (upstream --reasoning-format) ---
+        // Picks the parser for <think> blocks emitted by reasoning models.
+        // none / auto / deepseek / deepseek-legacy
+        } else if (!strcmp(optname, "reasoning_format")) {
+            if (optval != NULL) {
+                if      (optval_str == "none")             params.reasoning_format = COMMON_REASONING_FORMAT_NONE;
+                else if (optval_str == "auto")             params.reasoning_format = COMMON_REASONING_FORMAT_AUTO;
+                else if (optval_str == "deepseek")         params.reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK;
+                else if (optval_str == "deepseek-legacy" || optval_str == "deepseek_legacy")
+                                                            params.reasoning_format = COMMON_REASONING_FORMAT_DEEPSEEK_LEGACY;
+                // unknown values silently keep the upstream default (DEEPSEEK)
+            }
+
+        // --- reasoning budget (upstream --reasoning-budget) ---
+        // -1 unlimited, 0 disabled, >0 token budget for thinking blocks.
+        // Distinct from per-request `enable_thinking` (chat_template_kwargs).
+        } else if (!strcmp(optname, "enable_reasoning") || !strcmp(optname, "reasoning_budget")) {
+            if (optval != NULL) {
+                try { params.enable_reasoning = std::stoi(optval_str); } catch (...) {}
+            }
+
+        // --- prefill assistant turn (upstream --no-prefill-assistant) ---
+        } else if (!strcmp(optname, "prefill_assistant")) {
+            if (optval_str == "true" || optval_str == "1" || optval_str == "yes" || optval_str == "on" || optval_str == "enabled") {
+                params.prefill_assistant = true;
+            } else if (optval_str == "false" || optval_str == "0" || optval_str == "no" || optval_str == "off" || optval_str == "disabled") {
+                params.prefill_assistant = false;
+            }
+
+        // --- mmproj GPU offload (upstream --no-mmproj-offload, inverted) ---
+        } else if (!strcmp(optname, "mmproj_use_gpu") || !strcmp(optname, "mmproj_offload")) {
+            if (optval_str == "true" || optval_str == "1" || optval_str == "yes" || optval_str == "on" || optval_str == "enabled") {
+                params.mmproj_use_gpu = true;
+            } else if (optval_str == "false" || optval_str == "0" || optval_str == "no" || optval_str == "off" || optval_str == "disabled") {
+                params.mmproj_use_gpu = false;
+            }
+
+        // --- per-image vision token budget (upstream --image-min/max-tokens) ---
+        } else if (!strcmp(optname, "image_min_tokens")) {
+            if (optval != NULL) {
+                try { params.image_min_tokens = std::stoi(optval_str); } catch (...) {}
+            }
+        } else if (!strcmp(optname, "image_max_tokens")) {
+            if (optval != NULL) {
+                try { params.image_max_tokens = std::stoi(optval_str); } catch (...) {}
+            }
+
+        // --- main-model tensor buffer overrides (upstream --override-tensor) ---
+        // Format: <tensor regex>=<buffer type>,<tensor regex>=<buffer type>,...
+        // Mirrors the existing `draft_override_tensor` parser below.
+        } else if (!strcmp(optname, "override_tensor") || !strcmp(optname, "tensor_buft_overrides")) {
+            ggml_backend_load_all();
+            std::map<std::string, ggml_backend_buffer_type_t> buft_list;
+            for (size_t i = 0; i < ggml_backend_dev_count(); ++i) {
+                auto * dev = ggml_backend_dev_get(i);
+                auto * buft = ggml_backend_dev_buffer_type(dev);
+                if (buft) {
+                    buft_list[ggml_backend_buft_name(buft)] = buft;
+                }
+            }
+            static std::list<std::string> override_names;
+            std::string cur;
+            auto flush = [&](const std::string & spec) {
+                auto pos = spec.find('=');
+                if (pos == std::string::npos) return;
+                const std::string name = spec.substr(0, pos);
+                const std::string type = spec.substr(pos + 1);
+                auto it = buft_list.find(type);
+                if (it == buft_list.end()) return; // unknown buffer type: ignore
+                override_names.push_back(name);
+                params.tensor_buft_overrides.push_back(
+                    {override_names.back().c_str(), it->second});
+            };
+            for (char c : optval_str) {
+                if (c == ',') { if (!cur.empty()) { flush(cur); cur.clear(); } }
+                else { cur.push_back(c); }
+            }
+            if (!cur.empty()) flush(cur);
+
        // Speculative decoding options
        } else if (!strcmp(optname, "spec_type") || !strcmp(optname, "speculative_type")) {
 #ifdef LOCALAI_LEGACY_LLAMA_CPP_SPEC
@@ -701,16 +834,27 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
            // Upstream switched to a vector of types (comma-separated for multi-type
            // chaining via common_speculative_types_from_names). We keep accepting a
            // single value here, but also tolerate comma-separated lists.
+            //
+            // ggml-org/llama.cpp#22964 also renamed the registered names from
+            // underscore- to dash-separated form, and replaced the bare
+            // `draft`/`eagle3` aliases with `draft-simple`/`draft-eagle3`. We
+            // normalize each token here so existing model configs keep working.
+            auto normalize_spec_name = [](std::string s) -> std::string {
+                std::replace(s.begin(), s.end(), '_', '-');
+                if (s == "draft")  return "draft-simple";
+                if (s == "eagle3") return "draft-eagle3";
+                return s;
+            };
            std::vector<std::string> names;
            std::string item;
            for (char c : optval_str) {
                if (c == ',') {
-                    if (!item.empty()) { names.push_back(item); item.clear(); }
+                    if (!item.empty()) { names.push_back(normalize_spec_name(item)); item.clear(); }
                } else {
                    item.push_back(c);
                }
            }
-            if (!item.empty()) names.push_back(item);
+            if (!item.empty()) names.push_back(normalize_spec_name(item));
            auto parsed = common_speculative_types_from_names(names);
            if (!parsed.empty()) {
                params.speculative.types = parsed;
@@ -2794,7 +2938,9 @@ public:
            }
        }

-        int embd_normalize = 2; // default to Euclidean/L2 norm
+        // Honor the load-time embd_normalize set via options:embd_normalize.
+        // -1 none, 0 max-abs, 1 taxicab, 2 L2 (default), >2 p-norm.
+        int embd_normalize = params_base.embd_normalize;
        // create and queue the task
        auto rd = ctx_server.get_response_reader();
        {
--- a/backend/go/stablediffusion-ggml/Makefile
+++ b/backend/go/stablediffusion-ggml/Makefile
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)

 # stablediffusion.cpp (ggml)
 STABLEDIFFUSION_GGML_REPO?=https://github.com/leejet/stable-diffusion.cpp
-STABLEDIFFUSION_GGML_VERSION?=90e87bc846f17059771efb8aaa31e9ef0cab6f78
+STABLEDIFFUSION_GGML_VERSION?=bd17f53b7386fb5f60e8587b75e73c4b2fed3426

 CMAKE_ARGS+=-DGGML_MAX_NAME=128

--- a/backend/go/whisper/Makefile
+++ b/backend/go/whisper/Makefile
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)

 # whisper.cpp version
 WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
-WHISPER_CPP_VERSION?=3e9b7d0fef3528ee2208da3cdb873a2c53d2ae2f
+WHISPER_CPP_VERSION?=968eebe77225d25e57a3f981da7c696310f0e881
 SO_TARGET?=libgowhisper.so

 CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
--- a/backend/python/transformers/requirements.txt
+++ b/backend/python/transformers/requirements.txt
@@ -3,4 +3,4 @@ protobuf==6.33.5
 certifi
 setuptools
 scipy==1.15.1
-numpy>=2.0.0
+numpy>=2.4.6
--- a/backend/python/vllm/requirements-cublas13-after.txt
+++ b/backend/python/vllm/requirements-cublas13-after.txt
@@ -3,5 +3,5 @@
 # on a cu130 host. Pull the cu130-flavoured wheel from vLLM's per-tag index
 # instead — the cublas13 case in install.sh adds --index-strategy=unsafe-best-match
 # so uv consults this index alongside PyPI.
--extra-index-url https://wheels.vllm.ai/0.20.2/cu130
-vllm==0.20.2
+--extra-index-url https://wheels.vllm.ai/0.21.0/cu130
+vllm==0.21.0
--- a/core/application/startup.go
+++ b/core/application/startup.go
@@ -212,12 +212,12 @@ func New(opts ...config.AppOption) (*Application, error) {
 		}
 	}

-	if err := coreStartup.InstallModels(options.Context, application.GalleryService(), options.Galleries, options.BackendGalleries, options.SystemState, application.ModelLoader(), options.EnforcePredownloadScans, options.AutoloadBackendGalleries, nil, options.ModelsURL...); err != nil {
+	if err := coreStartup.InstallModels(options.Context, application.GalleryService(), options.Galleries, options.BackendGalleries, options.SystemState, application.ModelLoader(), options.EnforcePredownloadScans, options.AutoloadBackendGalleries, options.RequireBackendIntegrity, nil, options.ModelsURL...); err != nil {
 		xlog.Error("error installing models", "error", err)
 	}

 	for _, backend := range options.ExternalBackends {
-		if err := galleryop.InstallExternalBackend(options.Context, options.BackendGalleries, options.SystemState, application.ModelLoader(), nil, backend, "", ""); err != nil {
+		if err := galleryop.InstallExternalBackend(options.Context, options.BackendGalleries, options.SystemState, application.ModelLoader(), nil, backend, "", "", options.RequireBackendIntegrity); err != nil {
 			xlog.Error("error installing external backend", "error", err)
 		}
 	}
@@ -267,13 +267,13 @@ func New(opts ...config.AppOption) (*Application, error) {
 	}

 	if options.PreloadJSONModels != "" {
-		if err := galleryop.ApplyGalleryFromString(options.SystemState, application.ModelLoader(), options.EnforcePredownloadScans, options.AutoloadBackendGalleries, options.Galleries, options.BackendGalleries, options.PreloadJSONModels); err != nil {
+		if err := galleryop.ApplyGalleryFromString(options.SystemState, application.ModelLoader(), options.EnforcePredownloadScans, options.AutoloadBackendGalleries, options.Galleries, options.BackendGalleries, options.PreloadJSONModels, options.RequireBackendIntegrity); err != nil {
 			return nil, err
 		}
 	}

 	if options.PreloadModelsFromPath != "" {
-		if err := galleryop.ApplyGalleryFromFile(options.SystemState, application.ModelLoader(), options.EnforcePredownloadScans, options.AutoloadBackendGalleries, options.Galleries, options.BackendGalleries, options.PreloadModelsFromPath); err != nil {
+		if err := galleryop.ApplyGalleryFromFile(options.SystemState, application.ModelLoader(), options.EnforcePredownloadScans, options.AutoloadBackendGalleries, options.Galleries, options.BackendGalleries, options.PreloadModelsFromPath, options.RequireBackendIntegrity); err != nil {
 			return nil, err
 		}
 	}
--- a/core/application/upgrade_checker.go
+++ b/core/application/upgrade_checker.go
@@ -217,7 +217,7 @@ func (uc *UpgradeChecker) runCheck(ctx context.Context) {
 				err = bm.UpgradeBackend(ctx, name, nil)
 			} else {
 				err = gallery.UpgradeBackend(ctx, uc.systemState, uc.modelLoader,
-					uc.galleries, name, nil)
+					uc.galleries, name, nil, uc.appConfig.RequireBackendIntegrity)
 			}
 			if err != nil {
 				xlog.Error("Failed to auto-upgrade backend",
--- a/core/backend/llm.go
+++ b/core/backend/llm.go
@@ -86,7 +86,7 @@ func ModelInference(ctx context.Context, s string, messages schema.Messages, ima
 		if !slices.Contains(modelNames, modelName) {
 			utils.ResetDownloadTimers()
 			// if we failed to load the model, we try to download it
-			err := gallery.InstallModelFromGallery(ctx, o.Galleries, o.BackendGalleries, o.SystemState, loader, modelName, gallery.GalleryModel{}, utils.DisplayDownloadFunction, o.EnforcePredownloadScans, o.AutoloadBackendGalleries)
+			err := gallery.InstallModelFromGallery(ctx, o.Galleries, o.BackendGalleries, o.SystemState, loader, modelName, gallery.GalleryModel{}, utils.DisplayDownloadFunction, o.EnforcePredownloadScans, o.AutoloadBackendGalleries, o.RequireBackendIntegrity)
 			if err != nil {
 				xlog.Error("failed to install model from gallery", "error", err, "model", modelFile)
 				//return nil, err
--- a/core/cli/backends.go
+++ b/core/cli/backends.go
@@ -17,9 +17,10 @@ import (
 )

 type BackendsCMDFlags struct {
-	BackendGalleries   string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"`
-	BackendsPath       string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends used for inferencing" group:"storage"`
-	BackendsSystemPath string `env:"LOCALAI_BACKENDS_SYSTEM_PATH,BACKEND_SYSTEM_PATH" type:"path" default:"/var/lib/local-ai/backends" help:"Path containing system backends used for inferencing" group:"backends"`
+	BackendGalleries        string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"`
+	BackendsPath            string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends used for inferencing" group:"storage"`
+	BackendsSystemPath      string `env:"LOCALAI_BACKENDS_SYSTEM_PATH,BACKEND_SYSTEM_PATH" type:"path" default:"/var/lib/local-ai/backends" help:"Path containing system backends used for inferencing" group:"backends"`
+	RequireBackendIntegrity bool   `env:"LOCALAI_REQUIRE_BACKEND_INTEGRITY,REQUIRE_BACKEND_INTEGRITY" help:"If true, reject backend installs without a configured signature verification policy (OCI URIs) or SHA256 (tarball/HTTP URIs)." group:"hardening" default:"false"`
 }

 type BackendsList struct {
@@ -126,7 +127,7 @@ func (bi *BackendsInstall) Run(ctx *cliContext.Context) error {
 	}

 	modelLoader := model.NewModelLoader(systemState)
-	err = galleryop.InstallExternalBackend(context.Background(), galleries, systemState, modelLoader, progressCallback, bi.BackendArgs, bi.Name, bi.Alias)
+	err = galleryop.InstallExternalBackend(context.Background(), galleries, systemState, modelLoader, progressCallback, bi.BackendArgs, bi.Name, bi.Alias, bi.RequireBackendIntegrity)
 	if err != nil {
 		return err
 	}
@@ -197,7 +198,7 @@ func (bu *BackendsUpgrade) Run(ctx *cliContext.Context) error {
 			}
 		}

-		if err := gallery.UpgradeBackend(context.Background(), systemState, modelLoader, galleries, name, progressCallback); err != nil {
+		if err := gallery.UpgradeBackend(context.Background(), systemState, modelLoader, galleries, name, progressCallback, bu.RequireBackendIntegrity); err != nil {
 			fmt.Printf("Failed to upgrade %s: %v\n", name, err)
 		} else {
 			fmt.Printf("Backend %s upgraded successfully\n", name)
--- a/core/cli/models.go
+++ b/core/cli/models.go
@@ -32,6 +32,7 @@ type ModelsList struct {

 type ModelsInstall struct {
 	DisablePredownloadScan   bool     `env:"LOCALAI_DISABLE_PREDOWNLOAD_SCAN" help:"If true, disables the best-effort security scanner before downloading any files." group:"hardening" default:"false"`
+	RequireBackendIntegrity  bool     `env:"LOCALAI_REQUIRE_BACKEND_INTEGRITY,REQUIRE_BACKEND_INTEGRITY" help:"If true, reject backend installs without a configured signature verification policy (OCI URIs) or SHA256 (tarball/HTTP URIs)." group:"hardening" default:"false"`
 	AutoloadBackendGalleries bool     `env:"LOCALAI_AUTOLOAD_BACKEND_GALLERIES" help:"If true, automatically loads backend galleries" group:"backends" default:"true"`
 	ModelArgs                []string `arg:"" optional:"" name:"models" help:"Model configuration URLs to load"`

@@ -71,7 +72,6 @@ func (ml *ModelsList) Run(ctx *cliContext.Context) error {
 }

 func (mi *ModelsInstall) Run(ctx *cliContext.Context) error {
-
 	systemState, err := system.GetSystemState(
 		system.WithModelPath(mi.ModelsPath),
 		system.WithBackendPath(mi.BackendsPath),
@@ -135,7 +135,7 @@ func (mi *ModelsInstall) Run(ctx *cliContext.Context) error {
 		}

 		modelLoader := model.NewModelLoader(systemState)
-		err = startup.InstallModels(context.Background(), galleryService, galleries, backendGalleries, systemState, modelLoader, !mi.DisablePredownloadScan, mi.AutoloadBackendGalleries, progressCallback, modelName)
+		err = startup.InstallModels(context.Background(), galleryService, galleries, backendGalleries, systemState, modelLoader, !mi.DisablePredownloadScan, mi.AutoloadBackendGalleries, mi.RequireBackendIntegrity, progressCallback, modelName)
 		if err != nil {
 			return err
 		}
--- a/core/cli/run.go
+++ b/core/cli/run.go
@@ -67,6 +67,7 @@ type RunCMD struct {
 	OllamaAPIRootEndpoint              bool     `env:"LOCALAI_OLLAMA_API_ROOT_ENDPOINT" default:"false" help:"Register Ollama-compatible health check on / (replaces web UI on root path). The /api/* Ollama endpoints are always available regardless of this flag" group:"api"`
 	DisableRuntimeSettings             bool     `env:"LOCALAI_DISABLE_RUNTIME_SETTINGS,DISABLE_RUNTIME_SETTINGS" default:"false" help:"Disables the runtime settings. When set to true, the server will not load the runtime settings from the runtime_settings.json file" group:"api"`
 	DisablePredownloadScan             bool     `env:"LOCALAI_DISABLE_PREDOWNLOAD_SCAN" help:"If true, disables the best-effort security scanner before downloading any files." group:"hardening" default:"false"`
+	RequireBackendIntegrity            bool     `env:"LOCALAI_REQUIRE_BACKEND_INTEGRITY,REQUIRE_BACKEND_INTEGRITY" help:"If true, backend installs without a configured signature verification policy (for OCI URIs) or SHA256 (for tarball/HTTP URIs) are rejected. Default is to warn and install. Set this in production once your gallery's verification: block is populated." group:"hardening" default:"false"`
 	OpaqueErrors                       bool     `env:"LOCALAI_OPAQUE_ERRORS" default:"false" help:"If true, all error responses are replaced with blank 500 errors. This is intended only for hardening against information leaks and is normally not recommended." group:"hardening"`
 	UseSubtleKeyComparison             bool     `env:"LOCALAI_SUBTLE_KEY_COMPARISON" default:"false" help:"If true, API Key validation comparisons will be performed using constant-time comparisons rather than simple equality. This trades off performance on each request for resiliancy against timing attacks." group:"hardening"`
 	DisableApiKeyRequirementForHttpGet bool     `env:"LOCALAI_DISABLE_API_KEY_REQUIREMENT_FOR_HTTP_GET" default:"false" help:"If true, a valid API key is not required to issue GET requests to portions of the web ui. This should only be enabled in secure testing environments" group:"hardening"`
@@ -503,6 +504,10 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
 		opts = append(opts, config.WithAutoUpgradeBackends(r.AutoUpgradeBackends))
 	}

+	if r.RequireBackendIntegrity {
+		opts = append(opts, config.WithRequireBackendIntegrity(r.RequireBackendIntegrity))
+	}
+
 	if r.PreferDevelopmentBackends {
 		opts = append(opts, config.WithPreferDevelopmentBackends(r.PreferDevelopmentBackends))
 	}
--- a/core/cli/worker/worker.go
+++ b/core/cli/worker/worker.go
@@ -1,10 +1,11 @@
 package worker

 type WorkerFlags struct {
-	BackendsPath       string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends used for inferencing" group:"backends"`
-	BackendGalleries   string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"`
-	BackendsSystemPath string `env:"LOCALAI_BACKENDS_SYSTEM_PATH,BACKEND_SYSTEM_PATH" type:"path" default:"/var/lib/local-ai/backends" help:"Path containing system backends used for inferencing" group:"backends"`
-	ExtraLLamaCPPArgs  string `name:"llama-cpp-args" env:"LOCALAI_EXTRA_LLAMA_CPP_ARGS,EXTRA_LLAMA_CPP_ARGS" help:"Extra arguments to pass to llama-cpp-rpc-server"`
+	BackendsPath            string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends used for inferencing" group:"backends"`
+	BackendGalleries        string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"`
+	BackendsSystemPath      string `env:"LOCALAI_BACKENDS_SYSTEM_PATH,BACKEND_SYSTEM_PATH" type:"path" default:"/var/lib/local-ai/backends" help:"Path containing system backends used for inferencing" group:"backends"`
+	RequireBackendIntegrity bool   `env:"LOCALAI_REQUIRE_BACKEND_INTEGRITY,REQUIRE_BACKEND_INTEGRITY" help:"If true, reject backend installs without a configured signature verification policy (OCI URIs) or SHA256 (tarball/HTTP URIs)." group:"hardening" default:"false"`
+	ExtraLLamaCPPArgs       string `name:"llama-cpp-args" env:"LOCALAI_EXTRA_LLAMA_CPP_ARGS,EXTRA_LLAMA_CPP_ARGS" help:"Extra arguments to pass to llama-cpp-rpc-server"`
 }

 type Worker struct {
--- a/core/cli/worker/worker_backend_common.go
+++ b/core/cli/worker/worker_backend_common.go
@@ -18,7 +18,7 @@ import (
 // installing the backend from the gallery if it isn't present.
 // `name` is the gallery entry name (for vLLM the meta entry "vllm"
 // resolves to a platform-specific package via capability lookup).
-func findBackendPath(name, galleries string, systemState *system.SystemState) (string, error) {
+func findBackendPath(name, galleries string, systemState *system.SystemState, requireIntegrity bool) (string, error) {
 	backends, err := gallery.ListSystemBackends(systemState)
 	if err != nil {
 		return "", err
@@ -33,7 +33,7 @@ func findBackendPath(name, galleries string, systemState *system.SystemState) (s
 		xlog.Error("failed loading galleries", "error", err)
 		return "", err
 	}
-	if err := gallery.InstallBackendFromGallery(context.Background(), gals, systemState, ml, name, nil, true); err != nil {
+	if err := gallery.InstallBackendFromGallery(context.Background(), gals, systemState, ml, name, nil, true, requireIntegrity); err != nil {
 		xlog.Error("backend not found, failed to install it", "name", name, "error", err)
 		return "", err
 	}
--- a/core/cli/worker/worker_llamacpp.go
+++ b/core/cli/worker/worker_llamacpp.go
@@ -27,7 +27,7 @@ const (
 	llamaCPPGalleryName   = "llama-cpp"
 )

-func findLLamaCPPBackend(galleries string, systemState *system.SystemState) (string, error) {
+func findLLamaCPPBackend(galleries string, systemState *system.SystemState, requireIntegrity bool) (string, error) {
 	backends, err := gallery.ListSystemBackends(systemState)
 	if err != nil {
 		xlog.Warn("Failed listing system backends", "error", err)
@@ -43,7 +43,7 @@ func findLLamaCPPBackend(galleries string, systemState *system.SystemState) (str
 			xlog.Error("failed loading galleries", "error", err)
 			return "", err
 		}
-		err := gallery.InstallBackendFromGallery(context.Background(), gals, systemState, ml, llamaCPPGalleryName, nil, true)
+		err := gallery.InstallBackendFromGallery(context.Background(), gals, systemState, ml, llamaCPPGalleryName, nil, true, requireIntegrity)
 		if err != nil {
 			xlog.Error("llama-cpp backend not found, failed to install it", "error", err)
 			return "", err
@@ -76,7 +76,7 @@ func (r *LLamaCPP) Run(ctx *cliContext.Context) error {
 	if err != nil {
 		return err
 	}
-	grpcProcess, err := findLLamaCPPBackend(r.BackendGalleries, systemState)
+	grpcProcess, err := findLLamaCPPBackend(r.BackendGalleries, systemState, r.RequireBackendIntegrity)
 	if err != nil {
 		return err
 	}
--- a/core/cli/worker/worker_mlx_common.go
+++ b/core/cli/worker/worker_mlx_common.go
@@ -9,8 +9,8 @@ import (

 const mlxDistributedGalleryName = "mlx-distributed"

-func findMLXDistributedBackendPath(galleries string, systemState *system.SystemState) (string, error) {
-	return findBackendPath(mlxDistributedGalleryName, galleries, systemState)
+func findMLXDistributedBackendPath(galleries string, systemState *system.SystemState, requireIntegrity bool) (string, error) {
+	return findBackendPath(mlxDistributedGalleryName, galleries, systemState, requireIntegrity)
 }

 // buildMLXCommand builds the exec.Cmd to launch the mlx-distributed backend.
--- a/core/cli/worker/worker_mlx_distributed.go
+++ b/core/cli/worker/worker_mlx_distributed.go
@@ -28,7 +28,7 @@ func (r *MLXDistributed) Run(ctx *cliContext.Context) error {
 		return err
 	}

-	backendPath, err := findMLXDistributedBackendPath(r.BackendGalleries, systemState)
+	backendPath, err := findMLXDistributedBackendPath(r.BackendGalleries, systemState, r.RequireBackendIntegrity)
 	if err != nil {
 		return fmt.Errorf("cannot find mlx-distributed backend: %w", err)
 	}
--- a/core/cli/worker/worker_p2p.go
+++ b/core/cli/worker/worker_p2p.go
@@ -73,7 +73,7 @@ func (r *P2P) Run(ctx *cliContext.Context) error {
 			for {
 				xlog.Info("Starting llama-cpp-rpc-server", "address", address, "port", port)

-				grpcProcess, err := findLLamaCPPBackend(r.BackendGalleries, systemState)
+				grpcProcess, err := findLLamaCPPBackend(r.BackendGalleries, systemState, r.RequireBackendIntegrity)
 				if err != nil {
 					xlog.Error("Failed to find llama-cpp-rpc-server", "error", err)
 					return
--- a/core/cli/worker/worker_p2p_mlx.go
+++ b/core/cli/worker/worker_p2p_mlx.go
@@ -48,7 +48,7 @@ func (r *P2PMLX) Run(ctx *cliContext.Context) error {
 	c, cancel := context.WithCancel(context.Background())
 	defer cancel()

-	backendPath, err := findMLXDistributedBackendPath(r.BackendGalleries, systemState)
+	backendPath, err := findMLXDistributedBackendPath(r.BackendGalleries, systemState, r.RequireBackendIntegrity)
 	if err != nil {
 		xlog.Warn("Could not find mlx-distributed backend from gallery, will try backend.py directly", "error", err)
 	}
--- a/core/cli/worker/worker_vllm.go
+++ b/core/cli/worker/worker_vllm.go
@@ -77,7 +77,7 @@ func (r *VLLMDistributed) Run(ctx *cliContext.Context) error {
 		return fmt.Errorf("getting system state: %w", err)
 	}

-	backendPath, err := findBackendPath("vllm", r.BackendGalleries, systemState)
+	backendPath, err := findBackendPath("vllm", r.BackendGalleries, systemState, r.RequireBackendIntegrity)
 	if err != nil {
 		return fmt.Errorf("cannot find vllm backend: %w", err)
 	}
--- a/core/config/application_config.go
+++ b/core/config/application_config.go
@@ -60,6 +60,13 @@ type ApplicationConfig struct {
 	AutoUpgradeBackends                         bool
 	PreferDevelopmentBackends                   bool

+	// RequireBackendIntegrity promotes a missing SHA256 (tarball/HTTP URIs)
+	// or missing verification policy (OCI URIs) from a warning to a hard
+	// failure during backend install/upgrade. Off by default to keep
+	// upgrades non-breaking; operators opt in explicitly via
+	// --require-backend-integrity / LOCALAI_REQUIRE_BACKEND_INTEGRITY.
+	RequireBackendIntegrity bool
+
 	SingleBackend           bool // Deprecated: use MaxActiveBackends = 1 instead
 	MaxActiveBackends       int  // Maximum number of active backends (0 = unlimited, 1 = single backend mode)
 	WatchDogIdle bool
@@ -436,6 +443,10 @@ func WithAutoUpgradeBackends(v bool) AppOption {
 	return func(o *ApplicationConfig) { o.AutoUpgradeBackends = v }
 }

+func WithRequireBackendIntegrity(v bool) AppOption {
+	return func(o *ApplicationConfig) { o.RequireBackendIntegrity = v }
+}
+
 func WithPreferDevelopmentBackends(v bool) AppOption {
 	return func(o *ApplicationConfig) { o.PreferDevelopmentBackends = v }
 }
--- a/core/config/gallery.go
+++ b/core/config/gallery.go
@@ -1,6 +1,37 @@
 package config

-type Gallery struct {
-	URL  string `json:"url" yaml:"url"`
-	Name string `json:"name" yaml:"name"`
+// GalleryVerification declares the keyless-cosign signature policy that
+// every OCI backend image fetched from this gallery must satisfy.
+//
+// Verification is opt-in: galleries without a Verification block install
+// backends with no signature check (a warning is logged when
+// LOCALAI_REQUIRE_BACKEND_INTEGRITY is unset; that flag turns the warning
+// into a hard error).
+//
+// Identity matching: set Issuer (exact) or IssuerRegex, AND Identity
+// (exact) or IdentityRegex. For GitHub Actions keyless signing the
+// typical shape is:
+//
+//	verification:
+//	  issuer: "https://token.actions.githubusercontent.com"
+//	  identity_regex: "^https://github\\.com/mudler/local-ai-backends/\\.github/workflows/build\\.yaml@refs/heads/master$"
+//	  not_before: "2026-05-01T00:00:00Z"
+//
+// NotBefore is the revocation lever: advance it to invalidate every
+// signature produced before a known compromise window. Keyless cosign
+// certs are ephemeral so there is no CA-side revocation.
+type GalleryVerification struct {
+	Issuer        string `json:"issuer,omitempty" yaml:"issuer,omitempty"`
+	IssuerRegex   string `json:"issuer_regex,omitempty" yaml:"issuer_regex,omitempty"`
+	Identity      string `json:"identity,omitempty" yaml:"identity,omitempty"`
+	IdentityRegex string `json:"identity_regex,omitempty" yaml:"identity_regex,omitempty"`
+
+	// NotBefore is an RFC3339 timestamp. Empty disables the time check.
+	NotBefore string `json:"not_before,omitempty" yaml:"not_before,omitempty"`
+}
+
+type Gallery struct {
+	URL          string               `json:"url" yaml:"url"`
+	Name         string               `json:"name" yaml:"name"`
+	Verification *GalleryVerification `json:"verification,omitempty" yaml:"verification,omitempty"`
 }
--- a/core/config/gguf.go
+++ b/core/config/gguf.go
@@ -54,6 +54,13 @@ func guessGGUFFromFile(cfg *ModelConfig, f *gguf.GGUFFile, defaultCtx int) {
 		cfg.modelTemplate = chatTemplate.ValueString()
 	}

+	// Auto-enable Multi-Token Prediction (ggml-org/llama.cpp#22673) when the
+	// GGUF carries an embedded MTP head. Skipped silently for non-MTP models
+	// and when the user already configured a spec_type.
+	if n, ok := HasEmbeddedMTPHead(f); ok {
+		ApplyMTPDefaults(cfg, n)
+	}
+
 	// Thinking support detection is done after model load via DetectThinkingSupportFromBackend

 	// template estimations
--- a/core/config/mtp.go
+++ b/core/config/mtp.go
@@ -0,0 +1,84 @@
+package config
+
+import (
+	"strings"
+
+	gguf "github.com/gpustack/gguf-parser-go"
+	"github.com/mudler/xlog"
+)
+
+// mtpSpecOptions lists the speculative-decoding option keys auto-applied when
+// an MTP head is detected on a llama-cpp GGUF. Defaults track the upstream
+// MTP PR (ggml-org/llama.cpp#22673):
+//
+//   - spec_type:draft-mtp      activates Multi-Token Prediction
+//   - spec_n_max:6             draft window
+//   - spec_p_min:0.75          pinned because upstream marked the 0.75 default
+//     with a "change to 0.0f" TODO; locking it here keeps acceptance
+//     thresholds stable across future bumps
+var mtpSpecOptions = []string{
+	"spec_type:draft-mtp",
+	"spec_n_max:6",
+	"spec_p_min:0.75",
+}
+
+// MTPSpecOptions returns a copy of the option keys auto-applied when an MTP
+// head is detected. Exported for testing and for the importer.
+func MTPSpecOptions() []string {
+	out := make([]string, len(mtpSpecOptions))
+	copy(out, mtpSpecOptions)
+	return out
+}
+
+// HasEmbeddedMTPHead reports whether the parsed GGUF declares a Multi-Token
+// Prediction head. Detection reads `<arch>.nextn_predict_layers`, which is
+// what `gguf_writer.add_nextn_predict_layers(n)` emits in upstream's
+// `conversion/qwen.py` MTP mixin. A positive layer count means the head is
+// present in the same GGUF as the trunk.
+func HasEmbeddedMTPHead(f *gguf.GGUFFile) (uint32, bool) {
+	if f == nil {
+		return 0, false
+	}
+	arch := f.Architecture().Architecture
+	if arch == "" {
+		return 0, false
+	}
+	v, ok := f.Header.MetadataKV.Get(arch + ".nextn_predict_layers")
+	if !ok {
+		return 0, false
+	}
+	n := gguf.ValueNumeric[uint32](v)
+	return n, n > 0
+}
+
+// hasSpecTypeOption returns true when the slice already contains a
+// user-configured `spec_type:` / `speculative_type:` entry. Used to avoid
+// clobbering an explicit choice with the MTP auto-defaults.
+func hasSpecTypeOption(opts []string) bool {
+	for _, o := range opts {
+		if strings.HasPrefix(o, "spec_type:") || strings.HasPrefix(o, "speculative_type:") {
+			return true
+		}
+	}
+	return false
+}
+
+// ApplyMTPDefaults appends the auto-MTP options to cfg.Options. It is a
+// no-op when the user already picked a `spec_type` or `speculative_type`
+// (either via YAML or via the importer's preferences flow).
+//
+// `layers` is the value read from `<arch>.nextn_predict_layers` and is only
+// used for the diagnostic log line.
+func ApplyMTPDefaults(cfg *ModelConfig, layers uint32) {
+	if cfg == nil {
+		return
+	}
+	if hasSpecTypeOption(cfg.Options) {
+		xlog.Debug("[mtp] embedded MTP head detected but spec_type already configured; leaving user choice intact",
+			"name", cfg.Name, "nextn_layers", layers)
+		return
+	}
+	cfg.Options = append(cfg.Options, mtpSpecOptions...)
+	xlog.Info("[mtp] embedded MTP head detected; enabling draft-mtp speculative decoding",
+		"name", cfg.Name, "nextn_layers", layers, "spec_n_max", 6, "spec_p_min", 0.75)
+}
--- a/core/config/mtp_test.go
+++ b/core/config/mtp_test.go
@@ -0,0 +1,86 @@
+package config_test
+
+import (
+	. "github.com/mudler/LocalAI/core/config"
+
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+var _ = Describe("MTP auto-defaults", func() {
+	Context("MTPSpecOptions", func() {
+		It("returns the upstream-recommended speculative tuple", func() {
+			Expect(MTPSpecOptions()).To(Equal([]string{
+				"spec_type:draft-mtp",
+				"spec_n_max:6",
+				"spec_p_min:0.75",
+			}))
+		})
+
+		It("returns a defensive copy so callers cannot mutate the package default", func() {
+			opts := MTPSpecOptions()
+			opts[0] = "spec_type:none"
+			Expect(MTPSpecOptions()[0]).To(Equal("spec_type:draft-mtp"))
+		})
+	})
+
+	Context("ApplyMTPDefaults", func() {
+		It("appends MTP options when nothing is configured", func() {
+			cfg := &ModelConfig{Name: "qwen-mtp"}
+			ApplyMTPDefaults(cfg, 1)
+			Expect(cfg.Options).To(Equal([]string{
+				"spec_type:draft-mtp",
+				"spec_n_max:6",
+				"spec_p_min:0.75",
+			}))
+		})
+
+		It("preserves unrelated options already on the config", func() {
+			cfg := &ModelConfig{
+				Name:    "qwen-mtp",
+				Options: []string{"use_jinja:true", "cache_reuse:256"},
+			}
+			ApplyMTPDefaults(cfg, 1)
+			Expect(cfg.Options).To(Equal([]string{
+				"use_jinja:true",
+				"cache_reuse:256",
+				"spec_type:draft-mtp",
+				"spec_n_max:6",
+				"spec_p_min:0.75",
+			}))
+		})
+
+		It("is a no-op when the user already configured spec_type", func() {
+			cfg := &ModelConfig{
+				Name:    "qwen-mtp",
+				Options: []string{"spec_type:ngram-simple", "use_jinja:true"},
+			}
+			ApplyMTPDefaults(cfg, 1)
+			Expect(cfg.Options).To(Equal([]string{
+				"spec_type:ngram-simple",
+				"use_jinja:true",
+			}))
+		})
+
+		It("also respects the legacy speculative_type alias", func() {
+			cfg := &ModelConfig{
+				Name:    "qwen-mtp",
+				Options: []string{"speculative_type:ngram-mod"},
+			}
+			ApplyMTPDefaults(cfg, 1)
+			Expect(cfg.Options).To(Equal([]string{"speculative_type:ngram-mod"}))
+		})
+
+		It("tolerates a nil config", func() {
+			Expect(func() { ApplyMTPDefaults(nil, 1) }).ToNot(Panic())
+		})
+	})
+
+	Context("HasEmbeddedMTPHead", func() {
+		It("returns false on a nil GGUF file", func() {
+			n, ok := HasEmbeddedMTPHead(nil)
+			Expect(ok).To(BeFalse())
+			Expect(n).To(BeZero())
+		})
+	})
+})
--- a/core/gallery/backends.go
+++ b/core/gallery/backends.go
@@ -16,6 +16,7 @@ import (
 	"github.com/mudler/LocalAI/pkg/downloader"
 	"github.com/mudler/LocalAI/pkg/model"
 	"github.com/mudler/LocalAI/pkg/oci"
+	"github.com/mudler/LocalAI/pkg/oci/cosignverify"
 	"github.com/mudler/LocalAI/pkg/system"
 	"github.com/mudler/xlog"
 	cp "github.com/otiai10/copy"
@@ -102,8 +103,81 @@ func writeBackendMetadata(backendPath string, metadata *BackendMetadata) error {
 	return nil
 }

+// backendDownloadOptions translates the gallery's verification policy into
+// downloader options, and gates the call on strict-integrity mode. Both
+// InstallBackend and UpgradeBackend MUST route their download through these
+// options — without them, the corresponding code path silently downloads
+// and activates unverified backend bytes even when the gallery has a
+// verification: policy configured.
+//
+// For OCI URIs with a verification policy, returns a slice containing
+// downloader.WithImageVerifier(v) — the downloader will then run cosign
+// signature verification between fetching the manifest and extracting
+// layers (see pkg/downloader/uri.go OCI branch).
+//
+// For OCI URIs without a verification policy, or non-OCI URIs without a
+// SHA256, the function either returns no options and the install proceeds
+// with a warning (requireIntegrity false) or fails (requireIntegrity true).
+func backendDownloadOptions(config *GalleryBackend, requireIntegrity bool) ([]downloader.DownloadOption, error) {
+	uri := downloader.URI(config.URI)
+	hasVerification := config.Gallery.Verification != nil
+	hasSHA := config.SHA256 != ""
+
+	switch {
+	case uri.LooksLikeOCI():
+		if !hasVerification {
+			if requireIntegrity {
+				return nil, fmt.Errorf("strict integrity: gallery %q has no verification policy for OCI backend %q (set verification: in the gallery YAML or disable --require-backend-integrity)",
+					config.Gallery.Name, config.Name)
+			}
+			xlog.Warn("installing OCI backend without signature verification",
+				"backend", config.Name, "gallery", config.Gallery.Name, "uri", config.URI)
+			return nil, nil
+		}
+		v, err := newGalleryVerifier(config.Gallery.Verification)
+		if err != nil {
+			return nil, fmt.Errorf("gallery %q verification policy: %w", config.Gallery.Name, err)
+		}
+		return []downloader.DownloadOption{downloader.WithImageVerifier(v)}, nil
+
+	case uri.LooksLikeDir():
+		// Local directory — out of scope for integrity checks.
+		return nil, nil
+
+	default:
+		if !hasSHA && requireIntegrity {
+			return nil, fmt.Errorf("strict integrity: backend %q has no SHA256 (gallery %q)",
+				config.Name, config.Gallery.Name)
+		}
+		// Non-strict: pkg/downloader already emits a warning when sha is empty.
+		return nil, nil
+	}
+}
+
+// newGalleryVerifier constructs a cosignverify.Verifier from the gallery
+// policy. Parses NotBefore (RFC3339) here so YAML errors surface at install
+// time rather than during signature verification.
+func newGalleryVerifier(p *config.GalleryVerification) (*cosignverify.Verifier, error) {
+	pol := cosignverify.Policy{
+		Issuer:        p.Issuer,
+		IssuerRegex:   p.IssuerRegex,
+		Identity:      p.Identity,
+		IdentityRegex: p.IdentityRegex,
+	}
+	if p.NotBefore != "" {
+		t, err := time.Parse(time.RFC3339, p.NotBefore)
+		if err != nil {
+			return nil, fmt.Errorf("not_before %q: %w", p.NotBefore, err)
+		}
+		pol.NotBefore = t
+	}
+	return cosignverify.NewVerifier(pol, nil, nil)
+}
+
 // InstallBackendFromGallery installs a backend from the gallery.
-func InstallBackendFromGallery(ctx context.Context, galleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, name string, downloadStatus func(string, string, string, float64), force bool) error {
+// requireIntegrity escalates a missing SHA256 / verification policy from a
+// warning to a hard failure (see backendDownloadOptions).
+func InstallBackendFromGallery(ctx context.Context, galleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, name string, downloadStatus func(string, string, string, float64), force, requireIntegrity bool) error {
 	if !force {
 		// check if we already have the backend installed
 		backends, err := ListSystemBackends(systemState)
@@ -149,7 +223,7 @@ func InstallBackendFromGallery(ctx context.Context, galleries []config.Gallery,
 		xlog.Debug("Installing backend from meta backend", "name", name, "bestBackend", bestBackend.Name)

 		// Then, let's install the best backend
-		if err := InstallBackend(ctx, systemState, modelLoader, bestBackend, downloadStatus); err != nil {
+		if err := InstallBackend(ctx, systemState, modelLoader, bestBackend, downloadStatus, requireIntegrity); err != nil {
 			return err
 		}

@@ -175,10 +249,10 @@ func InstallBackendFromGallery(ctx context.Context, galleries []config.Gallery,
 		return nil
 	}

-	return InstallBackend(ctx, systemState, modelLoader, backend, downloadStatus)
+	return InstallBackend(ctx, systemState, modelLoader, backend, downloadStatus, requireIntegrity)
 }

-func InstallBackend(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, config *GalleryBackend, downloadStatus func(string, string, string, float64)) error {
+func InstallBackend(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, config *GalleryBackend, downloadStatus func(string, string, string, float64), requireIntegrity bool) error {
 	// Get configurable fallback tag values from SystemState
 	latestTag, masterTag, devSuffix := getFallbackTagValues(systemState)

@@ -213,6 +287,14 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
 		return fmt.Errorf("failed to create base path: %v", err)
 	}

+	// Build the download options once and reuse for every retry path —
+	// mirrors and tag fallbacks must verify against the same gallery
+	// policy or we open a hole where a non-default URI bypasses the check.
+	downloadOpts, optsErr := backendDownloadOptions(config, requireIntegrity)
+	if optsErr != nil {
+		return fmt.Errorf("backend %q: %w", config.Name, optsErr)
+	}
+
 	uri := downloader.URI(config.URI)
 	// Check if it is a directory
 	if uri.LooksLikeDir() {
@@ -222,7 +304,7 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
 		}
 	} else {
 		xlog.Debug("Downloading backend", "uri", config.URI, "backendPath", backendPath)
-		if err := uri.DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err != nil {
+		if err := uri.DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus, downloadOpts...); err != nil {
 			xlog.Debug("Backend download failed, trying fallback", "backendPath", backendPath, "error", err)

 			// resetBackendPath cleans up partial state from a failed OCI extraction
@@ -243,7 +325,7 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
 				default:
 				}
 				resetBackendPath()
-				if err := downloader.URI(mirror).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
+				if err := downloader.URI(mirror).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus, downloadOpts...); err == nil {
 					success = true
 					xlog.Debug("Downloaded backend from mirror", "uri", config.URI, "backendPath", backendPath)
 					break
@@ -256,7 +338,7 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
 				if fallbackURI != string(config.URI) {
 					resetBackendPath()
 					xlog.Info("Trying fallback URI", "original", config.URI, "fallback", fallbackURI)
-					if err := downloader.URI(fallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
+					if err := downloader.URI(fallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus, downloadOpts...); err == nil {
 						xlog.Info("Downloaded backend using fallback URI", "uri", fallbackURI, "backendPath", backendPath)
 						success = true
 					} else {
@@ -265,7 +347,7 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
 							resetBackendPath()
 							devFallbackURI := fallbackURI + "-" + devSuffix
 							xlog.Info("Trying development fallback URI", "fallback", devFallbackURI)
-							if err := downloader.URI(devFallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
+							if err := downloader.URI(devFallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus, downloadOpts...); err == nil {
 								xlog.Info("Downloaded backend using development fallback URI", "uri", devFallbackURI, "backendPath", backendPath)
 								success = true
 							} else {
--- a/core/gallery/backends_test.go
+++ b/core/gallery/backends_test.go
@@ -117,13 +117,13 @@ var _ = Describe("Gallery Backends", func() {

 	Describe("InstallBackendFromGallery", func() {
 		It("should return error when backend is not found", func() {
-			err := InstallBackendFromGallery(context.TODO(), galleries, systemState, ml, "non-existent", nil, true)
+			err := InstallBackendFromGallery(context.TODO(), galleries, systemState, ml, "non-existent", nil, true, false)
 			Expect(err).To(HaveOccurred())
 			Expect(err.Error()).To(ContainSubstring("no backend found with name \"non-existent\""))
 		})

 		It("should install backend from gallery", func() {
-			err := InstallBackendFromGallery(context.TODO(), galleries, systemState, ml, "test-backend", nil, true)
+			err := InstallBackendFromGallery(context.TODO(), galleries, systemState, ml, "test-backend", nil, true, false)
 			Expect(err).ToNot(HaveOccurred())
 			Expect(filepath.Join(tempDir, "test-backend", "run.sh")).To(BeARegularFile())
 		})
@@ -545,7 +545,7 @@ var _ = Describe("Gallery Backends", func() {
 				VRAM:      1000000000000,
 				Backend:   system.Backend{BackendsPath: tempDir},
 			}
-			err = InstallBackendFromGallery(context.TODO(), []config.Gallery{gallery}, nvidiaSystemState, ml, "meta-backend", nil, true)
+			err = InstallBackendFromGallery(context.TODO(), []config.Gallery{gallery}, nvidiaSystemState, ml, "meta-backend", nil, true, false)
 			Expect(err).NotTo(HaveOccurred())

 			metaBackendPath := filepath.Join(tempDir, "meta-backend")
@@ -625,7 +625,7 @@ var _ = Describe("Gallery Backends", func() {
 				VRAM:      1000000000000,
 				Backend:   system.Backend{BackendsPath: tempDir},
 			}
-			err = InstallBackendFromGallery(context.TODO(), []config.Gallery{gallery}, nvidiaSystemState, ml, "meta-backend", nil, true)
+			err = InstallBackendFromGallery(context.TODO(), []config.Gallery{gallery}, nvidiaSystemState, ml, "meta-backend", nil, true, false)
 			Expect(err).NotTo(HaveOccurred())

 			metaBackendPath := filepath.Join(tempDir, "meta-backend")
@@ -709,7 +709,7 @@ var _ = Describe("Gallery Backends", func() {
 				VRAM:      1000000000000,
 				Backend:   system.Backend{BackendsPath: tempDir},
 			}
-			err = InstallBackendFromGallery(context.TODO(), []config.Gallery{gallery}, nvidiaSystemState, ml, "meta-backend", nil, true)
+			err = InstallBackendFromGallery(context.TODO(), []config.Gallery{gallery}, nvidiaSystemState, ml, "meta-backend", nil, true, false)
 			Expect(err).NotTo(HaveOccurred())

 			metaBackendPath := filepath.Join(tempDir, "meta-backend")
@@ -808,7 +808,7 @@ var _ = Describe("Gallery Backends", func() {
 				system.WithBackendPath(newPath),
 			)
 			Expect(err).NotTo(HaveOccurred())
-			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil)
+			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil, false)
 			Expect(newPath).To(BeADirectory())
 			Expect(err).To(HaveOccurred()) // Will fail due to invalid URI, but path should be created
 		})
@@ -840,7 +840,7 @@ var _ = Describe("Gallery Backends", func() {
 				system.WithBackendPath(tempDir),
 			)
 			Expect(err).NotTo(HaveOccurred())
-			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil)
+			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil, false)
 			Expect(err).ToNot(HaveOccurred())
 			Expect(filepath.Join(tempDir, "test-backend", "metadata.json")).To(BeARegularFile())
 			dat, err := os.ReadFile(filepath.Join(tempDir, "test-backend", "metadata.json"))
@@ -873,7 +873,7 @@ var _ = Describe("Gallery Backends", func() {

 			Expect(filepath.Join(tempDir, "test-backend", "metadata.json")).ToNot(BeARegularFile())

-			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil)
+			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil, false)
 			Expect(err).ToNot(HaveOccurred())
 			Expect(filepath.Join(tempDir, "test-backend", "metadata.json")).To(BeARegularFile())
 		})
@@ -894,7 +894,7 @@ var _ = Describe("Gallery Backends", func() {
 				system.WithBackendPath(tempDir),
 			)
 			Expect(err).NotTo(HaveOccurred())
-			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil)
+			err = InstallBackend(context.TODO(), systemState, ml, &backend, nil, false)
 			Expect(err).ToNot(HaveOccurred())
 			Expect(filepath.Join(tempDir, "test-backend", "metadata.json")).To(BeARegularFile())

--- a/core/gallery/backends_version_test.go
+++ b/core/gallery/backends_version_test.go
@@ -47,7 +47,7 @@ var _ = Describe("Backend versioning", func() {
 		backend.URI = srcDir
 		backend.Version = "1.2.3"

-		err = gallery.InstallBackend(context.Background(), systemState, modelLoader, backend, nil)
+		err = gallery.InstallBackend(context.Background(), systemState, modelLoader, backend, nil, false)
 		Expect(err).NotTo(HaveOccurred())

 		// Read the metadata file and check version
@@ -74,7 +74,7 @@ var _ = Describe("Backend versioning", func() {
 		backend.URI = srcDir
 		backend.Version = "2.0.0"

-		err = gallery.InstallBackend(context.Background(), systemState, modelLoader, backend, nil)
+		err = gallery.InstallBackend(context.Background(), systemState, modelLoader, backend, nil, false)
 		Expect(err).NotTo(HaveOccurred())

 		metadataPath := filepath.Join(tempDir, "test-backend-uri", "metadata.json")
@@ -100,7 +100,7 @@ var _ = Describe("Backend versioning", func() {
 		backend.URI = srcDir
 		// Version intentionally left empty

-		err = gallery.InstallBackend(context.Background(), systemState, modelLoader, backend, nil)
+		err = gallery.InstallBackend(context.Background(), systemState, modelLoader, backend, nil, false)
 		Expect(err).NotTo(HaveOccurred())

 		metadataPath := filepath.Join(tempDir, "test-backend-noversion", "metadata.json")
--- a/core/gallery/importers/llama-cpp.go
+++ b/core/gallery/importers/llama-cpp.go
@@ -1,10 +1,13 @@
 package importers

 import (
+	"context"
 	"encoding/json"
 	"path/filepath"
 	"strings"
+	"time"

+	gguf "github.com/gpustack/gguf-parser-go"
 	"github.com/mudler/LocalAI/core/config"
 	"github.com/mudler/LocalAI/core/gallery"
 	"github.com/mudler/LocalAI/core/schema"
@@ -261,6 +264,13 @@ func (i *LlamaCPPImporter) Import(details Details) (gallery.ModelConfig, error)
 	// Apply per-model-family inference parameter defaults
 	config.ApplyInferenceDefaults(&modelConfig, details.URI)

+	// Auto-detect Multi-Token Prediction heads (ggml-org/llama.cpp#22673) and
+	// enable speculative decoding. Mirrors the load-time hook so freshly
+	// imported configs already carry spec_type:draft-mtp before the model is
+	// ever loaded - users see it in the YAML preview rather than discovering
+	// it after the first start.
+	maybeApplyMTPDefaults(&modelConfig, details, &cfg)
+
 	data, err := yaml.Marshal(modelConfig)
 	if err != nil {
 		return gallery.ModelConfig{}, err
@@ -291,6 +301,85 @@ func pickPreferredGroup(groups []hfapi.ShardGroup, prefs []string) *hfapi.ShardG
 	return &groups[len(groups)-1]
 }

+// maybeApplyMTPDefaults parses the picked GGUF header (range-fetched over
+// HTTP for HF/URL imports) and, if the file declares a Multi-Token Prediction
+// head, appends the auto-MTP option keys to modelConfig.Options. Failures
+// during the probe are non-fatal: the importer keeps the config without MTP
+// so an unrelated network blip or weird header doesn't break the import.
+//
+// OCI/Ollama URIs are skipped because the artifact isn't directly fetchable
+// as a GGUF byte stream - the load-time hook (core/config/gguf.go) covers
+// those once the model is materialised on disk.
+func maybeApplyMTPDefaults(modelConfig *config.ModelConfig, details Details, cfg *gallery.ModelConfig) {
+	probeURL := pickMTPProbeURL(details, cfg)
+	if probeURL == "" {
+		return
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	defer func() {
+		if r := recover(); r != nil {
+			xlog.Debug("[mtp-importer] panic while probing GGUF header", "uri", probeURL, "recover", r)
+		}
+	}()
+
+	f, err := gguf.ParseGGUFFileRemote(ctx, probeURL)
+	if err != nil {
+		xlog.Debug("[mtp-importer] failed to read remote GGUF header for MTP detection", "uri", probeURL, "error", err)
+		return
+	}
+
+	n, ok := config.HasEmbeddedMTPHead(f)
+	if !ok {
+		return
+	}
+	config.ApplyMTPDefaults(modelConfig, n)
+}
+
+// pickMTPProbeURL returns an HTTP(S) URL pointing at the main (non-mmproj)
+// GGUF shard that should be inspected for an MTP head, or "" when no
+// suitable URL is available. Custom URI schemes (`huggingface://`, etc.)
+// are run through `downloader.URI.ResolveURL` so the resulting URL is
+// something `gguf.ParseGGUFFileRemote` can actually open. URIs that
+// `LooksLikeOCI` matches (OCI/Ollama) are skipped because the artifact is
+// not directly streamable as a GGUF byte range.
+func pickMTPProbeURL(details Details, cfg *gallery.ModelConfig) string {
+	uri := downloader.URI(details.URI)
+
+	if uri.LooksLikeOCI() {
+		return ""
+	}
+
+	if strings.HasSuffix(strings.ToLower(details.URI), ".gguf") {
+		return resolveHTTPProbe(details.URI)
+	}
+
+	for _, f := range cfg.Files {
+		lower := strings.ToLower(f.Filename)
+		if strings.Contains(lower, "mmproj") {
+			continue
+		}
+		if !strings.HasSuffix(lower, ".gguf") {
+			continue
+		}
+		return resolveHTTPProbe(f.URI)
+	}
+	return ""
+}
+
+// resolveHTTPProbe resolves an importer-side URI to the HTTP(S) URL that
+// `gguf.ParseGGUFFileRemote` can range-fetch. Returns "" if the URI can't
+// be reduced to an HTTP(S) endpoint (e.g. local path, unsupported scheme).
+func resolveHTTPProbe(uri string) string {
+	resolved := downloader.URI(uri).ResolveURL()
+	if downloader.URI(resolved).LooksLikeHTTPURL() {
+		return resolved
+	}
+	return ""
+}
+
 // appendShardGroup copies every shard of group into cfg.Files under dest,
 // skipping any entry whose target filename is already present so repeated
 // calls (e.g. the rare case of mmproj + model picking the same group)
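
Reviewer note on `pickMTPProbeURL`: the file-selection rule can be exercised in isolation. The sketch below assumes a hypothetical `file` type in place of the importer's file entries and a hypothetical helper `pickMainGGUF`. It mirrors only the loop's filter, which takes the first entry that ends in `.gguf` and is not an mmproj projector, and leaves out URI resolution entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// file is a hypothetical stand-in for the importer's file entries.
type file struct {
	Filename string
	URI      string
}

// pickMainGGUF returns the URI of the first file whose name ends in .gguf
// (case-insensitive) and does not contain "mmproj", or "" if none qualifies.
func pickMainGGUF(files []file) string {
	for _, f := range files {
		lower := strings.ToLower(f.Filename)
		if strings.Contains(lower, "mmproj") || !strings.HasSuffix(lower, ".gguf") {
			continue
		}
		return f.URI
	}
	return ""
}

func main() {
	files := []file{
		{Filename: "mmproj-model-f16.gguf", URI: "https://example.invalid/mmproj.gguf"},
		{Filename: "README.md", URI: "https://example.invalid/README.md"},
		{Filename: "Model-Q4_K_M.GGUF", URI: "https://example.invalid/model.gguf"},
	}
	fmt.Println(pickMainGGUF(files))
}
```

Because the match is case-insensitive and order-dependent, a multi-shard model resolves to whichever qualifying shard is listed first.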
--- a/core/gallery/models.go
+++ b/core/gallery/models.go
@@ -77,7 +77,7 @@ func InstallModelFromGallery(
 	modelGalleries, backendGalleries []lconfig.Gallery,
 	systemState *system.SystemState,
 	modelLoader *model.ModelLoader,
-	name string, req GalleryModel, downloadStatus func(string, string, string, float64), enforceScan, automaticallyInstallBackend bool) error {
+	name string, req GalleryModel, downloadStatus func(string, string, string, float64), enforceScan, automaticallyInstallBackend, requireBackendIntegrity bool) error {

 	applyModel := func(model *GalleryModel) error {
 		name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
@@ -137,7 +137,7 @@ func InstallModelFromGallery(
 		if automaticallyInstallBackend && installedModel.Backend != "" {
 			xlog.Debug("Installing backend", "backend", installedModel.Backend)

-			if err := InstallBackendFromGallery(ctx, backendGalleries, systemState, modelLoader, installedModel.Backend, downloadStatus, false); err != nil {
+			if err := InstallBackendFromGallery(ctx, backendGalleries, systemState, modelLoader, installedModel.Backend, downloadStatus, false, requireBackendIntegrity); err != nil {
 				return err
 			}
 		}
--- a/core/gallery/models_test.go
+++ b/core/gallery/models_test.go
@@ -89,7 +89,7 @@ var _ = Describe("Model test", func() {
 			Expect(models[0].URL).To(Equal(bertEmbeddingsURL))
 			Expect(models[0].Installed).To(BeFalse())

-			err = InstallModelFromGallery(context.TODO(), galleries, []config.Gallery{}, systemState, nil, "test@bert", GalleryModel{}, func(s1, s2, s3 string, f float64) {}, true, true)
+			err = InstallModelFromGallery(context.TODO(), galleries, []config.Gallery{}, systemState, nil, "test@bert", GalleryModel{}, func(s1, s2, s3 string, f float64) {}, true, true, false)
 			Expect(err).ToNot(HaveOccurred())

 			dat, err := os.ReadFile(filepath.Join(tempdir, "bert.yaml"))
--- a/core/gallery/upgrade.go
+++ b/core/gallery/upgrade.go
@@ -232,7 +232,7 @@ func summarizeNodeDrift(nodes []NodeBackendRef) (majority struct{ version, diges

 // UpgradeBackend upgrades a single backend to the latest gallery version using
 // an atomic swap with backup-based rollback on failure.
-func UpgradeBackend(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, galleries []config.Gallery, backendName string, downloadStatus func(string, string, string, float64)) error {
+func UpgradeBackend(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, galleries []config.Gallery, backendName string, downloadStatus func(string, string, string, float64), requireIntegrity bool) error {
 	// Look up the installed backend
 	installedBackends, err := ListSystemBackends(systemState)
 	if err != nil {
@@ -251,7 +251,7 @@ func UpgradeBackend(ctx context.Context, systemState *system.SystemState, modelL
 	// If this is a meta backend, recursively upgrade the concrete backend it points to
 	if installed.Metadata != nil && installed.Metadata.MetaBackendFor != "" {
 		xlog.Info("Meta backend detected, upgrading concrete backend", "meta", backendName, "concrete", installed.Metadata.MetaBackendFor)
-		return UpgradeBackend(ctx, systemState, modelLoader, galleries, installed.Metadata.MetaBackendFor, downloadStatus)
+		return UpgradeBackend(ctx, systemState, modelLoader, galleries, installed.Metadata.MetaBackendFor, downloadStatus, requireIntegrity)
 	}

 	// Find the gallery entry
@@ -265,6 +265,16 @@ func UpgradeBackend(ctx context.Context, systemState *system.SystemState, modelL
 		return fmt.Errorf("no gallery entry found for backend %q", backendName)
 	}

+	// Resolve integrity options (cosign verifier for OCI URIs, strict-mode
+	// gate for missing SHA256/policy) BEFORE writing anything to disk.
+	// Without this, the upgrade path would atomically swap in an
+	// unverified backend even when the gallery has a verification policy
+	// — see backendDownloadOptions in backends.go.
+	downloadOpts, err := backendDownloadOptions(galleryEntry, requireIntegrity)
+	if err != nil {
+		return fmt.Errorf("upgrade %q: %w", backendName, err)
+	}
+
 	backendPath := filepath.Join(systemState.Backend.BackendsPath, backendName)
 	tmpPath := backendPath + ".upgrade-tmp"
 	backupPath := backendPath + ".backup"
@@ -285,7 +295,7 @@ func UpgradeBackend(ctx context.Context, systemState *system.SystemState, modelL
 			return fmt.Errorf("failed to copy backend from directory: %w", err)
 		}
 	} else {
-		if err := uri.DownloadFileWithContext(ctx, tmpPath, "", 1, 1, downloadStatus); err != nil {
+		if err := uri.DownloadFileWithContext(ctx, tmpPath, galleryEntry.SHA256, 1, 1, downloadStatus, downloadOpts...); err != nil {
 			os.RemoveAll(tmpPath)
 			return fmt.Errorf("failed to download backend: %w", err)
 		}
--- a/core/gallery/upgrade_test.go
+++ b/core/gallery/upgrade_test.go
@@ -383,7 +383,7 @@ var _ = Describe("Upgrade Detection and Execution", func() {
 			})

 			ml := model.NewModelLoader(systemState)
-			err := UpgradeBackend(context.Background(), systemState, ml, galleries, "my-backend", nil)
+			err := UpgradeBackend(context.Background(), systemState, ml, galleries, "my-backend", nil, false)
 			Expect(err).NotTo(HaveOccurred())

 			// Verify run.sh was updated
@@ -417,7 +417,7 @@ var _ = Describe("Upgrade Detection and Execution", func() {
 			})

 			ml := model.NewModelLoader(systemState)
-			err := UpgradeBackend(context.Background(), systemState, ml, galleries, "my-backend", nil)
+			err := UpgradeBackend(context.Background(), systemState, ml, galleries, "my-backend", nil, false)
 			Expect(err).To(HaveOccurred())

 			// Verify v1 is still intact
@@ -432,5 +432,41 @@ var _ = Describe("Upgrade Detection and Execution", func() {
 			Expect(json.Unmarshal(metaData, &meta)).To(Succeed())
 			Expect(meta.Version).To(Equal("1.0.0"))
 		})
+
+		// Regression: an earlier version of UpgradeBackend wrote the
+		// downloaded bytes to disk without going through
+		// backendDownloadOptions, so the gallery's verification policy
+		// (and strict-integrity gate) didn't apply on upgrade. This test
+		// pins the upgrade path to the same integrity gate as installs:
+		// strict mode + an OCI URI without a verification: block must
+		// hard-fail *before* anything is downloaded or swapped in.
+		It("should refuse to upgrade an OCI backend that bypasses integrity in strict mode", func() {
+			installBackendWithVersion("my-backend", "1.0.0", "#!/bin/sh\necho v1")
+
+			// OCI URI, no Gallery.Verification → backendDownloadOptions
+			// returns a strict-integrity error before any network call.
+			writeGalleryYAML([]GalleryBackend{
+				{
+					Metadata: Metadata{
+						Name: "my-backend",
+					},
+					URI:     "oci://example.invalid/missing:never-fetched",
+					Version: "2.0.0",
+				},
+			})
+
+			ml := model.NewModelLoader(systemState)
+			err := UpgradeBackend(context.Background(), systemState, ml, galleries, "my-backend", nil, true)
+			Expect(err).To(HaveOccurred())
+			Expect(err.Error()).To(ContainSubstring("strict integrity"))
+
+			// The installed v1 must be untouched — the upgrade should
+			// have aborted before writing anything.
+			content, err := os.ReadFile(filepath.Join(backendsPath, "my-backend", "run.sh"))
+			Expect(err).NotTo(HaveOccurred())
+			Expect(string(content)).To(Equal("#!/bin/sh\necho v1"))
+			Expect(filepath.Join(backendsPath, "my-backend.upgrade-tmp")).NotTo(BeAnExistingFile())
+			Expect(filepath.Join(backendsPath, "my-backend.backup")).NotTo(BeAnExistingFile())
+		})
 	})
 })
--- a/core/http/endpoints/localai/video.go
+++ b/core/http/endpoints/localai/video.go
@@ -22,12 +22,19 @@ import (
 	"github.com/mudler/LocalAI/core/backend"

 	model "github.com/mudler/LocalAI/pkg/model"
+	"github.com/mudler/LocalAI/pkg/utils"
 	"github.com/mudler/xlog"
 )

+var videoDownloadClient = http.Client{Timeout: 30 * time.Second}
+
 func downloadFile(url string) (string, error) {
+	if err := utils.ValidateExternalURL(url); err != nil {
+		return "", fmt.Errorf("URL validation failed: %w", err)
+	}
+
 	// Get the data
-	resp, err := http.Get(url)
+	resp, err := videoDownloadClient.Get(url)
 	if err != nil {
 		return "", err
 	}
--- a/core/http/endpoints/openai/chat.go
+++ b/core/http/endpoints/openai/chat.go
@@ -131,13 +131,19 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 				delta.Reasoning = &reasoningDelta
 			}

+			// Usage rides as a struct field so the consumer can track the
+			// running cumulative total. It is stripped before JSON marshal so
+			// the wire chunk stays spec-compliant (no `usage` on intermediate
+			// chunks). The dedicated trailer chunk (when include_usage=true)
+			// carries the final totals.
+			usageForChunk := usage
 			resp := schema.OpenAIResponse{
 				ID:      id,
 				Created: created,
 				Model:   req.Model, // we have to return what the user sent here, due to OpenAI spec.
 				Choices: []schema.Choice{{Delta: delta, Index: 0, FinishReason: nil}},
 				Object:  "chat.completion.chunk",
-				Usage:   usage,
+				Usage:   &usageForChunk,
 			}

 			responses <- resp
@@ -164,7 +170,7 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 		hasChatDeltaToolCalls := false
 		hasChatDeltaContent := false

-		_, tokenUsage, chatDeltas, err := ComputeChoices(req, prompt, config, cl, startupOptions, loader, func(s string, c *[]schema.Choice) {}, func(s string, usage backend.TokenUsage) bool {
+		_, _, chatDeltas, err := ComputeChoices(req, prompt, config, cl, startupOptions, loader, func(s string, c *[]schema.Choice) {}, func(s string, usage backend.TokenUsage) bool {
 			result += s

 			// Track whether ChatDeltas from the C++ autoparser contain
@@ -387,16 +393,11 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator

 		switch {
 		case noActionToRun:
-			usage := schema.OpenAIUsage{
-				PromptTokens:     tokenUsage.Prompt,
-				CompletionTokens: tokenUsage.Completion,
-				TotalTokens:      tokenUsage.Prompt + tokenUsage.Completion,
-			}
-			if extraUsage {
-				usage.TimingTokenGeneration = tokenUsage.TimingTokenGeneration
-				usage.TimingPromptProcessing = tokenUsage.TimingPromptProcessing
-			}
-
+			// Token-cumulative usage is communicated to the streaming
+			// consumer via the per-token callback's chunk struct (stripped
+			// before wire marshal). The final usage trailer — when the
+			// caller opted in with stream_options.include_usage — is built
+			// by the outer streaming loop, not here.
 			var result string
 			if !sentInitialRole {
 				var hqErr error
@@ -409,7 +410,7 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 			for _, chunk := range buildNoActionFinalChunks(
 				id, req.Model, created,
 				sentInitialRole, sentReasoning,
-				result, reasoning, usage,
+				result, reasoning,
 			) {
 				responses <- chunk
 			}
@@ -724,7 +725,13 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 							xlog.Debug("No choices in the response, skipping")
 							continue
 						}
-						usage = &ev.Usage // Copy a pointer to the latest usage chunk so that the stop message can reference it
+						// Capture the running cumulative usage from this chunk
+						// (when present) so the include_usage trailer can carry
+						// the final totals. Usage is stripped before marshal
+						// below so the wire chunk stays spec-compliant.
+						if ev.Usage != nil {
+							usage = ev.Usage
+						}
 						if len(ev.Choices[0].Delta.ToolCalls) > 0 {
 							toolsCalled = true
 							// Collect and merge tool call deltas for MCP execution
@@ -740,6 +747,11 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 								collectedContent += *sp
 							}
 						}
+						// OpenAI streaming spec: intermediate chunks must NOT
+						// carry a `usage` field. Strip the tracking copy
+						// before marshalling — usage is delivered via the
+						// dedicated trailer chunk when include_usage=true.
+						ev.Usage = nil
 						respData, err := json.Marshal(ev)
 						if err != nil {
 							xlog.Debug("Failed to marshal response", "error", err)
@@ -888,6 +900,9 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 					finishReason = FinishReasonFunctionCall
 				}

+				// Final delta chunk: empty delta with finish_reason set. Per
+				// OpenAI streaming spec this chunk does NOT carry usage —
+				// the optional trailer (below) does, gated on include_usage.
 				resp := &schema.OpenAIResponse{
 					ID:      id,
 					Created: created,
@@ -899,11 +914,18 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 							Delta:        &schema.Message{},
 						}},
 					Object: "chat.completion.chunk",
-					Usage:  *usage,
 				}
 				respData, _ := json.Marshal(resp)
-
 				fmt.Fprintf(c.Response().Writer, "data: %s\n\n", respData)
+
+				// Trailing usage chunk per OpenAI spec: emit only when the
+				// caller opted in via stream_options.include_usage. Shape:
+				// {"choices":[],"usage":{...},"object":"chat.completion.chunk",...}
+				if input.StreamOptions != nil && input.StreamOptions.IncludeUsage && usage != nil {
+					trailer := streamUsageTrailerJSON(id, input.Model, created, *usage)
+					_, _ = fmt.Fprintf(c.Response().Writer, "data: %s\n\n", trailer)
+				}
+
 				fmt.Fprintf(c.Response().Writer, "data: [DONE]\n\n")
 				c.Response().Flush()
 				xlog.Debug("Stream ended")
@@ -1263,7 +1285,7 @@ func ChatEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 					Model:   input.Model, // we have to return what the user sent here, due to OpenAI spec.
 					Choices: result,
 					Object:  "chat.completion",
-					Usage:   usage,
+					Usage:   &usage,
 				}
 				respData, _ := json.Marshal(resp)
 				xlog.Debug("Response", "response", string(respData))
--- a/core/http/endpoints/openai/chat_emit.go
+++ b/core/http/endpoints/openai/chat_emit.go
@@ -1,12 +1,45 @@
 package openai

 import (
+	"encoding/json"
 	"fmt"

 	"github.com/mudler/LocalAI/core/schema"
 	"github.com/mudler/LocalAI/pkg/functions"
 )

+// streamUsageTrailerJSON returns the bytes of the OpenAI-spec trailing usage
+// chunk emitted in streaming completions when the request opts in via
+// `stream_options.include_usage: true`. The shape is:
+//
+//	{"id":"...","object":"chat.completion.chunk","created":N,
+//	 "model":"...","choices":[],"usage":{...}}
+//
+// `choices` is intentionally an empty array (not absent, not null) — that is
+// what the OpenAI spec mandates, and what consumers like the official OpenAI
+// SDK and Continue's openai-adapter look for to recognise this as the usage
+// chunk rather than a content chunk. schema.OpenAIResponse has `omitempty`
+// on Choices, so we cannot reuse it for the trailer.
+func streamUsageTrailerJSON(id, model string, created int, usage schema.OpenAIUsage) []byte {
+	trailer := struct {
+		ID      string             `json:"id"`
+		Created int                `json:"created"`
+		Model   string             `json:"model"`
+		Object  string             `json:"object"`
+		Choices []schema.Choice    `json:"choices"`
+		Usage   schema.OpenAIUsage `json:"usage"`
+	}{
+		ID:      id,
+		Created: created,
+		Model:   model,
+		Object:  "chat.completion.chunk",
+		Choices: []schema.Choice{},
+		Usage:   usage,
+	}
+	b, _ := json.Marshal(trailer)
+	return b
+}
+
 // hasRealCall reports whether functionResults contains at least one
 // entry whose Name is something other than the noAction sentinel.
 // Used by processTools to decide between the "answer the question"
@@ -25,10 +58,10 @@ func hasRealCall(functionResults []functions.FuncCallResults, noAction string) b
 // pseudo-function or emitted no tool calls at all).
 //
 // When content was already streamed (contentAlreadyStreamed=true) the
-// helper emits a single trailing usage chunk, optionally carrying
-// reasoning that was produced but not streamed incrementally. When
-// content was not streamed it emits a role chunk followed by a
-// content+reasoning+usage chunk — the "send everything at once" fallback.
+// helper emits a trailing reasoning chunk if any non-streamed reasoning
+// remains, else nothing. When content was not streamed it emits a role
+// chunk followed by a content (+reasoning) chunk — the "send everything
+// at once" fallback.
 //
 // Reasoning re-emission is guarded by reasoningAlreadyStreamed, not by
 // probing the extractor's Go-side state: the C++ autoparser delivers
@@ -36,6 +69,10 @@ func hasRealCall(functionResults []functions.FuncCallResults, noAction string) b
 // separate accumulator that extractor.Reasoning() does not expose.
 // Without this guard the callback would stream reasoning incrementally
 // and the final chunk would duplicate it.
+//
+// The returned chunks intentionally do NOT carry a `usage` field. The
+// usage trailer is emitted separately by the streaming handler when
+// `stream_options.include_usage` is true, per OpenAI spec.
 func buildNoActionFinalChunks(
 	id, model string,
 	created int,
@@ -43,26 +80,26 @@ func buildNoActionFinalChunks(
 	reasoningAlreadyStreamed bool,
 	content string,
 	reasoning string,
-	usage schema.OpenAIUsage,
 ) []schema.OpenAIResponse {
 	var out []schema.OpenAIResponse

 	if contentAlreadyStreamed {
-		delta := &schema.Message{}
-		if reasoning != "" && !reasoningAlreadyStreamed {
-			r := reasoning
-			delta.Reasoning = &r
+		if reasoning == "" || reasoningAlreadyStreamed {
+			return nil
 		}
+		r := reasoning
 		out = append(out, schema.OpenAIResponse{
 			ID: id, Created: created, Model: model,
-			Choices: []schema.Choice{{Delta: delta, Index: 0}},
-			Object:  "chat.completion.chunk",
-			Usage:   usage,
+			Choices: []schema.Choice{{
+				Delta: &schema.Message{Reasoning: &r},
+				Index: 0,
+			}},
+			Object: "chat.completion.chunk",
 		})
 		return out
 	}

-	// Content was not streamed — send role, then content (+reasoning) + usage.
+	// Content was not streamed — send role, then content (+reasoning).
 	out = append(out, schema.OpenAIResponse{
 		ID: id, Created: created, Model: model,
 		Choices: []schema.Choice{{
@@ -82,7 +119,6 @@ func buildNoActionFinalChunks(
 		ID: id, Created: created, Model: model,
 		Choices: []schema.Choice{{Delta: delta, Index: 0}},
 		Object:  "chat.completion.chunk",
-		Usage:   usage,
 	})
 	return out
 }
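
Reviewer note on `streamUsageTrailerJSON`: the doc comment's point about `omitempty` can be checked with a small standalone program. The types below are hypothetical stand-ins, and the JSON tags on `usage` and on `withOmitEmpty` are assumptions rather than the real `schema` definitions. Only the `encoding/json` behaviour is being demonstrated: an empty slice tagged `omitempty` disappears from the output, a nil pointer tagged `omitempty` disappears too, and a dedicated struct without the tag emits `"choices":[]`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// choice and usage are hypothetical stand-ins; only the tags matter here.
type choice struct {
	Index int `json:"index"`
}

type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// withOmitEmpty mimics a response type whose fields are tagged omitempty.
type withOmitEmpty struct {
	Object  string   `json:"object"`
	Choices []choice `json:"choices,omitempty"`
	Usage   *usage   `json:"usage,omitempty"`
}

// trailer mimics a dedicated trailer struct: no omitempty on Choices.
type trailer struct {
	Object  string   `json:"object"`
	Choices []choice `json:"choices"`
	Usage   usage    `json:"usage"`
}

// marshalString returns the JSON encoding of v, or "" on error.
func marshalString(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	u := usage{PromptTokens: 5, CompletionTokens: 7, TotalTokens: 12}
	// Empty Choices is dropped, so this cannot serve as the usage trailer.
	fmt.Println(marshalString(withOmitEmpty{Object: "chat.completion.chunk", Choices: []choice{}, Usage: &u}))
	// A nil Usage pointer is dropped, which keeps intermediate chunks usage-free.
	fmt.Println(marshalString(withOmitEmpty{Object: "chat.completion.chunk", Choices: []choice{{Index: 0}}}))
	// The dedicated struct keeps the empty array.
	fmt.Println(marshalString(trailer{Object: "chat.completion.chunk", Choices: []choice{}, Usage: u}))
}
```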
--- a/core/http/endpoints/openai/chat_emit_test.go
+++ b/core/http/endpoints/openai/chat_emit_test.go
@@ -609,54 +609,52 @@ var _ = Describe("buildNoActionFinalChunks", func() {
 		testModel   = "test-model"
 		testCreated = 1700000000
 	)
-	usage := schema.OpenAIUsage{PromptTokens: 5, CompletionTokens: 7, TotalTokens: 12}

-	Describe("Content streamed — trailing usage chunk", func() {
-		It("emits just one chunk with usage, no content, no reasoning when reasoning was streamed", func() {
+	Describe("Content streamed — trailing reasoning only", func() {
+		It("emits nothing when content and reasoning were already streamed", func() {
+			// Before the streaming-usage-spec fix this branch emitted a
+			// content-less chunk solely to carry `usage`. Per the OpenAI
+			// spec usage no longer rides on delta chunks; the dedicated
+			// trailer (when include_usage=true) carries it instead — so
+			// with nothing to deliver the helper returns no chunks.
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				true, true,
-				"", "already-streamed-reasoning", usage,
+				"", "already-streamed-reasoning",
 			)
-
-			Expect(chunks).To(HaveLen(1))
-			Expect(chunks[0].Usage.TotalTokens).To(Equal(12))
-			Expect(contentOf(chunks[0])).To(BeEmpty())
-			Expect(reasoningOf(chunks[0])).To(BeEmpty(),
-				"reasoning must not be re-emitted once it was streamed via the callback")
+			Expect(chunks).To(BeEmpty())
 		})

 		It("emits a trailing reasoning delivery when reasoning came only at end", func() {
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				true, false,
-				"", "autoparser final reasoning", usage,
+				"", "autoparser final reasoning",
 			)

 			Expect(chunks).To(HaveLen(1))
 			Expect(reasoningOf(chunks[0])).To(Equal("autoparser final reasoning"))
 			Expect(contentOf(chunks[0])).To(BeEmpty())
-			Expect(chunks[0].Usage.TotalTokens).To(Equal(12))
+			Expect(chunks[0].Usage).To(BeNil(),
+				"intermediate chunks must not carry usage per OpenAI spec")
 		})

-		It("omits reasoning when it's empty regardless of streamed flag", func() {
+		It("returns no chunks when reasoning is empty and content was streamed", func() {
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				true, false,
-				"", "", usage,
+				"", "",
 			)
-
-			Expect(chunks).To(HaveLen(1))
-			Expect(reasoningOf(chunks[0])).To(BeEmpty())
+			Expect(chunks).To(BeEmpty())
 		})
 	})

-	Describe("Content not streamed — role, then content+usage", func() {
+	Describe("Content not streamed — role, then content", func() {
 		It("emits role chunk then content chunk without reasoning when reasoning was streamed", func() {
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				false, true,
-				"the answer", "already-streamed-reasoning", usage,
+				"the answer", "already-streamed-reasoning",
 			)

 			Expect(chunks).To(HaveLen(2))
@@ -666,14 +664,14 @@ var _ = Describe("buildNoActionFinalChunks", func() {
 			Expect(contentOf(chunks[1])).To(Equal("the answer"))
 			Expect(reasoningOf(chunks[1])).To(BeEmpty(),
 				"reasoning must not be re-emitted if it was streamed earlier")
-			Expect(chunks[1].Usage.TotalTokens).To(Equal(12))
+			Expect(chunks[1].Usage).To(BeNil())
 		})

 		It("emits role, then content+reasoning when reasoning was not streamed", func() {
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				false, false,
-				"the answer", "autoparser final reasoning", usage,
+				"the answer", "autoparser final reasoning",
 			)

 			Expect(chunks).To(HaveLen(2))
@@ -681,14 +679,14 @@ var _ = Describe("buildNoActionFinalChunks", func() {

 			Expect(contentOf(chunks[1])).To(Equal("the answer"))
 			Expect(reasoningOf(chunks[1])).To(Equal("autoparser final reasoning"))
-			Expect(chunks[1].Usage.TotalTokens).To(Equal(12))
+			Expect(chunks[1].Usage).To(BeNil())
 		})

 		It("still emits content even when reasoning is empty", func() {
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				false, false,
-				"just an answer", "", usage,
+				"just an answer", "",
 			)

 			Expect(chunks).To(HaveLen(2))
@@ -702,7 +700,7 @@ var _ = Describe("buildNoActionFinalChunks", func() {
 			chunks := buildNoActionFinalChunks(
 				testID, testModel, testCreated,
 				false, false,
-				"hi", "reasoning", usage,
+				"hi", "reasoning",
 			)
 			for i, ch := range chunks {
 				Expect(ch.ID).To(Equal(testID), "chunk[%d] ID", i)
--- a/core/http/endpoints/openai/chat_stream_usage_test.go
+++ b/core/http/endpoints/openai/chat_stream_usage_test.go
@@ -0,0 +1,179 @@
+package openai
+
+import (
+	"encoding/json"
+
+	"github.com/mudler/LocalAI/core/schema"
+	"github.com/mudler/LocalAI/pkg/functions"
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+// These tests pin LocalAI's streaming chunks to the OpenAI spec for the
+// `usage` field. The regression that motivated them (issue #8546) was that
+// LocalAI emitted `"usage":{...zeros...}` on every chunk, which made the
+// official OpenAI Node SDK consumers (Continue, Kilo Code, Roo Code, Zed,
+// IntelliJ Continue) drop every content chunk via the filter at
+// continuedev/continue packages/openai-adapters/src/apis/OpenAI.ts:275-288.
+//
+// Per OpenAI's chat-completion streaming contract:
+//   - intermediate chunks MUST NOT carry a `usage` field
+//   - usage is only delivered when the request opts in via
+//     `stream_options.include_usage: true`, on a final extra chunk whose
+//     `choices` is an empty array.
+
+var _ = Describe("streaming usage spec compliance", func() {
+	Describe("OpenAIResponse JSON shape", func() {
+		It("does not emit a 'usage' key when Usage is unset", func() {
+			// A typical intermediate token chunk: no Usage populated.
+			content := "hello"
+			resp := schema.OpenAIResponse{
+				ID:      "req-1",
+				Created: 1,
+				Model:   "m",
+				Object:  "chat.completion.chunk",
+				Choices: []schema.Choice{{
+					Index: 0,
+					Delta: &schema.Message{Content: &content},
+				}},
+			}
+			data, err := json.Marshal(resp)
+			Expect(err).ToNot(HaveOccurred())
+
+			var raw map[string]any
+			Expect(json.Unmarshal(data, &raw)).To(Succeed())
+			_, present := raw["usage"]
+			Expect(present).To(BeFalse(),
+				"intermediate chunk must not include a 'usage' key; got: %s", string(data))
+		})
+
+		It("emits the usage object when Usage is explicitly set", func() {
+			usage := &schema.OpenAIUsage{PromptTokens: 11, CompletionTokens: 22, TotalTokens: 33}
+			resp := schema.OpenAIResponse{
+				ID:      "req-1",
+				Created: 1,
+				Model:   "m",
+				Object:  "chat.completion.chunk",
+				Usage:   usage,
+			}
+			data, err := json.Marshal(resp)
+			Expect(err).ToNot(HaveOccurred())
+
+			var raw map[string]any
+			Expect(json.Unmarshal(data, &raw)).To(Succeed())
+			u, ok := raw["usage"].(map[string]any)
+			Expect(ok).To(BeTrue(), "expected 'usage' object, got: %s", string(data))
+			Expect(u["prompt_tokens"]).To(BeNumerically("==", 11))
+			Expect(u["completion_tokens"]).To(BeNumerically("==", 22))
+			Expect(u["total_tokens"]).To(BeNumerically("==", 33))
+		})
+	})
+
+	Describe("buildNoActionFinalChunks", func() {
+		It("returns chunks with no Usage embedded", func() {
+			// Whatever the caller is doing, helpers must not bake usage
+			// into intermediate or final delta chunks. The usage trailer
+			// (when requested via include_usage) is emitted separately.
+			chunks := buildNoActionFinalChunks(
+				"req-1", "m", 1,
+				false, false,
+				"hi", "",
+			)
+			Expect(chunks).ToNot(BeEmpty())
+			for i, ch := range chunks {
+				Expect(ch.Usage).To(BeNil(),
+					"chunk[%d] must not carry Usage; got %+v", i, ch.Usage)
+			}
+		})
+
+		It("returns chunks with no Usage when only trailing reasoning needs delivery", func() {
+			chunks := buildNoActionFinalChunks(
+				"req-1", "m", 1,
+				true, false,
+				"", "autoparser late reasoning",
+			)
+			Expect(chunks).ToNot(BeEmpty())
+			for i, ch := range chunks {
+				Expect(ch.Usage).To(BeNil(),
+					"chunk[%d] must not carry Usage; got %+v", i, ch.Usage)
+			}
+		})
+	})
+
+	Describe("buildDeferredToolCallChunks", func() {
+		It("returns chunks with no Usage embedded", func() {
+			calls := []functions.FuncCallResults{{
+				Name: "do_thing", Arguments: `{"x":1}`,
+			}}
+			chunks := buildDeferredToolCallChunks(
+				"req-1", "m", 1, calls, 0,
+				false, "", false, "",
+			)
+			Expect(chunks).ToNot(BeEmpty())
+			for i, ch := range chunks {
+				Expect(ch.Usage).To(BeNil(),
+					"chunk[%d] must not carry Usage; got %+v", i, ch.Usage)
+			}
+		})
+	})
+
+	Describe("streamUsageTrailerJSON", func() {
+		It("produces JSON matching the OpenAI spec for the trailer chunk", func() {
+			// Trailing usage chunk shape (OpenAI streaming spec):
+			//   {"id":"...","object":"chat.completion.chunk","created":...,
+			//    "model":"...","choices":[],"usage":{...}}
+			usage := schema.OpenAIUsage{
+				PromptTokens: 18, CompletionTokens: 14, TotalTokens: 32,
+			}
+			data := streamUsageTrailerJSON("req-1", "m", 1, usage)
+
+			var raw map[string]any
+			Expect(json.Unmarshal(data, &raw)).To(Succeed(),
+				"trailer must be valid JSON, got: %s", string(data))
+
+			Expect(raw["id"]).To(Equal("req-1"))
+			Expect(raw["model"]).To(Equal("m"))
+			Expect(raw["object"]).To(Equal("chat.completion.chunk"))
+			Expect(raw["created"]).To(BeNumerically("==", 1))
+
+			// `choices` MUST be present as an empty array (not absent, not null).
+			rawChoices, present := raw["choices"]
+			Expect(present).To(BeTrue(), "choices key must be present, got: %s", string(data))
+			choicesArr, ok := rawChoices.([]any)
+			Expect(ok).To(BeTrue(), "choices must serialize as an array, got: %s", string(data))
+			Expect(choicesArr).To(BeEmpty(), "choices must be empty in usage trailer, got: %s", string(data))
+
+			// `usage` MUST be present and non-null with the populated counts.
+			u, ok := raw["usage"].(map[string]any)
+			Expect(ok).To(BeTrue(), "usage object must be present, got: %s", string(data))
+			Expect(u["prompt_tokens"]).To(BeNumerically("==", 18))
+			Expect(u["completion_tokens"]).To(BeNumerically("==", 14))
+			Expect(u["total_tokens"]).To(BeNumerically("==", 32))
+		})
+	})
+
+	Describe("OpenAIRequest.StreamOptions", func() {
+		It("parses stream_options.include_usage=true", func() {
+			body := []byte(`{
+                "model": "m",
+                "stream": true,
+                "stream_options": {"include_usage": true},
+                "messages": []
+            }`)
+			var req schema.OpenAIRequest
+			Expect(json.Unmarshal(body, &req)).To(Succeed())
+			Expect(req.StreamOptions).ToNot(BeNil())
+			Expect(req.StreamOptions.IncludeUsage).To(BeTrue())
+		})
+
+		It("defaults IncludeUsage to false when stream_options is absent", func() {
+			body := []byte(`{"model":"m","stream":true,"messages":[]}`)
+			var req schema.OpenAIRequest
+			Expect(json.Unmarshal(body, &req)).To(Succeed())
+			// Either a nil StreamOptions or one with IncludeUsage=false is acceptable.
+			if req.StreamOptions != nil {
+				Expect(req.StreamOptions.IncludeUsage).To(BeFalse())
+			}
+		})
+	})
+})
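Reviewer note: the trailer shape that the `streamUsageTrailerJSON` test pins down can be reproduced with a small standalone program. This is a minimal sketch, not LocalAI's implementation. The `usage` and `trailerChunk` types and the `usageTrailerJSON` helper are hypothetical names introduced here for illustration; only the JSON keys and the empty `choices` array come from the test above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage mirrors the three token counters asserted in the test (hypothetical type).
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// trailerChunk is a hypothetical stand-in for the trailing usage chunk.
// Choices has no omitempty, so a non-nil empty slice serializes as [].
type trailerChunk struct {
	ID      string     `json:"id"`
	Object  string     `json:"object"`
	Created int        `json:"created"`
	Model   string     `json:"model"`
	Choices []struct{} `json:"choices"`
	Usage   *usage     `json:"usage"`
}

// usageTrailerJSON is an illustrative helper, not LocalAI's streamUsageTrailerJSON.
func usageTrailerJSON(id, model string, created int, u usage) []byte {
	data, err := json.Marshal(trailerChunk{
		ID:      id,
		Object:  "chat.completion.chunk",
		Created: created,
		Model:   model,
		Choices: []struct{}{}, // non-nil empty slice, so the key is present as []
		Usage:   &u,
	})
	if err != nil {
		// Marshalling plain strings and ints is not expected to fail.
		panic(err)
	}
	return data
}

func main() {
	u := usage{PromptTokens: 18, CompletionTokens: 14, TotalTokens: 32}
	// Written the same way the endpoint frames SSE events.
	fmt.Printf("data: %s\n\n", usageTrailerJSON("req-1", "m", 1, u))
}
```

A nil `Choices` slice would serialize as `null`, which is why the sketch initializes it explicitly; the test's "not absent, not null" assertion guards the same distinction.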
--- a/core/http/endpoints/openai/completion.go
+++ b/core/http/endpoints/openai/completion.go
@@ -39,6 +39,10 @@ func CompletionEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, eva
 				usage.TimingTokenGeneration = tokenUsage.TimingTokenGeneration
 				usage.TimingPromptProcessing = tokenUsage.TimingPromptProcessing
 			}
+			// Usage rides on the struct so the consumer can track the running
+			// cumulative total; the consumer strips it before marshalling so
+			// intermediate chunks stay OpenAI-spec compliant.
+			usageForChunk := usage
 			resp := schema.OpenAIResponse{
 				ID:      id,
 				Created: created,
@@ -51,7 +55,7 @@ func CompletionEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, eva
 					},
 				},
 				Object: "text_completion",
-				Usage:  usage,
+				Usage:  &usageForChunk,
 			}
 			xlog.Debug("Sending goroutine", "text", s)

@@ -127,6 +131,8 @@ func CompletionEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, eva
 				ended <- process(id, predInput, input, config, ml, responses, extraUsage)
 			}()

+			var latestUsage *schema.OpenAIUsage
+
 		LOOP:
 			for {
 				select {
@@ -135,6 +141,14 @@ func CompletionEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, eva
 						xlog.Debug("No choices in the response, skipping")
 						continue
 					}
+					// Capture running cumulative usage for the optional trailer
+					// emitted after the final stop chunk when include_usage=true.
+					if ev.Usage != nil {
+						latestUsage = ev.Usage
+					}
+					// OpenAI streaming spec: intermediate chunks must NOT
+					// carry a `usage` field. Strip the tracking copy now.
+					ev.Usage = nil
 					respData, err := json.Marshal(ev)
 					if err != nil {
 						xlog.Debug("Failed to marshal response", "error", err)
@@ -194,8 +208,15 @@ func CompletionEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, eva
 				Object: "text_completion",
 			}
 			respData, _ := json.Marshal(resp)
-
 			fmt.Fprintf(c.Response().Writer, "data: %s\n\n", respData)
+
+			// Trailing usage chunk per OpenAI spec: emit only when the caller
+			// opted in via stream_options.include_usage.
+			if input.StreamOptions != nil && input.StreamOptions.IncludeUsage && latestUsage != nil {
+				trailer := streamUsageTrailerJSON(id, input.Model, created, *latestUsage)
+				_, _ = fmt.Fprintf(c.Response().Writer, "data: %s\n\n", trailer)
+			}
+
 			fmt.Fprintf(c.Response().Writer, "data: [DONE]\n\n")
 			c.Response().Flush()
 			return nil
@@ -247,7 +268,7 @@ func CompletionEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, eva
 			Model:   input.Model, // we have to return what the user sent here, due to OpenAI spec.
 			Choices: result,
 			Object:  "text_completion",
-			Usage:   usage,
+			Usage:   &usage,
 		}

 		jsonResult, _ := json.Marshal(resp)
--- a/core/http/endpoints/openai/edit.go
+++ b/core/http/endpoints/openai/edit.go
@@ -92,7 +92,7 @@ func EditEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, evaluator
 			Model:   input.Model, // we have to return what the user sent here, due to OpenAI spec.
 			Choices: result,
 			Object:  "edit",
-			Usage:   usage,
+			Usage:   &usage,
 		}

 		jsonResult, _ := json.Marshal(resp)
--- a/core/http/endpoints/openai/image.go
+++ b/core/http/endpoints/openai/image.go
@@ -233,7 +233,7 @@ func ImageEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, appConfi
 			ID:      id,
 			Created: created,
 			Data:    result,
-			Usage: schema.OpenAIUsage{
+			Usage: &schema.OpenAIUsage{
 				PromptTokens:     0,
 				CompletionTokens: 0,
 				TotalTokens:      0,
--- a/core/http/endpoints/openai/inpainting.go
+++ b/core/http/endpoints/openai/inpainting.go
@@ -258,7 +258,7 @@ func InpaintingEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, app
 			Data: []schema.Item{{
 				URL: imgPath,
 			}},
-			Usage: schema.OpenAIUsage{
+			Usage: &schema.OpenAIUsage{
 				PromptTokens:     0,
 				CompletionTokens: 0,
 				TotalTokens:      0,
--- a/core/http/endpoints/openai/realtime.go
+++ b/core/http/endpoints/openai/realtime.go
@@ -54,6 +54,30 @@ const (
 		"Avoid parenthetical asides, URLs, and anything that cannot be clearly vocalized."
 )

+// resolveOutputModalities returns the effective output modalities for a
+// response: response-level overrides session-level, and the OpenAI Realtime
+// spec default is ["audio"] when neither is set.
+func resolveOutputModalities(session, response []types.Modality) []types.Modality {
+	if len(response) > 0 {
+		return response
+	}
+	if len(session) > 0 {
+		return session
+	}
+	return []types.Modality{types.ModalityAudio}
+}
+
+// modalitiesContainAudio reports whether the resolved modalities include audio
+// output.
+func modalitiesContainAudio(m []types.Modality) bool {
+	for _, x := range m {
+		if x == types.ModalityAudio {
+			return true
+		}
+	}
+	return false
+}
+
 // A model can be "emulated" that is: transcribe audio to text -> feed text to the LLM -> generate audio as result
 // If the model support instead audio-to-audio, we will use the specific gRPC calls instead

@@ -82,6 +106,10 @@ type Session struct {
 	InputSampleRate  int
 	OutputSampleRate int
 	MaxOutputTokens  types.IntOrInf
+	// OutputModalities mirrors the OpenAI Realtime spec field of the same
+	// name. Empty means "use the spec default" (audio). ["text"] suppresses
+	// TTS so the client receives only response.output_text.* events.
+	OutputModalities []types.Modality
 	// MaxHistoryItems caps the number of MessageItems passed to the LLM each
 	// turn (0 = unlimited). Small models — especially the LFM2.5-Audio 1.5B
 	// served via the liquid-audio backend — degrade quickly past a handful
@@ -162,13 +190,14 @@ func (s *Session) ToServer() types.SessionUnion {
 	} else {
 		return types.SessionUnion{
 			Realtime: &types.RealtimeSession{
-				ID:              s.ID,
-				Object:          "realtime.session",
-				Model:           s.Model,
-				Instructions:    s.Instructions,
-				Tools:           s.Tools,
-				ToolChoice:      s.ToolChoice,
-				MaxOutputTokens: s.MaxOutputTokens,
+				ID:               s.ID,
+				Object:           "realtime.session",
+				Model:            s.Model,
+				Instructions:     s.Instructions,
+				Tools:            s.Tools,
+				ToolChoice:       s.ToolChoice,
+				MaxOutputTokens:  s.MaxOutputTokens,
+				OutputModalities: s.OutputModalities,
 				Audio: &types.RealtimeSessionAudio{
 					Input: &types.SessionAudioInput{
 						TurnDetection: s.TurnDetection,
@@ -1015,6 +1044,10 @@ func updateSession(session *Session, update *types.SessionUnion, cl *config.Mode
 		session.MaxOutputTokens = rt.MaxOutputTokens
 	}

+	if len(rt.OutputModalities) > 0 {
+		session.OutputModalities = rt.OutputModalities
+	}
+
 	return nil
 }

@@ -1654,106 +1687,130 @@ func triggerResponseAtTurn(ctx context.Context, session *Session, conv *Conversa
 			})
 		}

-		// Check for cancellation before TTS
-		if ctx.Err() != nil {
-			xlog.Debug("Response cancelled before TTS (barge-in)")
-			sendCancelledResponse()
-			return
-		}
-
-		audioFilePath, res, err := session.ModelInterface.TTS(ctx, finalSpeech, session.Voice, session.InputAudioTranscription.Language)
-		if err != nil {
-			if ctx.Err() != nil {
-				xlog.Debug("TTS cancelled (barge-in)")
-				sendCancelledResponse()
-				return
-			}
-			xlog.Error("TTS failed", "error", err)
-			sendError(t, "tts_error", fmt.Sprintf("TTS generation failed: %v", err), "", item.Assistant.ID)
-			return
-		}
-		if !res.Success {
-			xlog.Error("TTS failed", "message", res.Message)
-			sendError(t, "tts_error", fmt.Sprintf("TTS generation failed: %s", res.Message), "", item.Assistant.ID)
-			return
-		}
-		defer os.Remove(audioFilePath)
-
-		audioBytes, err := os.ReadFile(audioFilePath)
-		if err != nil {
-			xlog.Error("failed to read TTS file", "error", err)
-			sendError(t, "tts_error", fmt.Sprintf("Failed to read TTS audio: %v", err), "", item.Assistant.ID)
-			return
-		}
-
-		// Parse WAV header to get raw PCM and the actual sample rate from the TTS backend.
-		pcmData, ttsSampleRate := laudio.ParseWAV(audioBytes)
-		if ttsSampleRate == 0 {
-			ttsSampleRate = localSampleRate
-		}
-		xlog.Debug("TTS audio parsed", "raw_bytes", len(audioBytes), "pcm_bytes", len(pcmData), "sample_rate", ttsSampleRate)
-
-		// SendAudio (WebRTC) passes PCM at the TTS sample rate directly to the
-		// Opus encoder, which resamples to 48kHz internally. This avoids a
-		// lossy intermediate resample through 16kHz.
-		// XXX: This is a noop in websocket mode; it's included in the JSON instead
-		if err := t.SendAudio(ctx, pcmData, ttsSampleRate); err != nil {
-			if ctx.Err() != nil {
-				xlog.Debug("Audio playback cancelled (barge-in)")
-				sendCancelledResponse()
-				return
-			}
-			xlog.Error("failed to send audio via transport", "error", err)
-		}
-
-		_, isWebRTC := t.(*WebRTCTransport)
-
-		// For WebSocket clients, resample to the session's output rate and
-		// deliver audio as base64 in JSON events. WebRTC clients already
-		// received audio over the RTP track, so skip the base64 payload.
 		var audioString string
-		if !isWebRTC {
-			wsPCM := pcmData
-			if ttsSampleRate != session.OutputSampleRate {
-				samples := sound.BytesToInt16sLE(pcmData)
-				resampled := sound.ResampleInt16(samples, ttsSampleRate, session.OutputSampleRate)
-				wsPCM = sound.Int16toBytesLE(resampled)
-			}
-			audioString = base64.StdEncoding.EncodeToString(wsPCM)
+		_, isWebRTC := t.(*WebRTCTransport)
+		var respMods []types.Modality
+		if overrides != nil {
+			respMods = overrides.OutputModalities
 		}
+		modalities := resolveOutputModalities(session.OutputModalities, respMods)
+		if modalitiesContainAudio(modalities) {
+			// Check for cancellation before TTS
+			if ctx.Err() != nil {
+				xlog.Debug("Response cancelled before TTS (barge-in)")
+				sendCancelledResponse()
+				return
+			}

-		sendEvent(t, types.ResponseOutputAudioTranscriptDeltaEvent{
-			ServerEventBase: types.ServerEventBase{},
-			ResponseID:      responseID,
-			ItemID:          item.Assistant.ID,
-			OutputIndex:     0,
-			ContentIndex:    0,
-			Delta:           finalSpeech,
-		})
-		sendEvent(t, types.ResponseOutputAudioTranscriptDoneEvent{
-			ServerEventBase: types.ServerEventBase{},
-			ResponseID:      responseID,
-			ItemID:          item.Assistant.ID,
-			OutputIndex:     0,
-			ContentIndex:    0,
-			Transcript:      finalSpeech,
-		})
+			audioFilePath, res, err := session.ModelInterface.TTS(ctx, finalSpeech, session.Voice, session.InputAudioTranscription.Language)
+			if err != nil {
+				if ctx.Err() != nil {
+					xlog.Debug("TTS cancelled (barge-in)")
+					sendCancelledResponse()
+					return
+				}
+				xlog.Error("TTS failed", "error", err)
+				sendError(t, "tts_error", fmt.Sprintf("TTS generation failed: %v", err), "", item.Assistant.ID)
+				return
+			}
+			if !res.Success {
+				xlog.Error("TTS failed", "message", res.Message)
+				sendError(t, "tts_error", fmt.Sprintf("TTS generation failed: %s", res.Message), "", item.Assistant.ID)
+				return
+			}
+			defer func() { _ = os.Remove(audioFilePath) }()

-		if !isWebRTC {
-			sendEvent(t, types.ResponseOutputAudioDeltaEvent{
+			audioBytes, err := os.ReadFile(audioFilePath)
+			if err != nil {
+				xlog.Error("failed to read TTS file", "error", err)
+				sendError(t, "tts_error", fmt.Sprintf("Failed to read TTS audio: %v", err), "", item.Assistant.ID)
+				return
+			}
+
+			// Parse WAV header to get raw PCM and the actual sample rate from the TTS backend.
+			pcmData, ttsSampleRate := laudio.ParseWAV(audioBytes)
+			if ttsSampleRate == 0 {
+				ttsSampleRate = localSampleRate
+			}
+			xlog.Debug("TTS audio parsed", "raw_bytes", len(audioBytes), "pcm_bytes", len(pcmData), "sample_rate", ttsSampleRate)
+
+			// SendAudio (WebRTC) passes PCM at the TTS sample rate directly to the
+			// Opus encoder, which resamples to 48kHz internally. This avoids a
+			// lossy intermediate resample through 16kHz.
+			// XXX: This is a noop in websocket mode; it's included in the JSON instead
+			if err := t.SendAudio(ctx, pcmData, ttsSampleRate); err != nil {
+				if ctx.Err() != nil {
+					xlog.Debug("Audio playback cancelled (barge-in)")
+					sendCancelledResponse()
+					return
+				}
+				xlog.Error("failed to send audio via transport", "error", err)
+			}
+
+			// For WebSocket clients, resample to the session's output rate and
+			// deliver audio as base64 in JSON events. WebRTC clients already
+			// received audio over the RTP track, so skip the base64 payload.
+			if !isWebRTC {
+				wsPCM := pcmData
+				if ttsSampleRate != session.OutputSampleRate {
+					samples := sound.BytesToInt16sLE(pcmData)
+					resampled := sound.ResampleInt16(samples, ttsSampleRate, session.OutputSampleRate)
+					wsPCM = sound.Int16toBytesLE(resampled)
+				}
+				audioString = base64.StdEncoding.EncodeToString(wsPCM)
+			}
+
+			sendEvent(t, types.ResponseOutputAudioTranscriptDeltaEvent{
 				ServerEventBase: types.ServerEventBase{},
 				ResponseID:      responseID,
 				ItemID:          item.Assistant.ID,
 				OutputIndex:     0,
 				ContentIndex:    0,
-				Delta:           audioString,
+				Delta:           finalSpeech,
 			})
-			sendEvent(t, types.ResponseOutputAudioDoneEvent{
+			sendEvent(t, types.ResponseOutputAudioTranscriptDoneEvent{
 				ServerEventBase: types.ServerEventBase{},
 				ResponseID:      responseID,
 				ItemID:          item.Assistant.ID,
 				OutputIndex:     0,
 				ContentIndex:    0,
+				Transcript:      finalSpeech,
+			})
+
+			if !isWebRTC {
+				sendEvent(t, types.ResponseOutputAudioDeltaEvent{
+					ServerEventBase: types.ServerEventBase{},
+					ResponseID:      responseID,
+					ItemID:          item.Assistant.ID,
+					OutputIndex:     0,
+					ContentIndex:    0,
+					Delta:           audioString,
+				})
+				sendEvent(t, types.ResponseOutputAudioDoneEvent{
+					ServerEventBase: types.ServerEventBase{},
+					ResponseID:      responseID,
+					ItemID:          item.Assistant.ID,
+					OutputIndex:     0,
+					ContentIndex:    0,
+				})
+			}
+		} else {
+			// Text-only mode: skip TTS, emit only the text events.
+			sendEvent(t, types.ResponseOutputTextDeltaEvent{
+				ServerEventBase: types.ServerEventBase{},
+				ResponseID:      responseID,
+				ItemID:          item.Assistant.ID,
+				OutputIndex:     0,
+				ContentIndex:    0,
+				Delta:           finalSpeech,
+			})
+			sendEvent(t, types.ResponseOutputTextDoneEvent{
+				ServerEventBase: types.ServerEventBase{},
+				ResponseID:      responseID,
+				ItemID:          item.Assistant.ID,
+				OutputIndex:     0,
+				ContentIndex:    0,
+				Text:            finalSpeech,
 			})
 		}

--- a/core/http/endpoints/openai/realtime_modality_test.go
+++ b/core/http/endpoints/openai/realtime_modality_test.go
@@ -0,0 +1,39 @@
+package openai
+
+import (
+	"github.com/mudler/LocalAI/core/http/endpoints/openai/types"
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+var _ = Describe("resolveOutputModalities", func() {
+	It("defaults to audio when neither session nor response specify", func() {
+		got := resolveOutputModalities(nil, nil)
+		Expect(got).To(ConsistOf(types.ModalityAudio))
+	})
+
+	It("uses session modalities when response omits them", func() {
+		sess := []types.Modality{types.ModalityText}
+		got := resolveOutputModalities(sess, nil)
+		Expect(got).To(ConsistOf(types.ModalityText))
+	})
+
+	It("response modalities override session", func() {
+		sess := []types.Modality{types.ModalityAudio}
+		resp := []types.Modality{types.ModalityText}
+		got := resolveOutputModalities(sess, resp)
+		Expect(got).To(ConsistOf(types.ModalityText))
+	})
+
+	It("returns false from modalitiesContainAudio for text-only", func() {
+		Expect(modalitiesContainAudio([]types.Modality{types.ModalityText})).To(BeFalse())
+	})
+
+	It("returns true from modalitiesContainAudio for audio (default)", func() {
+		Expect(modalitiesContainAudio([]types.Modality{types.ModalityAudio})).To(BeTrue())
+	})
+
+	It("returns true when both audio and text are present", func() {
+		Expect(modalitiesContainAudio([]types.Modality{types.ModalityText, types.ModalityAudio})).To(BeTrue())
+	})
+})
--- a/core/http/react-ui/src/hooks/useChat.js
+++ b/core/http/react-ui/src/hooks/useChat.js
@@ -255,7 +255,10 @@ export function useChat(initialModel = '') {
    )
    messages.push(...historyForApi, { role: 'user', content: messageContent })

-    const requestBody = { model, messages, stream: true }
+    // include_usage tells LocalAI to emit a trailing chunk with token totals;
+    // without it the spec-compliant server drops `usage` from the stream and
+    // the token-count badge would never populate.
+    const requestBody = { model, messages, stream: true, stream_options: { include_usage: true } }
    if (temperature !== null && temperature !== undefined) requestBody.temperature = temperature
    if (topP !== null && topP !== undefined) requestBody.top_p = topP
    if (topK !== null && topK !== undefined) requestBody.top_k = topK
--- a/core/http/static/chat.js
+++ b/core/http/static/chat.js
@@ -1212,6 +1212,9 @@ async function promptGPT(systemPrompt, input) {

  // Add stream parameter for both regular chat and MCP (MCP now supports SSE streaming)
  requestBody.stream = true;
+  // include_usage tells LocalAI to emit a trailing chunk with token totals;
+  // the spec-compliant server otherwise drops `usage` from the stream.
+  requestBody.stream_options = { include_usage: true };
  
  // Add generation parameters if they are set (null means use default)
  if (activeChat.temperature !== null && activeChat.temperature !== undefined) {
--- a/core/schema/ollama.go
+++ b/core/schema/ollama.go
@@ -2,6 +2,8 @@ package schema

 import (
 	"context"
+	"encoding/json"
+	"fmt"
 	"time"
 )

@@ -18,6 +20,79 @@ type OllamaOptions struct {
 	NumCtx        int      `json:"num_ctx,omitempty"`
 }

+// UnmarshalJSON accepts integer parameters encoded as either JSON ints
+// (`8192`) or JSON floats (`8192.0`). Some clients - notably Home Assistant's
+// Ollama integration - serialize ints as floats, which stdlib json refuses
+// to decode into int fields. See https://github.com/mudler/LocalAI/issues/9837.
+func (o *OllamaOptions) UnmarshalJSON(data []byte) error {
+	type aux struct {
+		Temperature   *float64     `json:"temperature,omitempty"`
+		TopP          *float64     `json:"top_p,omitempty"`
+		TopK          *json.Number `json:"top_k,omitempty"`
+		NumPredict    *json.Number `json:"num_predict,omitempty"`
+		RepeatPenalty float64      `json:"repeat_penalty,omitempty"`
+		RepeatLastN   *json.Number `json:"repeat_last_n,omitempty"`
+		Seed          *json.Number `json:"seed,omitempty"`
+		Stop          []string     `json:"stop,omitempty"`
+		NumCtx        *json.Number `json:"num_ctx,omitempty"`
+	}
+	var a aux
+	if err := json.Unmarshal(data, &a); err != nil {
+		return err
+	}
+
+	o.Temperature = a.Temperature
+	o.TopP = a.TopP
+	o.RepeatPenalty = a.RepeatPenalty
+	o.Stop = a.Stop
+
+	var err error
+	if o.TopK, err = jsonNumberToIntPtr(a.TopK); err != nil {
+		return fmt.Errorf("options.top_k: %w", err)
+	}
+	if o.NumPredict, err = jsonNumberToIntPtr(a.NumPredict); err != nil {
+		return fmt.Errorf("options.num_predict: %w", err)
+	}
+	if o.Seed, err = jsonNumberToIntPtr(a.Seed); err != nil {
+		return fmt.Errorf("options.seed: %w", err)
+	}
+	if o.RepeatLastN, err = jsonNumberToInt(a.RepeatLastN); err != nil {
+		return fmt.Errorf("options.repeat_last_n: %w", err)
+	}
+	if o.NumCtx, err = jsonNumberToInt(a.NumCtx); err != nil {
+		return fmt.Errorf("options.num_ctx: %w", err)
+	}
+	return nil
+}
+
+// jsonNumberToInt parses a json.Number literal as an int, tolerating both
+// integer (`8192`) and float (`8192.0`) encodings; a fractional part is truncated.
+// A nil pointer or empty string yields 0, matching the int fields' zero value.
+func jsonNumberToInt(n *json.Number) (int, error) {
+	if n == nil || *n == "" {
+		return 0, nil
+	}
+	if i, err := n.Int64(); err == nil {
+		return int(i), nil
+	}
+	f, err := n.Float64()
+	if err != nil {
+		return 0, err
+	}
+	return int(f), nil
+}
+
+func jsonNumberToIntPtr(n *json.Number) (*int, error) {
+	if n == nil {
+		return nil, nil
+	}
+	i, err := jsonNumberToInt(n)
+	if err != nil {
+		return nil, err
+	}
+	return &i, nil
+}
+
 // OllamaMessage represents a message in Ollama chat format
 type OllamaMessage struct {
 	Role      string   `json:"role"`
--- a/core/schema/ollama_test.go
+++ b/core/schema/ollama_test.go
@@ -84,3 +84,94 @@ var _ = Describe("OllamaEmbedRequest", func() {
 		})
 	})
 })
+
+// Several Ollama clients (notably Home Assistant's Python client) encode
+// integer parameters as JSON floats (`8192.0`). Stdlib json refuses to
+// unmarshal those into `int` fields, so OllamaOptions has a custom
+// UnmarshalJSON that accepts both forms. See
+// https://github.com/mudler/LocalAI/issues/9837.
+var _ = Describe("OllamaOptions JSON unmarshaling", func() {
+	It("accepts integer literals for int fields", func() {
+		body := []byte(`{"num_ctx": 8192, "num_predict": 256, "top_k": 40, "seed": 7, "repeat_last_n": 64}`)
+
+		var opts OllamaOptions
+		Expect(json.Unmarshal(body, &opts)).To(Succeed())
+
+		Expect(opts.NumCtx).To(Equal(8192))
+		Expect(opts.NumPredict).NotTo(BeNil())
+		Expect(*opts.NumPredict).To(Equal(256))
+		Expect(opts.TopK).NotTo(BeNil())
+		Expect(*opts.TopK).To(Equal(40))
+		Expect(opts.Seed).NotTo(BeNil())
+		Expect(*opts.Seed).To(Equal(7))
+		Expect(opts.RepeatLastN).To(Equal(64))
+	})
+
+	It("accepts float literals for int fields (Home Assistant Ollama client)", func() {
+		body := []byte(`{"num_ctx": 8192.0, "num_predict": 256.0, "top_k": 40.0, "seed": 7.0, "repeat_last_n": 64.0}`)
+
+		var opts OllamaOptions
+		Expect(json.Unmarshal(body, &opts)).To(Succeed())
+
+		Expect(opts.NumCtx).To(Equal(8192))
+		Expect(opts.NumPredict).NotTo(BeNil())
+		Expect(*opts.NumPredict).To(Equal(256))
+		Expect(opts.TopK).NotTo(BeNil())
+		Expect(*opts.TopK).To(Equal(40))
+		Expect(opts.Seed).NotTo(BeNil())
+		Expect(*opts.Seed).To(Equal(7))
+		Expect(opts.RepeatLastN).To(Equal(64))
+	})
+
+	It("preserves float fields and stop list", func() {
+		body := []byte(`{"temperature": 0.7, "top_p": 0.9, "repeat_penalty": 1.1, "stop": ["<|end|>", "</s>"]}`)
+
+		var opts OllamaOptions
+		Expect(json.Unmarshal(body, &opts)).To(Succeed())
+
+		Expect(opts.Temperature).NotTo(BeNil())
+		Expect(*opts.Temperature).To(Equal(0.7))
+		Expect(opts.TopP).NotTo(BeNil())
+		Expect(*opts.TopP).To(Equal(0.9))
+		Expect(opts.RepeatPenalty).To(Equal(1.1))
+		Expect(opts.Stop).To(Equal([]string{"<|end|>", "</s>"}))
+	})
+
+	It("leaves optional int fields nil when absent", func() {
+		body := []byte(`{}`)
+
+		var opts OllamaOptions
+		Expect(json.Unmarshal(body, &opts)).To(Succeed())
+
+		Expect(opts.NumPredict).To(BeNil())
+		Expect(opts.TopK).To(BeNil())
+		Expect(opts.Seed).To(BeNil())
+		Expect(opts.NumCtx).To(Equal(0))
+		Expect(opts.RepeatLastN).To(Equal(0))
+	})
+
+	It("accepts nested options on a chat request with float num_ctx", func() {
+		// Mirrors the payload Home Assistant sends; reproduces issue #9837.
+		body := []byte(`{
+			"model": "qwen2",
+			"messages": [{"role": "user", "content": "hi"}],
+			"options": {"num_ctx": 8192.0, "top_k": 40.0}
+		}`)
+
+		var req OllamaChatRequest
+		Expect(json.Unmarshal(body, &req)).To(Succeed())
+
+		Expect(req.Options).NotTo(BeNil())
+		Expect(req.Options.NumCtx).To(Equal(8192))
+		Expect(req.Options.TopK).NotTo(BeNil())
+		Expect(*req.Options.TopK).To(Equal(40))
+	})
+
+	It("rejects non-numeric values", func() {
+		body := []byte(`{"num_ctx": "not-a-number"}`)
+
+		var opts OllamaOptions
+		err := json.Unmarshal(body, &opts)
+		Expect(err).To(HaveOccurred())
+	})
+})
--- a/core/schema/openai.go
+++ b/core/schema/openai.go
@@ -82,7 +82,21 @@ type OpenAIResponse struct {
 	Choices []Choice `json:"choices,omitempty"`
 	Data    []Item   `json:"data,omitempty"`

-	Usage OpenAIUsage `json:"usage"`
+	// Usage is intentionally a pointer with omitempty: per the OpenAI
+	// chat-completion streaming spec, intermediate chunks must not carry
+	// a `usage` field. Marshalling a value-typed usage would emit
+	// `"usage":{"prompt_tokens":0,...}` on every chunk and break
+	// OpenAI-SDK consumers that filter on a truthy `result.usage`
+	// (continuedev/continue, Kilo Code, Roo Code, etc.).
+	Usage *OpenAIUsage `json:"usage,omitempty"`
+}
+
+// StreamOptions mirrors OpenAI's `stream_options` request field. The only
+// member currently honored is IncludeUsage; when true, the streaming
+// chat-completion response emits a trailing chunk with `choices:[]` and a
+// populated `usage` object.
+type StreamOptions struct {
+	IncludeUsage bool `json:"include_usage,omitempty" yaml:"include_usage,omitempty"`
 }

 type Choice struct {
@@ -198,6 +212,9 @@ type OpenAIRequest struct {

 	Stream bool `json:"stream"`

+	// StreamOptions opts into OpenAI streaming extensions, e.g. include_usage.
+	StreamOptions *StreamOptions `json:"stream_options,omitempty" yaml:"stream_options,omitempty"`
+
 	// Image (not supported by OpenAI)
 	Quality string `json:"quality"`
 	Step    int    `json:"step"`
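
Reviewer sketch (not part of the patch): the effect of switching `Usage` to a pointer with `omitempty` can be seen with `encoding/json` alone. The types below are hypothetical, trimmed-down stand-ins for `OpenAIResponse` and `OpenAIUsage`; the field names inside `Usage` are assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage is a hypothetical stand-in for OpenAIUsage.
type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// valueChunk models the old shape: usage is always serialized.
type valueChunk struct {
	ID    string `json:"id"`
	Usage Usage  `json:"usage"`
}

// pointerChunk models the new shape: a nil usage is omitted entirely.
type pointerChunk struct {
	ID    string `json:"id"`
	Usage *Usage `json:"usage,omitempty"`
}

// mustJSON marshals v and panics on error, which keeps the demo short.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Intermediate chunk, old shape: a zeroed usage object leaks out.
	fmt.Println(mustJSON(valueChunk{ID: "c1"}))
	// Intermediate chunk, new shape: no usage key at all.
	fmt.Println(mustJSON(pointerChunk{ID: "c1"}))
	// Trailing chunk with usage populated, as with include_usage.
	final := pointerChunk{ID: "c1", Usage: &Usage{PromptTokens: 3, CompletionTokens: 5, TotalTokens: 8}}
	fmt.Println(mustJSON(final))
}
```

A value-typed struct field is never considered empty by `omitempty`, which is why the pointer is needed rather than just adding the tag option.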
--- a/core/services/galleryop/backends.go
+++ b/core/services/galleryop/backends.go
@@ -113,7 +113,7 @@ func (g *GalleryService) backendHandler(op *ManagementOp[gallery.GalleryBackend,
 // InstallExternalBackend installs a backend from an external source (OCI image, URL, or path).
 // This method contains the logic to detect the input type and call the appropriate installation function.
 // It can be used by both CLI and Web UI for installing backends from external sources.
-func InstallExternalBackend(ctx context.Context, galleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, downloadStatus func(string, string, string, float64), backend, name, alias string) error {
+func InstallExternalBackend(ctx context.Context, galleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, downloadStatus func(string, string, string, float64), backend, name, alias string, requireIntegrity bool) error {
 	uri := downloader.URI(backend)
 	switch {
 	case uri.LooksLikeDir():
@@ -127,7 +127,7 @@ func InstallExternalBackend(ctx context.Context, galleries []config.Gallery, sys
 			},
 			Alias: alias,
 			URI:   backend,
-		}, downloadStatus); err != nil {
+		}, downloadStatus, requireIntegrity); err != nil {
 			return fmt.Errorf("error installing backend %s: %w", backend, err)
 		}
 	case uri.LooksLikeOCI() && !uri.LooksLikeOCIFile():
@@ -141,7 +141,7 @@ func InstallExternalBackend(ctx context.Context, galleries []config.Gallery, sys
 			},
 			Alias: alias,
 			URI:   backend,
-		}, downloadStatus); err != nil {
+		}, downloadStatus, requireIntegrity); err != nil {
 			return fmt.Errorf("error installing backend %s: %w", backend, err)
 		}
 	case uri.LooksLikeOCIFile():
@@ -163,7 +163,7 @@ func InstallExternalBackend(ctx context.Context, galleries []config.Gallery, sys
 			},
 			Alias: alias,
 			URI:   backend,
-		}, downloadStatus); err != nil {
+		}, downloadStatus, requireIntegrity); err != nil {
 			return fmt.Errorf("error installing backend %s: %w", backend, err)
 		}
 	default:
@@ -171,7 +171,7 @@ func InstallExternalBackend(ctx context.Context, galleries []config.Gallery, sys
 		if name != "" || alias != "" {
 			return fmt.Errorf("specifying a name or alias is not supported for gallery backends")
 		}
-		err := gallery.InstallBackendFromGallery(ctx, galleries, systemState, modelLoader, backend, downloadStatus, true)
+		err := gallery.InstallBackendFromGallery(ctx, galleries, systemState, modelLoader, backend, downloadStatus, true, requireIntegrity)
 		if err != nil {
 			return fmt.Errorf("error installing backend %s: %w", backend, err)
 		}
--- a/core/services/galleryop/backends_test.go
+++ b/core/services/galleryop/backends_test.go
@@ -70,6 +70,7 @@ var _ = Describe("InstallExternalBackend", func() {
 				"test-backend", // gallery name
 				"custom-name",  // name should not be allowed
 				"",
+				false,
 			)
 			Expect(err).To(HaveOccurred())
 			Expect(err.Error()).To(ContainSubstring("specifying a name or alias is not supported for gallery backends"))
@@ -85,6 +86,7 @@ var _ = Describe("InstallExternalBackend", func() {
 				"non-existent-backend",
 				"",
 				"",
+				false,
 			)
 			Expect(err).To(HaveOccurred())
 		})
@@ -101,6 +103,7 @@ var _ = Describe("InstallExternalBackend", func() {
 				"oci://quay.io/mudler/tests:localai-backend-test",
 				"", // name is required for OCI images
 				"",
+				false,
 			)
 			Expect(err).To(HaveOccurred())
 			Expect(err.Error()).To(ContainSubstring("specifying a name is required for OCI images"))
@@ -133,6 +136,7 @@ var _ = Describe("InstallExternalBackend", func() {
 				testBackendPath,
 				"", // name should be inferred as "source-backend"
 				"",
+				false,
 			)
 			// The function should at least attempt to install with the inferred name
 			// Even if it fails for other reasons, it shouldn't fail due to missing name
@@ -151,6 +155,7 @@ var _ = Describe("InstallExternalBackend", func() {
 				testBackendPath,
 				"custom-backend-name",
 				"",
+				false,
 			)
 			// The function should use the provided name
 			if err != nil {
@@ -168,6 +173,7 @@ var _ = Describe("InstallExternalBackend", func() {
 				testBackendPath,
 				"custom-backend-name",
 				"custom-alias",
+				false,
 			)
 			// The function should accept alias for directory paths
 			if err != nil {
--- a/core/services/galleryop/list_models.go
+++ b/core/services/galleryop/list_models.go
@@ -16,6 +16,14 @@ const (

 func ListModels(bcl *config.ModelConfigLoader, ml *model.ModelLoader, filter config.ModelConfigFilterFn, looseFilePolicy LooseFilePolicy) ([]string, error) {

+	// Callers (e.g. the Ollama /api/tags handler) pass nil to mean "no
+	// filtering". Without this guard the loose-file loop below calls the nil
+	// filter and panics, which Echo surfaces to clients as a dropped
+	// connection (see issue #9817).
+	if filter == nil {
+		filter = config.NoFilterFn
+	}
+
 	skipMap := map[string]struct{}{}

 	dataModels := []string{}
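
Reviewer sketch (not part of the patch): the nil-filter guard follows a common Go pattern of replacing a nil function value with a permissive default before it is called. The sketch below assumes a simplified, hypothetical `FilterFn` that takes only a name; the real `config.ModelConfigFilterFn` signature is not shown in this diff and may differ.

```go
package main

import "fmt"

// FilterFn is a hypothetical, simplified stand-in for
// config.ModelConfigFilterFn.
type FilterFn func(name string) bool

// noFilter accepts every entry, playing the role of config.NoFilterFn.
func noFilter(string) bool { return true }

// listNames returns the names accepted by filter. A nil filter is treated
// as "no filtering" instead of panicking on the first call.
func listNames(names []string, filter FilterFn) []string {
	if filter == nil {
		filter = noFilter
	}
	out := []string{}
	for _, n := range names {
		if filter(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	fmt.Println(listNames([]string{"loose-model"}, nil))
}
```

Defaulting inside the callee keeps every existing caller working, which fits a regression fix better than auditing each call site.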
--- a/core/services/galleryop/list_models_test.go
+++ b/core/services/galleryop/list_models_test.go
@@ -0,0 +1,64 @@
+package galleryop_test
+
+import (
+	"os"
+	"path/filepath"
+
+	"github.com/mudler/LocalAI/core/config"
+	"github.com/mudler/LocalAI/core/services/galleryop"
+	"github.com/mudler/LocalAI/pkg/model"
+	"github.com/mudler/LocalAI/pkg/system"
+
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+// Regression test for issue #9817: the Ollama /api/tags handler calls
+// ListModels with a nil filter, which used to panic as soon as a loose file
+// existed under ModelsPath. The panic surfaced to Ollama clients (e.g. Home
+// Assistant) as "Server disconnected without sending a response".
+var _ = Describe("ListModels", func() {
+	var (
+		tempDir     string
+		bcl         *config.ModelConfigLoader
+		ml          *model.ModelLoader
+		systemState *system.SystemState
+	)
+
+	BeforeEach(func() {
+		var err error
+		tempDir, err = os.MkdirTemp("", "list-models-test-*")
+		Expect(err).NotTo(HaveOccurred())
+
+		systemState, err = system.GetSystemState(system.WithModelPath(tempDir))
+		Expect(err).NotTo(HaveOccurred())
+		ml = model.NewModelLoader(systemState)
+		bcl = config.NewModelConfigLoader(tempDir)
+	})
+
+	AfterEach(func() {
+		os.RemoveAll(tempDir)
+	})
+
+	It("does not panic with a nil filter when loose files exist", func() {
+		// ListFilesInModelPath skips well-known weight-file extensions
+		// (.gguf, .bin, ...) so use an extension-less file to ensure the
+		// filter path is exercised.
+		Expect(os.WriteFile(filepath.Join(tempDir, "loose-model"), []byte("x"), 0o644)).To(Succeed())
+
+		var names []string
+		var err error
+		Expect(func() {
+			names, err = galleryop.ListModels(bcl, ml, nil, galleryop.SKIP_IF_CONFIGURED)
+		}).ToNot(Panic())
+		Expect(err).ToNot(HaveOccurred())
+		Expect(names).To(ContainElement("loose-model"))
+	})
+
+	It("does not panic with a nil filter when ModelsPath is empty", func() {
+		Expect(func() {
+			_, err := galleryop.ListModels(bcl, ml, nil, galleryop.SKIP_IF_CONFIGURED)
+			Expect(err).ToNot(HaveOccurred())
+		}).ToNot(Panic())
+	})
+})
--- a/core/services/galleryop/managers_local.go
+++ b/core/services/galleryop/managers_local.go
@@ -16,6 +16,7 @@ type LocalModelManager struct {
 	modelLoader                 *model.ModelLoader
 	enforcePredownloadScans     bool
 	automaticallyInstallBackend bool
+	requireBackendIntegrity     bool
 }

 // NewLocalModelManager creates a LocalModelManager from the application config.
@@ -25,6 +26,7 @@ func NewLocalModelManager(appConfig *config.ApplicationConfig, ml *model.ModelLo
 		modelLoader:                 ml,
 		enforcePredownloadScans:     appConfig.EnforcePredownloadScans,
 		automaticallyInstallBackend: appConfig.AutoloadBackendGalleries,
+		requireBackendIntegrity:     appConfig.RequireBackendIntegrity,
 	}
 }

@@ -53,32 +55,34 @@ func (m *LocalModelManager) InstallModel(ctx context.Context, op *ManagementOp[g
 		if m.automaticallyInstallBackend && installedModel.Backend != "" {
 			xlog.Debug("Installing backend", "backend", installedModel.Backend)
 			return gallery.InstallBackendFromGallery(ctx, op.BackendGalleries, m.systemState,
-				m.modelLoader, installedModel.Backend, progressCb, false)
+				m.modelLoader, installedModel.Backend, progressCb, false, m.requireBackendIntegrity)
 		}
 		return nil
 	case op.GalleryElementName != "":
 		return gallery.InstallModelFromGallery(ctx, op.Galleries, op.BackendGalleries,
 			m.systemState, m.modelLoader, op.GalleryElementName, op.Req, progressCb,
-			m.enforcePredownloadScans, m.automaticallyInstallBackend)
+			m.enforcePredownloadScans, m.automaticallyInstallBackend, m.requireBackendIntegrity)
 	default:
 		return installModelFromRemoteConfig(ctx, m.systemState, m.modelLoader, op.Req,
-			progressCb, m.enforcePredownloadScans, m.automaticallyInstallBackend, op.BackendGalleries)
+			progressCb, m.enforcePredownloadScans, m.automaticallyInstallBackend, op.BackendGalleries, m.requireBackendIntegrity)
 	}
 }

 // LocalBackendManager handles backend install/delete on the local instance.
 type LocalBackendManager struct {
-	systemState      *system.SystemState
-	modelLoader      *model.ModelLoader
-	backendGalleries []config.Gallery
+	systemState             *system.SystemState
+	modelLoader             *model.ModelLoader
+	backendGalleries        []config.Gallery
+	requireBackendIntegrity bool
 }

 // NewLocalBackendManager creates a LocalBackendManager from the application config.
 func NewLocalBackendManager(appConfig *config.ApplicationConfig, ml *model.ModelLoader) *LocalBackendManager {
 	return &LocalBackendManager{
-		systemState:      appConfig.SystemState,
-		modelLoader:      ml,
-		backendGalleries: appConfig.BackendGalleries,
+		systemState:             appConfig.SystemState,
+		modelLoader:             ml,
+		backendGalleries:        appConfig.BackendGalleries,
+		requireBackendIntegrity: appConfig.RequireBackendIntegrity,
 	}
 }

@@ -93,7 +97,7 @@ func (b *LocalBackendManager) ListBackends() (gallery.SystemBackends, error) {
 }

 func (b *LocalBackendManager) UpgradeBackend(ctx context.Context, name string, progressCb ProgressCallback) error {
-	return gallery.UpgradeBackend(ctx, b.systemState, b.modelLoader, b.backendGalleries, name, progressCb)
+	return gallery.UpgradeBackend(ctx, b.systemState, b.modelLoader, b.backendGalleries, name, progressCb, b.requireBackendIntegrity)
 }

 func (b *LocalBackendManager) CheckUpgrades(ctx context.Context) (map[string]gallery.UpgradeInfo, error) {
@@ -103,10 +107,10 @@ func (b *LocalBackendManager) CheckUpgrades(ctx context.Context) (map[string]gal
 func (b *LocalBackendManager) InstallBackend(ctx context.Context, op *ManagementOp[gallery.GalleryBackend, any], progressCb ProgressCallback) error {
 	if op.ExternalURI != "" {
 		return InstallExternalBackend(ctx, b.backendGalleries, b.systemState, b.modelLoader,
-			progressCb, op.ExternalURI, op.ExternalName, op.ExternalAlias)
+			progressCb, op.ExternalURI, op.ExternalName, op.ExternalAlias, b.requireBackendIntegrity)
 	}
 	return gallery.InstallBackendFromGallery(ctx, b.backendGalleries, b.systemState,
-		b.modelLoader, op.GalleryElementName, progressCb, true)
+		b.modelLoader, op.GalleryElementName, progressCb, true, b.requireBackendIntegrity)
 }

 func (b *LocalBackendManager) IsDistributed() bool { return false }
--- a/core/services/galleryop/models.go
+++ b/core/services/galleryop/models.go
@@ -123,7 +123,7 @@ func (g *GalleryService) modelHandler(op *ManagementOp[gallery.GalleryModel, gal
 	return nil
 }

-func installModelFromRemoteConfig(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, req gallery.GalleryModel, downloadStatus func(string, string, string, float64), enforceScan, automaticallyInstallBackend bool, backendGalleries []config.Gallery) error {
+func installModelFromRemoteConfig(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, req gallery.GalleryModel, downloadStatus func(string, string, string, float64), enforceScan, automaticallyInstallBackend bool, backendGalleries []config.Gallery, requireBackendIntegrity bool) error {
 	config, err := gallery.GetGalleryConfigFromURLWithContext[gallery.ModelConfig](ctx, req.URL, systemState.Model.ModelsPath)
 	if err != nil {
 		return err
@@ -137,7 +137,7 @@ func installModelFromRemoteConfig(ctx context.Context, systemState *system.Syste
 	}

 	if automaticallyInstallBackend && installedModel.Backend != "" {
-		if err := gallery.InstallBackendFromGallery(ctx, backendGalleries, systemState, modelLoader, installedModel.Backend, downloadStatus, false); err != nil {
+		if err := gallery.InstallBackendFromGallery(ctx, backendGalleries, systemState, modelLoader, installedModel.Backend, downloadStatus, false, requireBackendIntegrity); err != nil {
 			return err
 		}
 	}
@@ -150,23 +150,23 @@ type galleryModel struct {
 	ID                   string           `json:"id"`
 }

-func processRequests(systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, automaticallyInstallBackend bool, galleries []config.Gallery, backendGalleries []config.Gallery, requests []galleryModel) error {
+func processRequests(systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, automaticallyInstallBackend bool, galleries []config.Gallery, backendGalleries []config.Gallery, requests []galleryModel, requireBackendIntegrity bool) error {
 	ctx := context.Background()
 	var err error
 	for _, r := range requests {
 		utils.ResetDownloadTimers()
 		if r.ID == "" {
-			err = installModelFromRemoteConfig(ctx, systemState, modelLoader, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan, automaticallyInstallBackend, backendGalleries)
+			err = installModelFromRemoteConfig(ctx, systemState, modelLoader, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan, automaticallyInstallBackend, backendGalleries, requireBackendIntegrity)

 		} else {
 			err = gallery.InstallModelFromGallery(
-				ctx, galleries, backendGalleries, systemState, modelLoader, r.ID, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan, automaticallyInstallBackend)
+				ctx, galleries, backendGalleries, systemState, modelLoader, r.ID, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan, automaticallyInstallBackend, requireBackendIntegrity)
 		}
 	}
 	return err
 }

-func ApplyGalleryFromFile(systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, automaticallyInstallBackend bool, galleries []config.Gallery, backendGalleries []config.Gallery, s string) error {
+func ApplyGalleryFromFile(systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, automaticallyInstallBackend bool, galleries []config.Gallery, backendGalleries []config.Gallery, s string, requireBackendIntegrity bool) error {
 	dat, err := os.ReadFile(s)
 	if err != nil {
 		return err
@@ -177,15 +177,15 @@ func ApplyGalleryFromFile(systemState *system.SystemState, modelLoader *model.Mo
 		return err
 	}

-	return processRequests(systemState, modelLoader, enforceScan, automaticallyInstallBackend, galleries, backendGalleries, requests)
+	return processRequests(systemState, modelLoader, enforceScan, automaticallyInstallBackend, galleries, backendGalleries, requests, requireBackendIntegrity)
 }

-func ApplyGalleryFromString(systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, automaticallyInstallBackend bool, galleries []config.Gallery, backendGalleries []config.Gallery, s string) error {
+func ApplyGalleryFromString(systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, automaticallyInstallBackend bool, galleries []config.Gallery, backendGalleries []config.Gallery, s string, requireBackendIntegrity bool) error {
 	var requests []galleryModel
 	err := json.Unmarshal([]byte(s), &requests)
 	if err != nil {
 		return err
 	}

-	return processRequests(systemState, modelLoader, enforceScan, automaticallyInstallBackend, galleries, backendGalleries, requests)
+	return processRequests(systemState, modelLoader, enforceScan, automaticallyInstallBackend, galleries, backendGalleries, requests, requireBackendIntegrity)
 }
--- a/core/services/worker/config.go
+++ b/core/services/worker/config.go
@@ -22,10 +22,11 @@ type Config struct {
 	Addr      string `env:"LOCALAI_ADDR" help:"Address where this worker is reachable (host:port). Port is base for gRPC backends, port-1 for HTTP." group:"server"`
 	ServeAddr string `env:"LOCALAI_SERVE_ADDR" default:"0.0.0.0:50051" help:"(Advanced) gRPC base port bind address" group:"server" hidden:""`

-	BackendsPath       string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends" group:"server"`
-	BackendsSystemPath string `env:"LOCALAI_BACKENDS_SYSTEM_PATH" type:"path" default:"/var/lib/local-ai/backends" help:"Path containing system backends" group:"server"`
-	BackendGalleries   string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"server" default:"${backends}"`
-	ModelsPath         string `env:"LOCALAI_MODELS_PATH,MODELS_PATH" type:"path" default:"${basepath}/models" help:"Path containing models" group:"server"`
+	BackendsPath            string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends" group:"server"`
+	BackendsSystemPath      string `env:"LOCALAI_BACKENDS_SYSTEM_PATH" type:"path" default:"/var/lib/local-ai/backends" help:"Path containing system backends" group:"server"`
+	BackendGalleries        string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"server" default:"${backends}"`
+	ModelsPath              string `env:"LOCALAI_MODELS_PATH,MODELS_PATH" type:"path" default:"${basepath}/models" help:"Path containing models" group:"server"`
+	RequireBackendIntegrity bool   `env:"LOCALAI_REQUIRE_BACKEND_INTEGRITY,REQUIRE_BACKEND_INTEGRITY" help:"If true, reject backend installs without a configured signature verification policy (OCI URIs) or SHA256 (tarball/HTTP URIs)." group:"hardening" default:"false"`

 	// HTTP file transfer
 	HTTPAddr          string `env:"LOCALAI_HTTP_ADDR" default:"" help:"HTTP file transfer server address (default: gRPC port + 1)" group:"server" hidden:""`
--- a/core/services/worker/install.go
+++ b/core/services/worker/install.go
@@ -112,14 +112,14 @@ func (s *backendSupervisor) installBackend(req messaging.BackendInstallRequest,
 		if req.URI != "" {
 			xlog.Info("Installing backend from external URI", "backend", req.Backend, "uri", req.URI, "force", force)
 			if err := galleryop.InstallExternalBackend(
-				context.Background(), galleries, s.systemState, s.ml, nil, req.URI, req.Name, req.Alias,
+				context.Background(), galleries, s.systemState, s.ml, nil, req.URI, req.Name, req.Alias, s.cfg.RequireBackendIntegrity,
 			); err != nil {
 				return "", fmt.Errorf("installing backend from gallery: %w", err)
 			}
 		} else {
 			xlog.Info("Installing backend from gallery", "backend", req.Backend, "force", force)
 			if err := gallery.InstallBackendFromGallery(
-				context.Background(), galleries, s.systemState, s.ml, req.Backend, nil, force,
+				context.Background(), galleries, s.systemState, s.ml, req.Backend, nil, force, s.cfg.RequireBackendIntegrity,
 			); err != nil {
 				return "", fmt.Errorf("installing backend from gallery: %w", err)
 			}
@@ -167,7 +167,7 @@ func (s *backendSupervisor) upgradeBackend(req messaging.BackendUpgradeRequest)
 	if req.URI != "" {
 		xlog.Info("Upgrading backend from external URI", "backend", req.Backend, "uri", req.URI)
 		if err := galleryop.InstallExternalBackend(
-			context.Background(), galleries, s.systemState, s.ml, nil, req.URI, req.Name, req.Alias,
+			context.Background(), galleries, s.systemState, s.ml, nil, req.URI, req.Name, req.Alias, s.cfg.RequireBackendIntegrity,
 		); err != nil {
 			return fmt.Errorf("upgrading backend from external URI: %w", err)
 		}
@@ -175,6 +175,7 @@ func (s *backendSupervisor) upgradeBackend(req messaging.BackendUpgradeRequest)
 		xlog.Info("Upgrading backend from gallery", "backend", req.Backend)
 		if err := gallery.InstallBackendFromGallery(
 			context.Background(), galleries, s.systemState, s.ml, req.Backend, nil, true, /* force */
+			s.cfg.RequireBackendIntegrity,
 		); err != nil {
 			return fmt.Errorf("upgrading backend from gallery: %w", err)
 		}
--- a/core/startup/model_preload.go
+++ b/core/startup/model_preload.go
@@ -21,12 +21,12 @@ import (
 // InstallModels will preload models from the given list of URLs and galleries
 // It will download the model if it is not already present in the model path
 // It will also try to resolve if the model is an embedded model YAML configuration
-func InstallModels(ctx context.Context, galleryService *galleryop.GalleryService, galleries, backendGalleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, autoloadBackendGalleries bool, downloadStatus func(string, string, string, float64), models ...string) error {
+func InstallModels(ctx context.Context, galleryService *galleryop.GalleryService, galleries, backendGalleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, enforceScan, autoloadBackendGalleries, requireBackendIntegrity bool, downloadStatus func(string, string, string, float64), models ...string) error {
 	// create an error that groups all errors
 	var err error
 	for _, url := range models {
 		// Check if it's a model gallery, or print a warning
-		e, found := installModel(ctx, galleries, backendGalleries, url, systemState, modelLoader, downloadStatus, enforceScan, autoloadBackendGalleries)
+		e, found := installModel(ctx, galleries, backendGalleries, url, systemState, modelLoader, downloadStatus, enforceScan, autoloadBackendGalleries, requireBackendIntegrity)
 		if e != nil && found {
 			xlog.Error("[startup] failed installing model", "error", err, "model", url)
 			err = errors.Join(err, e)
@@ -82,7 +82,7 @@ func InstallModels(ctx context.Context, galleryService *galleryop.GalleryService
 	return err
 }

-func installModel(ctx context.Context, galleries, backendGalleries []config.Gallery, modelName string, systemState *system.SystemState, modelLoader *model.ModelLoader, downloadStatus func(string, string, string, float64), enforceScan, autoloadBackendGalleries bool) (error, bool) {
+func installModel(ctx context.Context, galleries, backendGalleries []config.Gallery, modelName string, systemState *system.SystemState, modelLoader *model.ModelLoader, downloadStatus func(string, string, string, float64), enforceScan, autoloadBackendGalleries, requireBackendIntegrity bool) (error, bool) {
 	models, err := gallery.AvailableGalleryModels(galleries, systemState)
 	if err != nil {
 		return err, false
@@ -98,7 +98,7 @@ func installModel(ctx context.Context, galleries, backendGalleries []config.Gall
 	}

 	xlog.Info("installing model", "model", modelName, "license", model.License)
-	err = gallery.InstallModelFromGallery(ctx, galleries, backendGalleries, systemState, modelLoader, modelName, gallery.GalleryModel{}, downloadStatus, enforceScan, autoloadBackendGalleries)
+	err = gallery.InstallModelFromGallery(ctx, galleries, backendGalleries, systemState, modelLoader, modelName, gallery.GalleryModel{}, downloadStatus, enforceScan, autoloadBackendGalleries, requireBackendIntegrity)
 	if err != nil {
 		return err, true
 	}
--- a/core/startup/model_preload_test.go
+++ b/core/startup/model_preload_test.go
@@ -47,7 +47,7 @@ var _ = Describe("Preload test", func() {
 			}, ml)
 			galleryService.Start(ctx, config.NewModelConfigLoader(tmpdir), systemState)

-			err := InstallModels(ctx, galleryService, []config.Gallery{}, []config.Gallery{}, systemState, ml, true, true, func(s1, s2, s3 string, f float64) {
+			err := InstallModels(ctx, galleryService, []config.Gallery{}, []config.Gallery{}, systemState, ml, true, true, false, func(s1, s2, s3 string, f float64) {
 				fmt.Println(s1, s2, s3, f)
 			}, url)
 			Expect(err).ToNot(HaveOccurred())
@@ -67,7 +67,7 @@ var _ = Describe("Preload test", func() {
 			}, ml)
 			galleryService.Start(ctx, config.NewModelConfigLoader(tmpdir), systemState)

-			err := InstallModels(ctx, galleryService, []config.Gallery{}, []config.Gallery{}, systemState, ml, true, true, func(s1, s2, s3 string, f float64) {
+			err := InstallModels(ctx, galleryService, []config.Gallery{}, []config.Gallery{}, systemState, ml, true, true, false, func(s1, s2, s3 string, f float64) {
 				fmt.Println(s1, s2, s3, f)
 			}, url)
 			Expect(err).ToNot(HaveOccurred())
--- a/docs/content/advanced/model-configuration.md
+++ b/docs/content/advanced/model-configuration.md
@@ -316,23 +316,132 @@ These are set via the `options:` array in the model configuration (format: `key:

 #### Speculative Type Values

-| Type | Description |
-|------|-------------|
-| `none` | No speculative decoding (default) |
-| `draft` | Draft model-based speculation (auto-set when `draft_model` is configured) |
-| `eagle3` | EAGLE3 draft model architecture |
-| `ngram_simple` | Simple self-speculative using token history |
-| `ngram_map_k` | N-gram with key-only map |
-| `ngram_map_k4v` | N-gram with keys and 4 m-gram values |
-| `ngram_mod` | Modified n-gram speculation |
-| `ngram_cache` | 3-level n-gram cache |
+The canonical names match upstream llama.cpp (dash-separated). For backward compatibility, LocalAI also accepts the underscore-separated forms and the bare `draft` / `eagle3` aliases.

-Multiple types can be chained by passing a comma-separated list to `spec_type` (e.g. `spec_type:ngram_simple,ngram_mod`). The runtime tries them in order and accepts the first proposal that meets the acceptance criteria.
+| Type | Aliases accepted | Description |
+|------|------------------|-------------|
+| `none` | | No speculative decoding (default) |
+| `draft-simple` | `draft`, `draft_simple` | Draft model-based speculation (auto-set when `draft_model` is configured) |
+| `draft-eagle3` | `eagle3`, `draft_eagle3` | EAGLE3 draft model architecture |
+| `draft-mtp` | `draft_mtp` | Multi-Token Prediction. Reuses the target model's embedded MTP head; no separate draft GGUF required (`draft_model` can be omitted). |
+| `ngram-simple` | `ngram_simple` | Simple self-speculative using token history |
+| `ngram-map-k` | `ngram_map_k` | N-gram with key-only map |
+| `ngram-map-k4v` | `ngram_map_k4v` | N-gram with keys and 4 m-gram values |
+| `ngram-mod` | `ngram_mod` | Modified n-gram speculation |
+| `ngram-cache` | `ngram_cache` | 3-level n-gram cache |
+
+Multiple types can be chained by passing a comma-separated list to `spec_type` (e.g. `spec_type:ngram-simple,ngram-mod`). The runtime tries them in order and accepts the first proposal that meets the acceptance criteria.

 {{% notice note %}}
 Speculative decoding is automatically disabled when multimodal models (with `mmproj`) are active. The `n_draft` parameter can also be overridden per-request.
 {{% /notice %}}

+##### Multi-Token Prediction (MTP)
+
+`draft-mtp` enables [Multi-Token Prediction](https://github.com/ggml-org/llama.cpp/pull/22673) (ggml-org/llama.cpp#22673). MTP uses a small prediction head trained into the target model: the head runs alongside the main forward pass and proposes the next few tokens, which the target then verifies in a single batched step. Upstream reports ~1.85x-2.1x token throughput at ~72-82% draft acceptance on Qwen3.6 27B / 35B A3B.
+
+**Auto-detection (default).** When a GGUF declares an MTP head (the upstream `<arch>.nextn_predict_layers` metadata key, set by `convert_hf_to_gguf.py` for Qwen3.5/3.6 family models and similar), LocalAI auto-enables MTP with the following defaults:
+
+```yaml
+options:
+  - spec_type:draft-mtp
+  - spec_n_max:6
+  - spec_p_min:0.75
+```
+
+Detection runs both at **import time** (the `/import-model` UI / `POST /models/import-uri` flow range-fetches the GGUF header and writes the options into the generated YAML before you save it) and at **load time** (every llama-cpp model start re-checks the local header and appends the options if `spec_type` isn't already set). To opt out, set an explicit `spec_type:` / `speculative_type:` in your YAML: auto-detection always preserves the user-supplied value, including `spec_type:none`.
+
+**Two ways to load the MTP head:**
+
+1. **Embedded in the target GGUF** (the recommended path for LocalAI, and what auto-detection assumes). When `spec_type` includes `draft-mtp` and `draft_model` is empty, the backend builds the MTP draft context directly from the target model's weights. The GGUF must have been converted with the MTP tensors included.
+2. **Separate `mtp-*.gguf` sibling file.** If you point `draft_model` at the separate MTP-head GGUF that ships next to the main weights on HuggingFace, the backend will load it as a draft model. Note: upstream's `-hf` auto-discovery of `mtp-*.gguf` siblings is **not** wired into LocalAI's gRPC layer - you need to download the sibling file and configure `draft_model` explicitly.
+
+**Manual override knobs** (these overlap with the auto-detected defaults above):
+
+| Option | Recommended | Notes |
+|--------|------------|-------|
+| `spec_type` | `draft-mtp` | Activates MTP. Can be chained with other types (see below). |
+| `spec_n_max` / `draft_max` | `2`-`6` | Number of draft tokens per step. Upstream's PR suggests 2-3 for the tightest acceptance window; LocalAI's auto-default is 6 to favour throughput on models with high acceptance. |
+| `spec_p_min` | `0.75` | Pinned because upstream marks the current default with a "change to 0.0f" TODO; locking it here keeps acceptance thresholds stable across future llama.cpp bumps. |
+| `mmproj_use_gpu` | `false` (or unset `mmproj`) | MTP has a prompt-processing overhead; if the model is non-vision, drop the mmproj entirely to save VRAM. |
+
+**Minimal config** (override-only, since auto-detection already covers this for MTP-capable GGUFs):
+
+```yaml
+name: qwen3-mtp
+backend: llama-cpp
+parameters:
+  model: qwen3-27b-with-mtp.gguf
+options:
+  - spec_type:draft-mtp
+  - spec_n_max:3
+```
+
+**With a separate MTP head file:**
+
+```yaml
+name: qwen3-mtp
+backend: llama-cpp
+parameters:
+  model: qwen3-27b.gguf
+  draft_model: qwen3-27b-mtp-head.gguf
+options:
+  - spec_type:draft-mtp
+  - spec_n_max:3
+```
+
+**Chaining MTP with n-gram fallback** (experimental, from the PR's usage notes - useful when MTP acceptance drops on highly repetitive output):
+
+```yaml
+options:
+  - spec_type:draft-mtp,ngram-mod
+  - spec_n_max:3
+  - spec_ngram_mod_n_match:24
+```
+
+Pre-converted GGUFs with MTP heads are published on the [ggml-org HuggingFace org](https://huggingface.co/ggml-org) (initially Qwen3.6 27B and Qwen3.6 35B A3B).
+
+### Reasoning Models (DeepSeek-R1, Qwen3, etc.)
+
+These load-time options control how the backend parses `<think>` reasoning blocks and how much budget the model is allowed for thinking. They are set per model via the `options:` array.
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `reasoning_format` | string | `deepseek` | Parser for reasoning/thinking blocks. One of `none`, `auto`, `deepseek`, `deepseek-legacy` (alias `deepseek_legacy`). |
+| `enable_reasoning` / `reasoning_budget` | int | `-1` | Reasoning budget in tokens: `-1` unlimited, `0` disabled, `>0` token cap for the thinking section. |
+| `prefill_assistant` | bool | `true` | When `false`, the trailing assistant message is not pre-filled by the chat template. |
+
+{{% notice note %}}
+This is the load-time reasoning configuration. The orthogonal per-request `enable_thinking` chat-template kwarg (set via the YAML `reasoning.disable` field) toggles thinking on/off per call without restarting the model.
+{{% /notice %}}
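+
+For illustration, a cap on thinking could be configured as follows. The model and file names are hypothetical, and the 2048-token budget is an arbitrary example value, not a recommendation:
+
+```yaml
+name: qwen3-reasoning               # hypothetical model name
+backend: llama-cpp
+parameters:
+  model: qwen3-27b.gguf             # hypothetical file name
+options:
+  - reasoning_format:deepseek       # the documented default parser, set explicitly
+  - reasoning_budget:2048           # cap the thinking section at 2048 tokens (example value)
+```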
+
+### Multimodal Backend Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `mmproj_use_gpu` / `mmproj_offload` | bool | `true` | Set `false` to keep the multimodal projector on CPU (saves VRAM at cost of speed). |
+| `image_min_tokens` | int | `-1` | Minimum vision tokens per image. `-1` keeps the model default. |
+| `image_max_tokens` | int | `-1` | Maximum vision tokens per image. `-1` keeps the model default. |
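+
+A VRAM-constrained setup might combine these options as sketched below; the 1024-token cap is an arbitrary example value:
+
+```yaml
+options:
+  - mmproj_use_gpu:false            # keep the multimodal projector on CPU
+  - image_max_tokens:1024           # example cap on vision tokens per image
+```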
+
+### Embedding & Reranking Backend Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `pooling_type` / `pooling` | string | auto | Pooling strategy for embeddings: `none`, `mean`, `cls`, `last`, `rank`. Reranking automatically uses `rank`. |
+| `embd_normalize` / `embedding_normalize` | int | `2` | Normalization: `-1` none, `0` max-abs, `1` taxicab (L1), `2` Euclidean (L2), `>2` p-norm. |
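+
+As an example, an embedding model could pin mean pooling with Euclidean normalization like this (values chosen for illustration; the normalization shown is already the default):
+
+```yaml
+options:
+  - pooling_type:mean               # one of none, mean, cls, last, rank
+  - embd_normalize:2                # 2 = Euclidean (L2)
+```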
+
+### Other Backend Tuning Options
+
+These llama.cpp options are passed through the `options:` array.
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `n_ubatch` / `ubatch` | int | same as `batch` | Physical batch size. Decouple from `n_batch` when an embedding/rerank workload needs a different value. |
+| `threads_batch` / `n_threads_batch` | int | same as `threads` | Threads used during prompt processing. `<= 0` means `hardware_concurrency()`. |
+| `direct_io` / `use_direct_io` | bool | `false` | Open the model with `O_DIRECT` (faster cold loads on NVMe; ignored if not supported). |
+| `verbosity` | int | `3` | llama.cpp internal log verbosity threshold. Higher = more verbose. |
+| `override_tensor` / `tensor_buft_overrides` | string | "" | Per-tensor buffer-type overrides for the main model. Format: `<tensor regex>=<buffer type>,<tensor regex>=<buffer type>,...`. Mirrors the existing `draft_override_tensor` syntax for the draft model. |
+
 ### Prompt Caching

 | Field | Type | Description |
--- a/docs/data/version.json
+++ b/docs/data/version.json
@@ -1,3 +1,3 @@
 {
-  "version": "v4.2.3"
+  "version": "v4.2.6"
 }
--- a/flake.lock
+++ b/flake.lock
@@ -0,0 +1,40 @@
+{
+  "nodes": {
+    "inference-defaults": {
+      "flake": false,
+      "locked": {
+        "narHash": "sha256-ygWIkY2xiUEWqAZQM4/0vBz8vWd/RKX5VBj7EHovU14=",
+        "type": "file",
+        "url": "https://raw.githubusercontent.com/unslothai/unsloth/main/studio/backend/assets/configs/inference_defaults.json"
+      },
+      "original": {
+        "type": "file",
+        "url": "https://raw.githubusercontent.com/unslothai/unsloth/main/studio/backend/assets/configs/inference_defaults.json"
+      }
+    },
+    "nixpkgs": {
+      "locked": {
+        "lastModified": 1777578337,
+        "narHash": "sha256-Ad49moKWeXtKBJNy2ebiTQUEgdLyvGmTeykAQ9xM+Z4=",
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "15f4ee454b1dce334612fa6843b3e05cf546efab",
+        "type": "github"
+      },
+      "original": {
+        "owner": "NixOS",
+        "ref": "nixos-unstable",
+        "repo": "nixpkgs",
+        "type": "github"
+      }
+    },
+    "root": {
+      "inputs": {
+        "inference-defaults": "inference-defaults",
+        "nixpkgs": "nixpkgs"
+      }
+    }
+  },
+  "root": "root",
+  "version": 7
+}
--- a/flake.nix
+++ b/flake.nix
@@ -0,0 +1,61 @@
+# Made by Azteczek
+{
+  description = "LocalAI flake";
+
+  inputs = {
+    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
+    inference-defaults = {
+      url = "https://raw.githubusercontent.com/unslothai/unsloth/main/studio/backend/assets/configs/inference_defaults.json";
+      flake = false;
+    };
+  };
+
+  outputs = { self, nixpkgs, inference-defaults }:
+    let
+      system = "x86_64-linux";
+      pkgs = nixpkgs.legacyPackages.${system};
+    in {
+      packages.${system}.default = pkgs.buildGoModule {
+        pname = "localai";
+        version = "custom";
+
+        src = ./sources;
+        proxyVendor = true;
+        vendorHash = "sha256-MdadwbUc2pwfpC9ScsiIfjGIcAOgcwSm6rt/KNlTIuA=";
+
+        nativeBuildInputs = with pkgs; [ 
+          pkg-config cmake gcc protobuf go-protobuf protoc-gen-go protoc-gen-go-grpc
+        ];
+
+        env = {
+          CGO_ENABLED = "0";
+        };
+
+        preBuild = ''
+          
+          PROTO_SOURCE_DIR=$(find . -name "*.proto" -printf "%h" -quit)
+          mkdir -p pkg/grpc/proto
+          ${pkgs.protobuf}/bin/protoc \
+            -I=$PROTO_SOURCE_DIR \
+            -I. \
+            --go_out=pkg/grpc/proto --go_opt=paths=source_relative \
+            --go-grpc_out=pkg/grpc/proto --go-grpc_opt=paths=source_relative \
+            $PROTO_SOURCE_DIR/*.proto
+
+          go mod edit -replace github.com/mudler/LocalAI/pkg/grpc/proto=./pkg/grpc/proto
+          
+          mkdir -p core/config/gen_inference_defaults
+          cp ${inference-defaults} core/config/gen_inference_defaults/inference_defaults.json
+          sed -i '/go:generate/d' core/config/inference_defaults.go || true
+
+        '';
+
+        subPackages = [ "cmd/local-ai" ];
+        doCheck = false;
+
+        postInstall = ''
+          if [ -f "$out/bin/local-ai" ]; then mv "$out/bin/local-ai" "$out/bin/localai"; fi
+        '';
+      };
+    };
+}
--- a/go.mod
+++ b/go.mod
@@ -55,6 +55,7 @@ require (
 	github.com/sashabaranov/go-openai v1.41.2
 	github.com/schollz/progressbar/v3 v3.19.0
 	github.com/shirou/gopsutil/v3 v3.24.5
+	github.com/sigstore/sigstore-go v1.1.4
 	github.com/streamer45/silero-vad-go v0.2.1
 	github.com/swaggo/echo-swagger v1.5.2
 	github.com/swaggo/swag v1.16.6
@@ -78,6 +79,7 @@ require (
 require (
 	filippo.io/bigmod v0.1.1-0.20260103110540-f8a47775ebe5 // indirect
 	filippo.io/keygen v0.0.0-20260114151900-8e2790ea4c5b // indirect
+	github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 // indirect
 	github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.9 // indirect
 	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.22 // indirect
 	github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.22 // indirect
@@ -93,27 +95,67 @@ require (
 	github.com/aws/aws-sdk-go-v2/service/sts v1.42.0 // indirect
 	github.com/aws/smithy-go v1.25.0 // indirect
 	github.com/bahlo/generic-list-go v0.2.0 // indirect
+	github.com/blang/semver v3.5.1+incompatible // indirect
 	github.com/buger/jsonparser v1.1.2 // indirect
+	github.com/cenkalti/backoff/v5 v5.0.3 // indirect
+	github.com/cyberphone/json-canonicalization v0.0.0-20241213102144-19d51d7fe467 // indirect
+	github.com/digitorus/pkcs7 v0.0.0-20230818184609-3a137a874352 // indirect
+	github.com/digitorus/timestamp v0.0.0-20231217203849-220c5c2851b7 // indirect
 	github.com/dunglas/httpsfv v1.1.0 // indirect
 	github.com/filecoin-project/go-clock v0.1.0 // indirect
 	github.com/go-jose/go-jose/v4 v4.1.4 // indirect
+	github.com/go-openapi/analysis v0.24.1 // indirect
+	github.com/go-openapi/errors v0.22.4 // indirect
+	github.com/go-openapi/loads v0.23.2 // indirect
+	github.com/go-openapi/runtime v0.29.2 // indirect
+	github.com/go-openapi/strfmt v0.25.0 // indirect
+	github.com/go-openapi/swag/cmdutils v0.25.4 // indirect
+	github.com/go-openapi/swag/conv v0.25.4 // indirect
+	github.com/go-openapi/swag/fileutils v0.25.4 // indirect
+	github.com/go-openapi/swag/jsonname v0.25.4 // indirect
+	github.com/go-openapi/swag/jsonutils v0.25.4 // indirect
+	github.com/go-openapi/swag/loading v0.25.4 // indirect
+	github.com/go-openapi/swag/mangling v0.25.4 // indirect
+	github.com/go-openapi/swag/netutils v0.25.4 // indirect
+	github.com/go-openapi/swag/stringutils v0.25.4 // indirect
+	github.com/go-openapi/swag/typeutils v0.25.4 // indirect
+	github.com/go-openapi/swag/yamlutils v0.25.4 // indirect
+	github.com/go-openapi/validate v0.25.1 // indirect
+	github.com/go-viper/mapstructure/v2 v2.4.0 // indirect
+	github.com/google/certificate-transparency-go v1.3.2 // indirect
+	github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.7 // indirect
+	github.com/in-toto/attestation v1.1.2 // indirect
+	github.com/in-toto/in-toto-golang v0.9.0 // indirect
 	github.com/invopop/jsonschema v0.13.0 // indirect
 	github.com/jinzhu/inflection v1.0.0 // indirect
 	github.com/jinzhu/now v1.1.5 // indirect
 	github.com/jolestar/go-commons-pool/v2 v2.1.2 // indirect
 	github.com/klippa-app/go-pdfium v1.19.2 // indirect
-	github.com/mattn/go-sqlite3 v1.14.24 // indirect
+	github.com/mattn/go-sqlite3 v1.14.28 // indirect
 	github.com/moby/moby/api v1.54.1 // indirect
 	github.com/moby/moby/client v0.4.0 // indirect
 	github.com/nats-io/nkeys v0.4.15 // indirect
 	github.com/nats-io/nuid v1.0.1 // indirect
+	github.com/oklog/ulid v1.3.1 // indirect
+	github.com/secure-systems-lab/go-securesystemslib v0.9.1 // indirect
+	github.com/shibumi/go-pathspec v1.3.0 // indirect
+	github.com/sigstore/protobuf-specs v0.5.1 // indirect
+	github.com/sigstore/rekor v1.4.3 // indirect
+	github.com/sigstore/rekor-tiles/v2 v2.0.1 // indirect
+	github.com/sigstore/sigstore v1.10.0 // indirect
+	github.com/sigstore/timestamp-authority/v2 v2.0.3 // indirect
 	github.com/standard-webhooks/standard-webhooks/libraries v0.0.0-20260508151727-1282bb917829 // indirect
 	github.com/stretchr/testify v1.11.1 // indirect
 	github.com/sv-tools/openapi v0.2.1 // indirect
 	github.com/swaggo/swag/v2 v2.0.0-rc4 // indirect
 	github.com/tetratelabs/wazero v1.11.0 // indirect
+	github.com/theupdateframework/go-tuf/v2 v2.3.0 // indirect
 	github.com/tmc/langchaingo v0.1.14 // indirect
+	github.com/transparency-dev/formats v0.0.0-20251017110053-404c0d5b696c // indirect
+	github.com/transparency-dev/merkle v0.0.2 // indirect
 	github.com/wk8/go-ordered-map/v2 v2.1.8 // indirect
+	go.mongodb.org/mongo-driver v1.17.6 // indirect
+	google.golang.org/genproto/googleapis/api v0.0.0-20260128011058-8636f8732409 // indirect
 	sigs.k8s.io/yaml v1.6.0 // indirect
 )

@@ -163,7 +205,7 @@ require (
 	github.com/gocolly/colly v1.2.0 // indirect
 	github.com/gofiber/fiber/v2 v2.52.13 // indirect
 	github.com/golang/protobuf v1.5.4 // indirect
-	github.com/gomarkdown/markdown v0.0.0-20250311123330-531bef5e742b // indirect
+	github.com/gomarkdown/markdown v0.0.0-20260411013819-759bbc3e3207 // indirect
 	github.com/google/go-github/v69 v69.2.0 // indirect
 	github.com/google/go-querystring v1.1.0 // indirect
 	github.com/jackc/pgpassfile v1.0.0 // indirect
@@ -332,10 +374,10 @@ require (
 	github.com/go-logr/logr v1.4.3 // indirect
 	github.com/go-logr/stdr v1.2.2 // indirect
 	github.com/go-ole/go-ole v1.3.0 // indirect
-	github.com/go-openapi/jsonpointer v0.21.0 // indirect
-	github.com/go-openapi/jsonreference v0.21.0 // indirect
-	github.com/go-openapi/spec v0.21.0 // indirect
-	github.com/go-openapi/swag v0.23.0 // indirect
+	github.com/go-openapi/jsonpointer v0.22.1 // indirect
+	github.com/go-openapi/jsonreference v0.21.3 // indirect
+	github.com/go-openapi/spec v0.22.1 // indirect
+	github.com/go-openapi/swag v0.25.4 // indirect
 	github.com/gogo/protobuf v1.3.2 // indirect
 	github.com/golang/groupcache v0.0.0-20241129210726-2c02b8208cf8 // indirect
 	github.com/golang/snappy v0.0.5-0.20231225225746-43d5d4cd4e0e // indirect
@@ -358,9 +400,8 @@ require (
 	github.com/jackpal/go-nat-pmp v1.0.2 // indirect
 	github.com/jaypipes/pcidb v1.1.1 // indirect
 	github.com/jbenet/go-temp-err-catcher v0.1.0 // indirect
-	github.com/josharian/intern v1.0.0 // indirect
-	github.com/klauspost/compress v1.18.5 // indirect
-	github.com/klauspost/pgzip v1.2.5 // indirect
+	github.com/klauspost/compress v1.18.5
+	github.com/klauspost/pgzip v1.2.6 // indirect
 	github.com/koron/go-ssdp v0.0.6 // indirect
 	github.com/libp2p/go-buffer-pool v0.1.0 // indirect
 	github.com/libp2p/go-cidranger v1.1.0 // indirect
@@ -377,7 +418,7 @@ require (
 	github.com/libp2p/zeroconf/v2 v2.2.0 // indirect
 	github.com/lucasb-eyer/go-colorful v1.3.0 // indirect
 	github.com/lufia/plan9stats v0.0.0-20250317134145-8bc96cf8fc35 // indirect
-	github.com/mailru/easyjson v0.7.7 // indirect
+	github.com/mailru/easyjson v0.9.0 // indirect
 	github.com/marten-seemann/tcp v0.0.0-20210406111302-dfbc87cc63fd // indirect
 	github.com/mattn/go-colorable v0.1.14 // indirect
 	github.com/mattn/go-isatty v0.0.20 // indirect
@@ -432,7 +473,7 @@ require (
 	github.com/smallnest/ringbuffer v0.0.0-20241116012123-461381446e3d // indirect
 	github.com/songgao/packets v0.0.0-20160404182456-549a10cd4091 // indirect
 	github.com/spaolacci/murmur3 v1.1.0 // indirect
-	github.com/spf13/cast v1.7.0 // indirect
+	github.com/spf13/cast v1.10.0 // indirect
 	github.com/tklauser/go-sysconf v0.3.16 // indirect
 	github.com/tklauser/numcpus v0.11.0 // indirect
 	github.com/ulikunitz/xz v0.5.14 // indirect
--- a/go.sum
+++ b/go.sum
@@ -18,15 +18,29 @@ cloud.google.com/go v0.74.0/go.mod h1:VV1xSbzvo+9QJOxLDaJfTjx5e+MePCpCWwvftOeQmW
 cloud.google.com/go v0.78.0/go.mod h1:QjdrLG0uq+YwhjoVOLsS1t7TW8fs36kLs4XO5R5ECHg=
 cloud.google.com/go v0.79.0/go.mod h1:3bzgcEeQlzbuEAYu4mrWhKqWjmpprinYgKJLgKHnbb8=
 cloud.google.com/go v0.81.0/go.mod h1:mk/AM35KwGk/Nm2YSeZbxXdrNK3KZOYHmLkOqC2V6E0=
+cloud.google.com/go v0.121.6 h1:waZiuajrI28iAf40cWgycWNgaXPO06dupuS+sgibK6c=
+cloud.google.com/go v0.121.6/go.mod h1:coChdst4Ea5vUpiALcYKXEpR1S9ZgXbhEzzMcMR66vI=
+cloud.google.com/go/auth v0.17.0 h1:74yCm7hCj2rUyyAocqnFzsAYXgJhrG26XCFimrc/Kz4=
+cloud.google.com/go/auth v0.17.0/go.mod h1:6wv/t5/6rOPAX4fJiRjKkJCvswLwdet7G8+UGXt7nCQ=
+cloud.google.com/go/auth/oauth2adapt v0.2.8 h1:keo8NaayQZ6wimpNSmW5OPc283g65QNIiLpZnkHRbnc=
+cloud.google.com/go/auth/oauth2adapt v0.2.8/go.mod h1:XQ9y31RkqZCcwJWNSx2Xvric3RrU88hAYYbjDWYDL+c=
 cloud.google.com/go/bigquery v1.0.1/go.mod h1:i/xbL2UlR5RvWAURpBYZTtm/cXjCha9lbfbpx4poX+o=
 cloud.google.com/go/bigquery v1.3.0/go.mod h1:PjpwJnslEMmckchkHFfq+HTD2DmtT67aNFKH1/VBDHE=
 cloud.google.com/go/bigquery v1.4.0/go.mod h1:S8dzgnTigyfTmLBfrtrhyYhwRxG72rYxvftPBK2Dvzc=
 cloud.google.com/go/bigquery v1.5.0/go.mod h1:snEHRnqQbz117VIFhE8bmtwIDY80NLUZUMb4Nv6dBIg=
 cloud.google.com/go/bigquery v1.7.0/go.mod h1://okPTzCYNXSlb24MZs83e2Do+h+VXtc4gLoIoXIAPc=
 cloud.google.com/go/bigquery v1.8.0/go.mod h1:J5hqkt3O0uAFnINi6JXValWIb1v0goeZM77hZzJN/fQ=
+cloud.google.com/go/compute/metadata v0.9.0 h1:pDUj4QMoPejqq20dK0Pg2N4yG9zIkYGdBtwLoEkH9Zs=
+cloud.google.com/go/compute/metadata v0.9.0/go.mod h1:E0bWwX5wTnLPedCKqk3pJmVgCBSM6qQI1yTBdEb3C10=
 cloud.google.com/go/datastore v1.0.0/go.mod h1:LXYbyblFSglQ5pkeyhO+Qmw7ukd3C+pD7TKLgZqpHYE=
 cloud.google.com/go/datastore v1.1.0/go.mod h1:umbIZjpQpHh4hmRpGhH4tLFup+FVzqBi1b3c64qFpCk=
 cloud.google.com/go/firestore v1.1.0/go.mod h1:ulACoGHTpvq5r8rxGJ4ddJZBZqakUQqClKRT5SZwBmk=
+cloud.google.com/go/iam v1.5.3 h1:+vMINPiDF2ognBJ97ABAYYwRgsaqxPbQDlMnbHMjolc=
+cloud.google.com/go/iam v1.5.3/go.mod h1:MR3v9oLkZCTlaqljW6Eb2d3HGDGK5/bDv93jhfISFvU=
+cloud.google.com/go/kms v1.23.2 h1:4IYDQL5hG4L+HzJBhzejUySoUOheh3Lk5YT4PCyyW6k=
+cloud.google.com/go/kms v1.23.2/go.mod h1:rZ5kK0I7Kn9W4erhYVoIRPtpizjunlrfU4fUkumUp8g=
+cloud.google.com/go/longrunning v0.6.7 h1:IGtfDWHhQCgCjwQjV9iiLnUta9LBCo8R9QmAFsS/PrE=
+cloud.google.com/go/longrunning v0.6.7/go.mod h1:EAFV3IZAKmM56TyiE6VAP3VoTzhZzySwI/YI1s/nRsY=
 cloud.google.com/go/pubsub v1.0.1/go.mod h1:R0Gpsv3s54REJCy4fxDixWD93lHJMoZTyQ2kNxGRt3I=
 cloud.google.com/go/pubsub v1.1.0/go.mod h1:EwwdRX2sKPjnvnqCa270oGRyludottCI76h+R3AArQw=
 cloud.google.com/go/pubsub v1.2.0/go.mod h1:jhfEVHT8odbXTkndysNHCcx0awwzvfOlguIAii9o8iA=
@@ -41,6 +55,8 @@ dario.cat/mergo v1.0.2/go.mod h1:E/hbnu0NxMFBjpMIE34DRGLWqDy0g5FuKDhCb31ngxA=
 dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9/go.mod h1:H6x//7gZCb22OMCxBHrMx7a5I7Hp++hsVxbQ4BYO7hU=
 filippo.io/bigmod v0.1.1-0.20260103110540-f8a47775ebe5 h1:JA0fFr+kxpqTdxR9LOBiTWpGNchqmkcsgmdeJZRclZ0=
 filippo.io/bigmod v0.1.1-0.20260103110540-f8a47775ebe5/go.mod h1:OjOXDNlClLblvXdwgFFOQFJEocLhhtai8vGLy0JCZlI=
+filippo.io/edwards25519 v1.1.0 h1:FNf4tywRC1HmFuKW5xopWpigGjJKiJSV0Cqo0cJWDaA=
+filippo.io/edwards25519 v1.1.0/go.mod h1:BxyFTGdWcka3PhytdK4V28tE5sGfRvvvRV7EaN4VDT4=
 filippo.io/keygen v0.0.0-20260114151900-8e2790ea4c5b h1:REI1FbdW71yO56Are4XAxD+OS/e+BQsB3gE4mZRQEXY=
 filippo.io/keygen v0.0.0-20260114151900-8e2790ea4c5b/go.mod h1:9nnw1SlYHYuPSo/3wjQzNjSbeHlq2NsKo5iEtfJPWP0=
 fyne.io/fyne/v2 v2.7.3 h1:xBT/iYbdnNHONWO38fZMBrVBiJG8rV/Jypmy4tVfRWE=
@@ -49,8 +65,22 @@ fyne.io/systray v1.12.0 h1:CA1Kk0e2zwFlxtc02L3QFSiIbxJ/P0n582YrZHT7aTM=
 fyne.io/systray v1.12.0/go.mod h1:RVwqP9nYMo7h5zViCBHri2FgjXF7H2cub7MAq4NSoLs=
 github.com/AdaLogics/go-fuzz-headers v0.0.0-20240806141605-e8a1dd7889d6 h1:He8afgbRMd7mFxO99hRNu+6tazq8nFF9lIwo9JFroBk=
 github.com/AdaLogics/go-fuzz-headers v0.0.0-20240806141605-e8a1dd7889d6/go.mod h1:8o94RPi1/7XTJvwPpRSzSUedZrtlirdB3r9Z20bi2f8=
+github.com/AdamKorcz/go-fuzz-headers-1 v0.0.0-20230919221257-8b5d3ce2d11d h1:zjqpY4C7H15HjRPEenkS4SAn3Jy2eRRjkjZbGR30TOg=
+github.com/AdamKorcz/go-fuzz-headers-1 v0.0.0-20230919221257-8b5d3ce2d11d/go.mod h1:XNqJ7hv2kY++g8XEHREpi+JqZo3+0l+CH2egBVN4yqM=
+github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0 h1:JXg2dwJUmPB9JmtVmdEB16APJ7jurfbY5jnfXpJoRMc=
+github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0/go.mod h1:YD5h/ldMsG0XiIw7PdyNhLxaM317eFh5yNLccNfGdyw=
+github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1 h1:Hk5QBxZQC1jb2Fwj6mpzme37xbCDdNTxU7O9eb5+LB4=
+github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1/go.mod h1:IYus9qsFobWIc2YVwe/WPjcnyCkPKtnHAqUYeebc8z0=
+github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 h1:9iefClla7iYpfYWdzPCRDozdmndjTm8DXdpCzPajMgA=
+github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2/go.mod h1:XtLgD3ZD34DAaVIIAyG3objl5DynM3CQ/vMcbBNJZGI=
+github.com/Azure/azure-sdk-for-go/sdk/security/keyvault/azkeys v1.4.0 h1:E4MgwLBGeVB5f2MdcIVD3ELVAWpr+WD6MUe1i+tM/PA=
+github.com/Azure/azure-sdk-for-go/sdk/security/keyvault/azkeys v1.4.0/go.mod h1:Y2b/1clN4zsAoUd/pgNAQHjLDnTis/6ROkUfyob6psM=
+github.com/Azure/azure-sdk-for-go/sdk/security/keyvault/internal v1.2.0 h1:nCYfgcSyHZXJI8J0IWE5MsCGlb2xp9fJiXyxWgmOFg4=
+github.com/Azure/azure-sdk-for-go/sdk/security/keyvault/internal v1.2.0/go.mod h1:ucUjca2JtSZboY8IoUqyQyuuXvwbMBVwFOm0vdQPNhA=
 github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c h1:udKWzYgxTojEKWjV8V+WSxDXJ4NFATAsZjh8iIbsQIg=
 github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
+github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0 h1:XRzhVemXdgvJqCH0sFfrBUTnUJSBrBf7++ypk+twtRs=
+github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0/go.mod h1:HKpQxkWaGLJ+D/5H8QRpyQXA1eKjxkFlOMwck5+33Jk=
 github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
 github.com/BurntSushi/toml v1.5.0 h1:W5quZX/G/csjUnuI8SUYlsHs9M38FC7znL0lIO+DvMg=
 github.com/BurntSushi/toml v1.5.0/go.mod h1:ukJfTF/6rtPPRCnwkur4qwRxa8vTRFBF0uk2lLoLwho=
@@ -86,6 +116,8 @@ github.com/alecthomas/kong v1.14.0 h1:gFgEUZWu2ZmZ+UhyZ1bDhuutbKN1nTtJTwh19Wsn21
 github.com/alecthomas/kong v1.14.0/go.mod h1:wrlbXem1CWqUV5Vbmss5ISYhsVPkBb1Yo7YKJghju2I=
 github.com/alecthomas/repr v0.5.2 h1:SU73FTI9D1P5UNtvseffFSGmdNci/O6RsqzeXJtP0Qs=
 github.com/alecthomas/repr v0.5.2/go.mod h1:Fr0507jx4eOXV7AlPV6AVZLYrLIuIeSOWtW57eE/O/4=
+github.com/alessio/shellescape v1.4.1 h1:V7yhSDDn8LP4lc4jS8pFkt0zCnzVJlG5JXy9BVKJUX0=
+github.com/alessio/shellescape v1.4.1/go.mod h1:PZAiSCk0LJaZkiCSkPv8qIobYglO3FPpyFjDCtHLS30=
 github.com/andybalholm/brotli v1.0.1/go.mod h1:loMXtMfwqflxFJPmdbJO0a3KNoPuLBgiu3qAvBg8x/Y=
 github.com/andybalholm/brotli v1.2.0 h1:ukwgCxwYrmACq68yiUqwIWnGY0cTPox/M94sVwToPjQ=
 github.com/andybalholm/brotli v1.2.0/go.mod h1:rzTDkvFWvIrjDXZHkuS16NPggd91W3kUSvPlQ1pLaKY=
@@ -108,6 +140,10 @@ github.com/armon/go-metrics v0.0.0-20180917152333-f0300d1749da/go.mod h1:Q73ZrmV
 github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310/go.mod h1:ufUuZ+zHj4x4TnLV4JWEpy2hxWSpsRywHrMgIH9cCH8=
 github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5 h1:0CwZNZbxp69SHPdPJAN/hZIm0C4OItdklCFmMRWYpio=
 github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5/go.mod h1:wHh0iHkYZB8zMSxRWpUBQtwG5a7fFgvEO+odwuTv2gs=
+github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 h1:DklsrG3dyBCFEj5IhUbnKptjxatkF07cF2ak3yi77so=
+github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2/go.mod h1:WaHUgvxTVq04UNunO+XhnAqY/wQc+bxr74GqbsZ/Jqw=
+github.com/aws/aws-sdk-go v1.55.7 h1:UJrkFq7es5CShfBwlWAC8DA077vp8PyVbQd3lqLiztE=
+github.com/aws/aws-sdk-go v1.55.7/go.mod h1:eRwEWoyTWFMVYVQzKMNHWP5/RV4xIUGMQfXQHfHkpNU=
 github.com/aws/aws-sdk-go-v2 v1.41.6 h1:1AX0AthnBQzMx1vbmir3Y4WsnJgiydmnJjiLu+LvXOg=
 github.com/aws/aws-sdk-go-v2 v1.41.6/go.mod h1:dy0UzBIfwSeot4grGvY1AqFWN5zgziMmWGzysDnHFcQ=
 github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.9 h1:adBsCIIpLbLmYnkQU+nAChU5yhVTvu5PerROm+/Kq2A=
@@ -132,6 +168,8 @@ github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.22 h1:PUmZeJU6
 github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.22/go.mod h1:nO6egFBoAaoXze24a2C0NjQCvdpk8OueRoYimvEB9jo=
 github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.22 h1:SE+aQ4DEqG53RRCAIHlCf//B2ycxGH7jFkpnAh/kKPM=
 github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.22/go.mod h1:ES3ynECd7fYeJIL6+oax+uIEljmfps0S70BaQzbMd/o=
+github.com/aws/aws-sdk-go-v2/service/kms v1.48.2 h1:aL8Y/AbB6I+uw0MjLbdo68NQ8t5lNs3CY3S848HpETk=
+github.com/aws/aws-sdk-go-v2/service/kms v1.48.2/go.mod h1:VJcNH6BLr+3VJwinRKdotLOMglHO8mIKlD3ea5c7hbw=
 github.com/aws/aws-sdk-go-v2/service/s3 v1.99.1 h1:kU/eBN5+MWNo/LcbNa4hWDdN76hdcd7hocU5kvu7IsU=
 github.com/aws/aws-sdk-go-v2/service/s3 v1.99.1/go.mod h1:Fw9aqhJicIVee1VytBBjH+l+5ov6/PhbtIK/u3rt/ls=
 github.com/aws/aws-sdk-go-v2/service/signin v1.0.10 h1:a1Fq/KXn75wSzoJaPQTgZO0wHGqE9mjFnylnqEPTchA=
@@ -161,6 +199,8 @@ github.com/bits-and-blooms/bitset v1.12.0/go.mod h1:7hO7Gc7Pp1vODcmWvKMRA9BNmbv6
 github.com/bits-and-blooms/bitset v1.24.0 h1:H4x4TuulnokZKvHLfzVRTHJfFfnHEeSYJizujEZvmAM=
 github.com/bits-and-blooms/bitset v1.24.0/go.mod h1:7hO7Gc7Pp1vODcmWvKMRA9BNmbv6a/7QIWpPxHddWR8=
 github.com/bketelsen/crypt v0.0.4/go.mod h1:aI6NrJ0pMGgvZKL1iVgXLnfIFJtfV+bKCoqOes/6LfM=
+github.com/blang/semver v3.5.1+incompatible h1:cQNTCjp13qL8KC3Nbxr/y2Bqb63oX6wdnnjpJbkM4JQ=
+github.com/blang/semver v3.5.1+incompatible/go.mod h1:kRBLl5iJ+tD4TcOOxsy/0fnwebNt5EWlYSAyrTnjyyk=
 github.com/blevesearch/bleve/v2 v2.5.7 h1:2d9YrL5zrX5EBBW++GOaEKjE+NPWeZGaX77IM26m1Z8=
 github.com/blevesearch/bleve/v2 v2.5.7/go.mod h1:yj0NlS7ocGC4VOSAedqDDMktdh2935v2CSWOCDMHdSA=
 github.com/blevesearch/bleve_index_api v1.2.11 h1:bXQ54kVuwP8hdrXUSOnvTQfgK0KI1+f9A0ITJT8tX1s=
@@ -208,6 +248,8 @@ github.com/canonical/go-sp800.90a-drbg v0.0.0-20210314144037-6eeb1040d6c3 h1:oe6
 github.com/canonical/go-sp800.90a-drbg v0.0.0-20210314144037-6eeb1040d6c3/go.mod h1:qdP0gaj0QtgX2RUZhnlVrceJ+Qln8aSlDyJwelLLFeM=
 github.com/cenkalti/backoff/v4 v4.3.0 h1:MyRJ/UdXutAwSAT+s3wNd7MfTIcy71VQueUuFK343L8=
 github.com/cenkalti/backoff/v4 v4.3.0/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE=
+github.com/cenkalti/backoff/v5 v5.0.3 h1:ZN+IMa753KfX5hd8vVaMixjnqRZ3y8CuJKRKj1xcsSM=
+github.com/cenkalti/backoff/v5 v5.0.3/go.mod h1:rkhZdG3JZukswDf7f0cwqPNk4K0sa+F97BxZthm/crw=
 github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU=
 github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
 github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
@@ -238,6 +280,8 @@ github.com/cloudflare/circl v1.6.3/go.mod h1:2eXP6Qfat4O/Yhh8BznvKnJ+uzEoTQ6jVKJ
 github.com/cncf/udpa/go v0.0.0-20191209042840-269d4d468f6f/go.mod h1:M8M6+tZqaGXZJjfX53e64911xZQV5JYwmTeXPW+k8Sc=
 github.com/cncf/udpa/go v0.0.0-20200629203442-efcf912fb354/go.mod h1:WmhPx2Nbnhtbo57+VJT5O0JRkEi1Wbu0z5j0R8u5Hbk=
 github.com/cncf/udpa/go v0.0.0-20201120205902-5459f2c99403/go.mod h1:WmhPx2Nbnhtbo57+VJT5O0JRkEi1Wbu0z5j0R8u5Hbk=
+github.com/codahale/rfc6979 v0.0.0-20141003034818-6a90f24967eb h1:EDmT6Q9Zs+SbUoc7Ik9EfrFqcylYqgPZ9ANSbTAntnE=
+github.com/codahale/rfc6979 v0.0.0-20141003034818-6a90f24967eb/go.mod h1:ZjrT6AXHbDs86ZSdt/osfBi5qfexBrKUdONk989Wnk4=
 github.com/containerd/cgroups v1.1.0 h1:v8rEWFl6EoqHB+swVNjVoCJE8o3jX7e8nqBGPLaDFBM=
 github.com/containerd/cgroups v1.1.0/go.mod h1:6ppBcbh/NOOUU+dMKrykgaBnK9lCIBxHqJDGwsa1mIw=
 github.com/containerd/containerd v1.7.31 h1:jn3IMuTV4Bb1Uwb0MFPW2ASJAD3W1lh6QqqZHIZwDh4=
@@ -269,8 +313,12 @@ github.com/creachadair/otp v0.5.0 h1:q3Th7CXm2zlmCdBjw5tEPFOj4oWJMnVL5HXlq0sNKS0
 github.com/creachadair/otp v0.5.0/go.mod h1:0kceI87EnYFNYSTL121goJVAnk3eJhaed9H0nMuJUkA=
 github.com/creack/pty v1.1.24 h1:bJrF4RRfyJnbTJqzRLHzcGaZK1NeM5kTC9jGgovnR1s=
 github.com/creack/pty v1.1.24/go.mod h1:08sCNb52WyoAwi2QDyzUCTgcvVFhUzewun7wtTfvcwE=
+github.com/cyberphone/json-canonicalization v0.0.0-20241213102144-19d51d7fe467 h1:uX1JmpONuD549D73r6cgnxyUu18Zb7yHAy5AYU0Pm4Q=
+github.com/cyberphone/json-canonicalization v0.0.0-20241213102144-19d51d7fe467/go.mod h1:uzvlm1mxhHkdfqitSA92i7Se+S9ksOn3a3qmv/kyOCw=
 github.com/cyphar/filepath-securejoin v0.6.1 h1:5CeZ1jPXEiYt3+Z6zqprSAgSWiggmpVyciv8syjIpVE=
 github.com/cyphar/filepath-securejoin v0.6.1/go.mod h1:A8hd4EnAeyujCJRrICiOWqjS1AX0a9kM5XL+NwKoYSc=
+github.com/danieljoos/wincred v1.2.2 h1:774zMFJrqaeYCK2W57BgAem/MLi6mtSE47MB6BOJ0i0=
+github.com/danieljoos/wincred v1.2.2/go.mod h1:w7w4Utbrz8lqeMbDAK0lkNJUv5sAOkFi7nd/ogr0Uh8=
 github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
 github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
 github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1VwoXQT9A3Wy9MM3WgvqSxFWenqJduM=
@@ -283,6 +331,11 @@ github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.1 h1:5RVFMOWjMyRy8cARdy79nAmgYw3h
 github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.1/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
 github.com/dhowden/tag v0.0.0-20240417053706-3d75831295e8 h1:OtSeLS5y0Uy01jaKK4mA/WVIYtpzVm63vLVAPzJXigg=
 github.com/dhowden/tag v0.0.0-20240417053706-3d75831295e8/go.mod h1:apkPC/CR3s48O2D7Y++n1XWEpgPNNCjXYga3PPbJe2E=
+github.com/digitorus/pkcs7 v0.0.0-20230713084857-e76b763bdc49/go.mod h1:SKVExuS+vpu2l9IoOc0RwqE7NYnb0JlcFHFnEJkVDzc=
+github.com/digitorus/pkcs7 v0.0.0-20230818184609-3a137a874352 h1:ge14PCmCvPjpMQMIAH7uKg0lrtNSOdpYsRXlwk3QbaE=
+github.com/digitorus/pkcs7 v0.0.0-20230818184609-3a137a874352/go.mod h1:SKVExuS+vpu2l9IoOc0RwqE7NYnb0JlcFHFnEJkVDzc=
+github.com/digitorus/timestamp v0.0.0-20231217203849-220c5c2851b7 h1:lxmTCgmHE1GUYL7P0MlNa00M67axePTq+9nBSGddR8I=
+github.com/digitorus/timestamp v0.0.0-20231217203849-220c5c2851b7/go.mod h1:GvWntX9qiTlOud0WkQ6ewFm0LPy5JUR1Xo0Ngbd1w6Y=
 github.com/distribution/reference v0.6.0 h1:0IXCQ5g4/QMHHkarYzh5l+u8T3t73zM5QvfrDyIgxBk=
 github.com/distribution/reference v0.6.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E=
 github.com/dlclark/regexp2 v1.11.5 h1:Q/sSnsKerHeCkc/jSTNq1oCm7KiVgUMZRDUoRu0JQZQ=
@@ -370,6 +423,9 @@ github.com/go-audio/riff v1.0.0 h1:d8iCGbDvox9BfLagY94fBynxSPHO80LmZCaOsmKxokA=
 github.com/go-audio/riff v1.0.0/go.mod h1:l3cQwc85y79NQFCRB7TiPoNiaijp6q8Z0Uv38rVG498=
 github.com/go-audio/wav v1.1.0 h1:jQgLtbqBzY7G+BM8fXF7AHUk1uHUviWS4X39d5rsL2g=
 github.com/go-audio/wav v1.1.0/go.mod h1:mpe9qfwbScEbkd8uybLuIpTgHyrISw/OTuvjUW2iGtE=
+github.com/go-chi/chi v4.1.2+incompatible h1:fGFk2Gmi/YKXk0OmGfBh0WgmN3XB8lVnEyNz34tQRec=
+github.com/go-chi/chi/v5 v5.2.3 h1:WQIt9uxdsAbgIYgid+BpYc+liqQZGMHRaUwp0JUcvdE=
+github.com/go-chi/chi/v5 v5.2.3/go.mod h1:L2yAIGWB3H+phAw1NxKwWM+7eUH/lU8pOMm5hHcoops=
 github.com/go-git/gcfg v1.5.1-0.20230307220236-3a3c6141e376 h1:+zs/tPmkDkHx3U66DAb0lQFJrpS6731Oaa12ikc+DiI=
 github.com/go-git/gcfg v1.5.1-0.20230307220236-3a3c6141e376/go.mod h1:an3vInlBmSxCcxctByoQdvwPiA7DTK7jaaFDBTtu0ic=
 github.com/go-git/go-billy/v5 v5.9.0 h1:jItGXszUDRtR/AlferWPTMN4j38BQ88XnXKbilmmBPA=
@@ -395,16 +451,58 @@ github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre
 github.com/go-ole/go-ole v1.2.6/go.mod h1:pprOEPIfldk/42T2oK7lQ4v4JSDwmV0As9GaiUsvbm0=
 github.com/go-ole/go-ole v1.3.0 h1:Dt6ye7+vXGIKZ7Xtk4s6/xVdGDQynvom7xCFEdWr6uE=
 github.com/go-ole/go-ole v1.3.0/go.mod h1:5LS6F96DhAwUc7C+1HLexzMXY1xGRSryjyPPKW6zv78=
-github.com/go-openapi/jsonpointer v0.21.0 h1:YgdVicSA9vH5RiHs9TZW5oyafXZFc6+2Vc1rr/O9oNQ=
-github.com/go-openapi/jsonpointer v0.21.0/go.mod h1:IUyH9l/+uyhIYQ/PXVA41Rexl+kOkAPDdXEYns6fzUY=
-github.com/go-openapi/jsonreference v0.21.0 h1:Rs+Y7hSXT83Jacb7kFyjn4ijOuVGSvOdF2+tg1TRrwQ=
-github.com/go-openapi/jsonreference v0.21.0/go.mod h1:LmZmgsrTkVg9LG4EaHeY8cBDslNPMo06cago5JNLkm4=
-github.com/go-openapi/spec v0.21.0 h1:LTVzPc3p/RzRnkQqLRndbAzjY0d0BCL72A6j3CdL9ZY=
-github.com/go-openapi/spec v0.21.0/go.mod h1:78u6VdPw81XU44qEWGhtr982gJ5BWg2c0I5XwVMotYk=
-github.com/go-openapi/swag v0.23.0 h1:vsEVJDUo2hPJ2tu0/Xc+4noaxyEffXNIs3cOULZ+GrE=
-github.com/go-openapi/swag v0.23.0/go.mod h1:esZ8ITTYEsH1V2trKHjAN8Ai7xHb8RV+YSZ577vPjgQ=
+github.com/go-openapi/analysis v0.24.1 h1:Xp+7Yn/KOnVWYG8d+hPksOYnCYImE3TieBa7rBOesYM=
+github.com/go-openapi/analysis v0.24.1/go.mod h1:dU+qxX7QGU1rl7IYhBC8bIfmWQdX4Buoea4TGtxXY84=
+github.com/go-openapi/errors v0.22.4 h1:oi2K9mHTOb5DPW2Zjdzs/NIvwi2N3fARKaTJLdNabaM=
+github.com/go-openapi/errors v0.22.4/go.mod h1:z9S8ASTUqx7+CP1Q8dD8ewGH/1JWFFLX/2PmAYNQLgk=
+github.com/go-openapi/jsonpointer v0.22.1 h1:sHYI1He3b9NqJ4wXLoJDKmUmHkWy/L7rtEo92JUxBNk=
+github.com/go-openapi/jsonpointer v0.22.1/go.mod h1:pQT9OsLkfz1yWoMgYFy4x3U5GY5nUlsOn1qSBH5MkCM=
+github.com/go-openapi/jsonreference v0.21.3 h1:96Dn+MRPa0nYAR8DR1E03SblB5FJvh7W6krPI0Z7qMc=
+github.com/go-openapi/jsonreference v0.21.3/go.mod h1:RqkUP0MrLf37HqxZxrIAtTWW4ZJIK1VzduhXYBEeGc4=
+github.com/go-openapi/loads v0.23.2 h1:rJXAcP7g1+lWyBHC7iTY+WAF0rprtM+pm8Jxv1uQJp4=
+github.com/go-openapi/loads v0.23.2/go.mod h1:IEVw1GfRt/P2Pplkelxzj9BYFajiWOtY2nHZNj4UnWY=
+github.com/go-openapi/runtime v0.29.2 h1:UmwSGWNmWQqKm1c2MGgXVpC2FTGwPDQeUsBMufc5Yj0=
+github.com/go-openapi/runtime v0.29.2/go.mod h1:biq5kJXRJKBJxTDJXAa00DOTa/anflQPhT0/wmjuy+0=
+github.com/go-openapi/spec v0.22.1 h1:beZMa5AVQzRspNjvhe5aG1/XyBSMeX1eEOs7dMoXh/k=
+github.com/go-openapi/spec v0.22.1/go.mod h1:c7aeIQT175dVowfp7FeCvXXnjN/MrpaONStibD2WtDA=
+github.com/go-openapi/strfmt v0.25.0 h1:7R0RX7mbKLa9EYCTHRcCuIPcaqlyQiWNPTXwClK0saQ=
+github.com/go-openapi/strfmt v0.25.0/go.mod h1:nNXct7OzbwrMY9+5tLX4I21pzcmE6ccMGXl3jFdPfn8=
+github.com/go-openapi/swag v0.25.4 h1:OyUPUFYDPDBMkqyxOTkqDYFnrhuhi9NR6QVUvIochMU=
+github.com/go-openapi/swag v0.25.4/go.mod h1:zNfJ9WZABGHCFg2RnY0S4IOkAcVTzJ6z2Bi+Q4i6qFQ=
+github.com/go-openapi/swag/cmdutils v0.25.4 h1:8rYhB5n6WawR192/BfUu2iVlxqVR9aRgGJP6WaBoW+4=
+github.com/go-openapi/swag/cmdutils v0.25.4/go.mod h1:pdae/AFo6WxLl5L0rq87eRzVPm/XRHM3MoYgRMvG4A0=
+github.com/go-openapi/swag/conv v0.25.4 h1:/Dd7p0LZXczgUcC/Ikm1+YqVzkEeCc9LnOWjfkpkfe4=
+github.com/go-openapi/swag/conv v0.25.4/go.mod h1:3LXfie/lwoAv0NHoEuY1hjoFAYkvlqI/Bn5EQDD3PPU=
+github.com/go-openapi/swag/fileutils v0.25.4 h1:2oI0XNW5y6UWZTC7vAxC8hmsK/tOkWXHJQH4lKjqw+Y=
+github.com/go-openapi/swag/fileutils v0.25.4/go.mod h1:cdOT/PKbwcysVQ9Tpr0q20lQKH7MGhOEb6EwmHOirUk=
+github.com/go-openapi/swag/jsonname v0.25.4 h1:bZH0+MsS03MbnwBXYhuTttMOqk+5KcQ9869Vye1bNHI=
+github.com/go-openapi/swag/jsonname v0.25.4/go.mod h1:GPVEk9CWVhNvWhZgrnvRA6utbAltopbKwDu8mXNUMag=
+github.com/go-openapi/swag/jsonutils v0.25.4 h1:VSchfbGhD4UTf4vCdR2F4TLBdLwHyUDTd1/q4i+jGZA=
+github.com/go-openapi/swag/jsonutils v0.25.4/go.mod h1:7OYGXpvVFPn4PpaSdPHJBtF0iGnbEaTk8AvBkoWnaAY=
+github.com/go-openapi/swag/jsonutils/fixtures_test v0.25.4 h1:IACsSvBhiNJwlDix7wq39SS2Fh7lUOCJRmx/4SN4sVo=
+github.com/go-openapi/swag/jsonutils/fixtures_test v0.25.4/go.mod h1:Mt0Ost9l3cUzVv4OEZG+WSeoHwjWLnarzMePNDAOBiM=
+github.com/go-openapi/swag/loading v0.25.4 h1:jN4MvLj0X6yhCDduRsxDDw1aHe+ZWoLjW+9ZQWIKn2s=
+github.com/go-openapi/swag/loading v0.25.4/go.mod h1:rpUM1ZiyEP9+mNLIQUdMiD7dCETXvkkC30z53i+ftTE=
+github.com/go-openapi/swag/mangling v0.25.4 h1:2b9kBJk9JvPgxr36V23FxJLdwBrpijI26Bx5JH4Hp48=
+github.com/go-openapi/swag/mangling v0.25.4/go.mod h1:6dxwu6QyORHpIIApsdZgb6wBk/DPU15MdyYj/ikn0Hg=
+github.com/go-openapi/swag/netutils v0.25.4 h1:Gqe6K71bGRb3ZQLusdI8p/y1KLgV4M/k+/HzVSqT8H0=
+github.com/go-openapi/swag/netutils v0.25.4/go.mod h1:m2W8dtdaoX7oj9rEttLyTeEFFEBvnAx9qHd5nJEBzYg=
+github.com/go-openapi/swag/stringutils v0.25.4 h1:O6dU1Rd8bej4HPA3/CLPciNBBDwZj9HiEpdVsb8B5A8=
+github.com/go-openapi/swag/stringutils v0.25.4/go.mod h1:GTsRvhJW5xM5gkgiFe0fV3PUlFm0dr8vki6/VSRaZK0=
+github.com/go-openapi/swag/typeutils v0.25.4 h1:1/fbZOUN472NTc39zpa+YGHn3jzHWhv42wAJSN91wRw=
+github.com/go-openapi/swag/typeutils v0.25.4/go.mod h1:Ou7g//Wx8tTLS9vG0UmzfCsjZjKhpjxayRKTHXf2pTE=
+github.com/go-openapi/swag/yamlutils v0.25.4 h1:6jdaeSItEUb7ioS9lFoCZ65Cne1/RZtPBZ9A56h92Sw=
+github.com/go-openapi/swag/yamlutils v0.25.4/go.mod h1:MNzq1ulQu+yd8Kl7wPOut/YHAAU/H6hL91fF+E2RFwc=
+github.com/go-openapi/testify/enable/yaml/v2 v2.0.2 h1:0+Y41Pz1NkbTHz8NngxTuAXxEodtNSI1WG1c/m5Akw4=
+github.com/go-openapi/testify/enable/yaml/v2 v2.0.2/go.mod h1:kme83333GCtJQHXQ8UKX3IBZu6z8T5Dvy5+CW3NLUUg=
+github.com/go-openapi/testify/v2 v2.0.2 h1:X999g3jeLcoY8qctY/c/Z8iBHTbwLz7R2WXd6Ub6wls=
+github.com/go-openapi/testify/v2 v2.0.2/go.mod h1:HCPmvFFnheKK2BuwSA0TbbdxJ3I16pjwMkYkP4Ywn54=
+github.com/go-openapi/validate v0.25.1 h1:sSACUI6Jcnbo5IWqbYHgjibrhhmt3vR6lCzKZnmAgBw=
+github.com/go-openapi/validate v0.25.1/go.mod h1:RMVyVFYte0gbSTaZ0N4KmTn6u/kClvAFp+mAVfS/DQc=
 github.com/go-skynet/go-llama.cpp v0.0.0-20240314183750-6a8041ef6b46 h1:lALhXzDkqtp12udlDLLg+ybXVMmL7Ox9tybqVLWxjPE=
 github.com/go-skynet/go-llama.cpp v0.0.0-20240314183750-6a8041ef6b46/go.mod h1:iub0ugfTnflE3rcIuqV2pQSo15nEw3GLW/utm5gyERo=
+github.com/go-sql-driver/mysql v1.9.3 h1:U/N249h2WzJ3Ukj8SowVFjdtZKfu9vlLZxjPXV1aweo=
+github.com/go-sql-driver/mysql v1.9.3/go.mod h1:qn46aNg1333BRMNU69Lq93t8du/dwxI64Gl8i5p1WMU=
 github.com/go-task/slim-sprig/v3 v3.0.0 h1:sUs3vkvUymDpBKi3qH1YSqBQk9+9D/8M2mN1vB6EwHI=
 github.com/go-task/slim-sprig/v3 v3.0.0/go.mod h1:W848ghGpv3Qj3dhTPRyJypKRiqCdHZiAzKg9hl15HA8=
 github.com/go-telegram/bot v1.17.0 h1:Hs0kGxSj97QFqOQP0zxduY/4tSx8QDzvNI9uVRS+zmY=
@@ -417,6 +515,8 @@ github.com/go-text/typesetting v0.3.3 h1:ihGNJU9KzdK2QRDy1Bm7FT5RFQoYb+3n3EIhI/4
 github.com/go-text/typesetting v0.3.3/go.mod h1:vIRUT25mLQaSh4C8H/lIsKppQz/Gdb8Pu/tNwpi52ts=
 github.com/go-text/typesetting-utils v0.0.0-20250618110550-c820a94c77b8 h1:4KCscI9qYWMGTuz6BpJtbUSRzcBrUSSE0ENMJbNSrFs=
 github.com/go-text/typesetting-utils v0.0.0-20250618110550-c820a94c77b8/go.mod h1:3/62I4La/HBRX9TcTpBj4eipLiwzf+vhI+7whTc9V7o=
+github.com/go-viper/mapstructure/v2 v2.4.0 h1:EBsztssimR/CONLSZZ04E8qAkxNYq4Qp9LvH92wZUgs=
+github.com/go-viper/mapstructure/v2 v2.4.0/go.mod h1:oJDH3BJKyqBA2TXFhDsKDGDTlndYOZ6rGS0BRZIxGhM=
 github.com/go-yaml/yaml v2.1.0+incompatible/go.mod h1:w2MrLa16VYP0jy6N7M5kHaCkaLENm+P+Tv+MfurjSw0=
 github.com/gobwas/glob v0.2.3 h1:A4xDbljILXROh+kObIiy5kIaPYD8e96x1tgBhUI5J+Y=
 github.com/gobwas/glob v0.2.3/go.mod h1:d3Ez4x06l9bZtSvzIay5+Yzi0fmZzPgnTbPcKjJAkT8=
@@ -472,12 +572,14 @@ github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6
 github.com/golang/snappy v0.0.2/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
 github.com/golang/snappy v0.0.5-0.20231225225746-43d5d4cd4e0e h1:4bw4WeyTYPp0smaXiJZCNnLrvVBqirQVreixayXezGc=
 github.com/golang/snappy v0.0.5-0.20231225225746-43d5d4cd4e0e/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
-github.com/gomarkdown/markdown v0.0.0-20250311123330-531bef5e742b h1:EY/KpStFl60qA17CptGXhwfZ+k1sFNJIUNR8DdbcuUk=
-github.com/gomarkdown/markdown v0.0.0-20250311123330-531bef5e742b/go.mod h1:JDGcbDT52eL4fju3sZ4TeHGsQwhG9nbDV21aMyhwPoA=
+github.com/gomarkdown/markdown v0.0.0-20260411013819-759bbc3e3207 h1:p7t34F7K4OCRQblcDhNJnP46Uaarz3z2cLcvOZYxWn8=
+github.com/gomarkdown/markdown v0.0.0-20260411013819-759bbc3e3207/go.mod h1:JDGcbDT52eL4fju3sZ4TeHGsQwhG9nbDV21aMyhwPoA=
 github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
 github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
 github.com/google/btree v1.1.3 h1:CVpQJjYgC4VbzxeGVHfvZrv1ctoYCAI8vbl07Fcxlyg=
 github.com/google/btree v1.1.3/go.mod h1:qOPhT0dTNdNzV6Z/lhRX0YXUafgPLFUh+gZMl761Gm4=
+github.com/google/certificate-transparency-go v1.3.2 h1:9ahSNZF2o7SYMaKaXhAumVEzXB2QaayzII9C8rv7v+A=
+github.com/google/certificate-transparency-go v1.3.2/go.mod h1:H5FpMUaGa5Ab2+KCYsxg6sELw3Flkl7pGZzWdBoYLXs=
 github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M=
 github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
 github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
@@ -521,11 +623,19 @@ github.com/google/pprof v0.0.0-20210226084205-cbba55b83ad5/go.mod h1:kpwsk12EmLe
 github.com/google/pprof v0.0.0-20260115054156-294ebfa9ad83 h1:z2ogiKUYzX5Is6zr/vP9vJGqPwcdqsWjOt+V8J7+bTc=
 github.com/google/pprof v0.0.0-20260115054156-294ebfa9ad83/go.mod h1:MxpfABSjhmINe3F1It9d+8exIHFvUqtLIRCdOGNXqiI=
 github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
+github.com/google/s2a-go v0.1.9 h1:LGD7gtMgezd8a/Xak7mEWL0PjoTQFvpRudN895yqKW0=
+github.com/google/s2a-go v0.1.9/go.mod h1:YA0Ei2ZQL3acow2O62kdp9UlnvMmU7kA6Eutn0dXayM=
+github.com/google/trillian v1.7.2 h1:EPBxc4YWY4Ak8tcuhyFleY+zYlbCDCa4Sn24e1Ka8Js=
+github.com/google/trillian v1.7.2/go.mod h1:mfQJW4qRH6/ilABtPYNBerVJAJ/upxHLX81zxNQw05s=
 github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
 github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
+github.com/googleapis/enterprise-certificate-proxy v0.3.7 h1:zrn2Ee/nWmHulBx5sAVrGgAa0f2/R35S4DJwfFaUPFQ=
+github.com/googleapis/enterprise-certificate-proxy v0.3.7/go.mod h1:MkHOF77EYAE7qfSuSS9PU6g4Nt4e11cnsDUowfwewLA=
 github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg=
 github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk=
+github.com/googleapis/gax-go/v2 v2.15.0 h1:SyjDc1mGgZU5LncH8gimWo9lW1DtIfPibOG81vgd/bo=
+github.com/googleapis/gax-go/v2 v2.15.0/go.mod h1:zVVkkxAQHa1RQpg9z2AUCMnKhi0Qld9rcmyfL1OZhoc=
 github.com/gopherjs/gopherjs v0.0.0-20181017120253-0766667cb4d1/go.mod h1:wJfORRmW1u3UXTncJ5qlYoELFm8eSnnEO6hX4iZ3EWY=
 github.com/gopherjs/gopherjs v1.17.2 h1:fQnZVsXk8uxXIStYb0N4bGk7jeyTalG/wsZjQ25dO0g=
 github.com/gopherjs/gopherjs v1.17.2/go.mod h1:pRRIvn/QzFLrKfvEz3qUuEhtE/zLCWfreZ6J5gM2i+k=
@@ -536,7 +646,11 @@ github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674 h1:JeSE6pjso5T
 github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674/go.mod h1:r4w70xmWCQKmi1ONH4KIaBptdivuRPyosB9RmPlGEwA=
 github.com/gpustack/gguf-parser-go v0.24.0 h1:tdJceXYp9e5RhE9RwVYIuUpir72Jz2D68NEtDXkKCKc=
 github.com/gpustack/gguf-parser-go v0.24.0/go.mod h1:y4TwTtDqFWTK+xvprOjRUh+dowgU2TKCX37vRKvGiZ0=
+github.com/grpc-ecosystem/go-grpc-middleware v1.4.0 h1:UH//fgunKIs4JdUbpDl1VZCDaL56wXCB/5+wF6uHfaI=
+github.com/grpc-ecosystem/go-grpc-middleware v1.4.0/go.mod h1:g5qyo/la0ALbONm6Vbp88Yd8NsDy6rZz+RcrMPxvld8=
 github.com/grpc-ecosystem/grpc-gateway v1.16.0/go.mod h1:BDjrQk3hbvj6Nolgz8mAMFbcEtjT1g+wF4CSlocrBnw=
+github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.7 h1:X+2YciYSxvMQK0UZ7sg45ZVabVZBeBuvMkmuI2V3Fak=
+github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.7/go.mod h1:lW34nIZuQ8UDPdkon5fmfp2l3+ZkQ2me/+oecHYLOII=
 github.com/hack-pad/go-indexeddb v0.3.2 h1:DTqeJJYc1usa45Q5r52t01KhvlSN02+Oq+tQbSBI91A=
 github.com/hack-pad/go-indexeddb v0.3.2/go.mod h1:QvfTevpDVlkfomY498LhstjwbPW6QC4VC/lxYb0Kom0=
 github.com/hack-pad/safejs v0.1.0 h1:qPS6vjreAqh2amUqj4WNG1zIw7qlRQJ9K10eDKMCnE8=
@@ -544,12 +658,28 @@ github.com/hack-pad/safejs v0.1.0/go.mod h1:HdS+bKF1NrE72VoXZeWzxFOVQVUSqZJAG0xN
 github.com/hashicorp/consul/api v1.1.0/go.mod h1:VmuI/Lkw1nC05EYQWNKwWGbkg+FbDBtguAZLlVdkD9Q=
 github.com/hashicorp/consul/sdk v0.1.1/go.mod h1:VKf9jXwCTEY1QZP2MOLRhb5i/I/ssyNV1vwHyQBF0x8=
 github.com/hashicorp/errwrap v1.0.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
+github.com/hashicorp/errwrap v1.1.0 h1:OxrOeh75EUXMY8TBjag2fzXGZ40LB6IKw45YeGUDY2I=
+github.com/hashicorp/errwrap v1.1.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
 github.com/hashicorp/go-cleanhttp v0.5.1/go.mod h1:JpRdi6/HCYpAwUzNwuwqhbovhLtngrth3wmdIIUrZ80=
+github.com/hashicorp/go-cleanhttp v0.5.2 h1:035FKYIWjmULyFRBKPs8TBQoi0x6d9G4xc9neXJWAZQ=
+github.com/hashicorp/go-cleanhttp v0.5.2/go.mod h1:kO/YDlP8L1346E6Sodw+PrpBSV4/SoxCXGY6BqNFT48=
 github.com/hashicorp/go-immutable-radix v1.0.0/go.mod h1:0y9vanUI8NX6FsYoO3zeMjhV/C5i9g4Q3DwcSNZ4P60=
 github.com/hashicorp/go-msgpack v0.5.3/go.mod h1:ahLV/dePpqEmjfWmKiqvPkv/twdG7iPBM1vqhUKIvfM=
 github.com/hashicorp/go-multierror v1.0.0/go.mod h1:dHtQlpGsu+cZNNAkkCN/P3hoUDHhCYQXV3UM06sGGrk=
+github.com/hashicorp/go-multierror v1.1.1 h1:H5DkEtf6CXdFp0N0Em5UCwQpXMWke8IA0+lD48awMYo=
+github.com/hashicorp/go-multierror v1.1.1/go.mod h1:iw975J/qwKPdAO1clOe2L8331t/9/fmwbPZ6JB6eMoM=
+github.com/hashicorp/go-retryablehttp v0.7.8 h1:ylXZWnqa7Lhqpk0L1P1LzDtGcCR0rPVUrx/c8Unxc48=
+github.com/hashicorp/go-retryablehttp v0.7.8/go.mod h1:rjiScheydd+CxvumBsIrFKlx3iS0jrZ7LvzFGFmuKbw=
 github.com/hashicorp/go-rootcerts v1.0.0/go.mod h1:K6zTfqpRlCUIjkwsN4Z+hiSfzSTQa6eBIzfwKfwNnHU=
+github.com/hashicorp/go-rootcerts v1.0.2 h1:jzhAVGtqPKbwpyCPELlgNWhE1znq+qwJtW5Oi2viEzc=
+github.com/hashicorp/go-rootcerts v1.0.2/go.mod h1:pqUvnprVnM5bf7AOirdbb01K4ccR319Vf4pU3K5EGc8=
+github.com/hashicorp/go-secure-stdlib/parseutil v0.2.0 h1:U+kC2dOhMFQctRfhK0gRctKAPTloZdMU5ZJxaesJ/VM=
+github.com/hashicorp/go-secure-stdlib/parseutil v0.2.0/go.mod h1:Ll013mhdmsVDuoIXVfBtvgGJsXDYkTw1kooNcoCXuE0=
+github.com/hashicorp/go-secure-stdlib/strutil v0.1.2 h1:kes8mmyCpxJsI7FTwtzRqEy9CdjCtrXrXGuOpxEA7Ts=
+github.com/hashicorp/go-secure-stdlib/strutil v0.1.2/go.mod h1:Gou2R9+il93BqX25LAKCLuM+y9U2T4hlwvT1yprcna4=
 github.com/hashicorp/go-sockaddr v1.0.0/go.mod h1:7Xibr9yA9JjQq1JpNB2Vw7kxv8xerXegt+ozgdvDeDU=
+github.com/hashicorp/go-sockaddr v1.0.7 h1:G+pTkSO01HpR5qCxg7lxfsFEZaG+C0VssTy/9dbT+Fw=
+github.com/hashicorp/go-sockaddr v1.0.7/go.mod h1:FZQbEYa1pxkQ7WLpyXJ6cbjpT8q0YgQaK/JakXqGyWw=
 github.com/hashicorp/go-syslog v1.0.0/go.mod h1:qPfqrKkXGihmCqbJM2mZgkZGvKG1dFdvsLplgctolz4=
 github.com/hashicorp/go-uuid v1.0.0/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
 github.com/hashicorp/go-uuid v1.0.1/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
@@ -561,14 +691,20 @@ github.com/hashicorp/golang-lru v1.0.2/go.mod h1:iADmTwqILo4mZ8BN3D2Q6+9jd8WM5uG
 github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k=
 github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM=
 github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ=
+github.com/hashicorp/hcl v1.0.1-vault-7 h1:ag5OxFVy3QYTFTJODRzTKVZ6xvdfLLCA1cy/Y6xGI0I=
+github.com/hashicorp/hcl v1.0.1-vault-7/go.mod h1:XYhtn6ijBSAj6n4YqAaf7RBPS4I06AItNorpy+MoQNM=
 github.com/hashicorp/logutils v1.0.0/go.mod h1:QIAnNjmIWmVIIkWDTG1z5v++HQmx9WQRO+LraFDTW64=
 github.com/hashicorp/mdns v1.0.0/go.mod h1:tL+uN++7HEJ6SQLQ2/p+z2pH24WQKWjBPkE0mNTz8vQ=
 github.com/hashicorp/memberlist v0.1.3/go.mod h1:ajVTdAv/9Im8oMAAj5G31PhhMCZJV2pPBoIllUwCN7I=
 github.com/hashicorp/serf v0.8.2/go.mod h1:6hOLApaqBFA1NXqRQAsxw9QxuDEvNxSQRwA/JwenrHc=
+github.com/hashicorp/vault/api v1.22.0 h1:+HYFquE35/B74fHoIeXlZIP2YADVboaPjaSicHEZiH0=
+github.com/hashicorp/vault/api v1.22.0/go.mod h1:IUZA2cDvr4Ok3+NtK2Oq/r+lJeXkeCrHRmqdyWfpmGM=
 github.com/henvic/httpretty v0.1.4 h1:Jo7uwIRWVFxkqOnErcoYfH90o3ddQyVrSANeS4cxYmU=
 github.com/henvic/httpretty v0.1.4/go.mod h1:Dn60sQTZfbt2dYsdUSNsCljyF4AfdqnuJFDLJA1I4AM=
 github.com/hexops/gotextdiff v1.0.3 h1:gitA9+qJrrTCsiCl7+kh75nPqQt1cx4ZkudSTLoUqJM=
 github.com/hexops/gotextdiff v1.0.3/go.mod h1:pSWU5MAI3yDq+fZBTazCSJysOMbxWL1BSow5/V2vxeg=
+github.com/howeyc/gopass v0.0.0-20210920133722-c8aef6fb66ef h1:A9HsByNhogrvm9cWb28sjiS3i7tcKCkflWFEkHfuAgM=
+github.com/howeyc/gopass v0.0.0-20210920133722-c8aef6fb66ef/go.mod h1:lADxMC39cJJqL93Duh1xhAs4I2Zs8mKS89XWXFGp9cs=
 github.com/hpcloud/tail v1.0.0 h1:nfCOvKYfkgYP8hkirhJocXT2+zOD8yUNjXaWfTlyFKI=
 github.com/hpcloud/tail v1.0.0/go.mod h1:ab1qPbhIpdTxEkNHXyeSf5vhxWSCs/tWer42PpOxQnU=
 github.com/huandu/xstrings v1.5.0 h1:2ag3IFq9ZDANvthTwTiqSSZLjDc+BedvHPAp5tJy2TI=
@@ -577,7 +713,13 @@ github.com/huin/goupnp v1.3.0 h1:UvLUlWDNpoUdYzb2TCn+MuTWtcjXKSza2n6CBdQ0xXc=
 github.com/huin/goupnp v1.3.0/go.mod h1:gnGPsThkYa7bFi/KWmEysQRf48l2dvR5bxr2OFckNX8=
 github.com/ianlancetaylor/demangle v0.0.0-20181102032728-5e5cf60278f6/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
 github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
+github.com/in-toto/attestation v1.1.2 h1:MBFn6lsMq6dptQZJBhalXTcWMb/aJy3V+GX3VYj/V1E=
+github.com/in-toto/attestation v1.1.2/go.mod h1:gYFddHMZj3DiQ0b62ltNi1Vj5rC879bTmBbrv9CRHpM=
+github.com/in-toto/in-toto-golang v0.9.0 h1:tHny7ac4KgtsfrG6ybU8gVOZux2H8jN05AXJ9EBM1XU=
+github.com/in-toto/in-toto-golang v0.9.0/go.mod h1:xsBVrVsHNsB61++S6Dy2vWosKhuA3lUTQd+eF9HdeMo=
 github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8=
+github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
+github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
 github.com/invopop/jsonschema v0.13.0 h1:KvpoAJWEjR3uD9Kbm2HWJmqsEaHt8lBUpd0qHcIi21E=
 github.com/invopop/jsonschema v0.13.0/go.mod h1:ffZ5Km5SWWRAIN6wbDXItl95euhFz2uON45H2qjYt+0=
 github.com/ipfs/boxo v0.37.0 h1:2E3mZvydMI2t5IkAgtkmZ3sGsld0oS7o3I+xyzDk6uI=
@@ -619,17 +761,21 @@ github.com/jbenet/go-temp-err-catcher v0.1.0 h1:zpb3ZH6wIE8Shj2sKS+khgRvf7T7RABo
 github.com/jbenet/go-temp-err-catcher v0.1.0/go.mod h1:0kJRvmDZXNMIiJirNPEYfhpPwbGVtZVWC34vc5WLsDk=
 github.com/jeandeaual/go-locale v0.0.0-20250612000132-0ef82f21eade h1:FmusiCI1wHw+XQbvL9M+1r/C3SPqKrmBaIOYwVfQoDE=
 github.com/jeandeaual/go-locale v0.0.0-20250612000132-0ef82f21eade/go.mod h1:ZDXo8KHryOWSIqnsb/CiDq7hQUYryCgdVnxbj8tDG7o=
+github.com/jedisct1/go-minisign v0.0.0-20211028175153-1c139d1cc84b h1:ZGiXF8sz7PDk6RgkP+A/SFfUD0ZR/AgG6SpRNEDKZy8=
+github.com/jedisct1/go-minisign v0.0.0-20211028175153-1c139d1cc84b/go.mod h1:hQmNrgofl+IY/8L+n20H6E6PWBBTokdsv+q49j0QhsU=
+github.com/jellydator/ttlcache/v3 v3.4.0 h1:YS4P125qQS0tNhtL6aeYkheEaB/m8HCqdMMP4mnWdTY=
+github.com/jellydator/ttlcache/v3 v3.4.0/go.mod h1:Hw9EgjymziQD3yGsQdf1FqFdpp7YjFMd4Srg5EJlgD4=
 github.com/jessevdk/go-flags v1.4.0/go.mod h1:4FA24M0QyGHXBuZZK/XkWh8h0e1EYbRYJSGM75WSRxI=
 github.com/jinzhu/inflection v1.0.0 h1:K317FqzuhWc8YvSVlFMCCUb36O/S9MCKRDI7QkRKD/E=
 github.com/jinzhu/inflection v1.0.0/go.mod h1:h+uFLlag+Qp1Va5pdKtLDYj+kHp5pxUVkryuEj+Srlc=
 github.com/jinzhu/now v1.1.5 h1:/o9tlHleP7gOFmsnYNz3RGnqzefHA47wQpKrrdTIwXQ=
 github.com/jinzhu/now v1.1.5/go.mod h1:d3SSVoowX0Lcu0IBviAWJpolVfI5UJVZZ7cO71lE/z8=
+github.com/jmespath/go-jmespath v0.4.1-0.20220621161143-b0104c826a24 h1:liMMTbpW34dhU4az1GN0pTPADwNmvoRSeoZ6PItiqnY=
+github.com/jmespath/go-jmespath v0.4.1-0.20220621161143-b0104c826a24/go.mod h1:T8mJZnbsbmF+m6zOOFylbeCJqk5+pHWvzYPziyZiYoo=
 github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0=
 github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4=
 github.com/jolestar/go-commons-pool/v2 v2.1.2 h1:E+XGo58F23t7HtZiC/W6jzO2Ux2IccSH/yx4nD+J1CM=
 github.com/jolestar/go-commons-pool/v2 v2.1.2/go.mod h1:r4NYccrkS5UqP1YQI1COyTZ9UjPJAAGTUxzcsK1kqhY=
-github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
-github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
 github.com/joshdk/go-junit v1.0.0 h1:S86cUKIdwBHWwA6xCmFlf3RTLfVXYQfvanM5Uh+K6GE=
 github.com/joshdk/go-junit v1.0.0/go.mod h1:TiiV0PqkaNfFXjEiyjWM3XXrhVyCa1K4Zfga6W52ung=
 github.com/json-iterator/go v1.1.11/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4=
@@ -657,8 +803,9 @@ github.com/klauspost/compress v1.18.5/go.mod h1:cwPg85FWrGar70rWktvGQj8/hthj3wpl
 github.com/klauspost/cpuid v1.2.0/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
 github.com/klauspost/cpuid/v2 v2.3.0 h1:S4CRMLnYUhGeDFDqkGriYKdfoFlDnMtqTiI/sFzhA9Y=
 github.com/klauspost/cpuid/v2 v2.3.0/go.mod h1:hqwkgyIinND0mEev00jJYCxPNVRVXFQeu1XKlok6oO0=
-github.com/klauspost/pgzip v1.2.5 h1:qnWYvvKqedOF2ulHpMG72XQol4ILEJ8k2wwRl/Km8oE=
 github.com/klauspost/pgzip v1.2.5/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs=
+github.com/klauspost/pgzip v1.2.6 h1:8RXeL5crjEUFnR2/Sn6GJNWtSQ3Dk8pq4CL3jvdDyjU=
+github.com/klauspost/pgzip v1.2.6/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs=
 github.com/klippa-app/go-pdfium v1.19.2 h1:Gc/OT7wVO7xStNlDR5o/Qz0T/tsVtODsh7I1vOJXIKU=
 github.com/klippa-app/go-pdfium v1.19.2/go.mod h1:X+AMQDw/TXTsgiY2vEGA7oYlQTmjyqmlt6pm6aoIDa0=
 github.com/koron/go-ssdp v0.0.6 h1:Jb0h04599eq/CY7rB5YEqPS83HmRfHP2azkxMN2rFtU=
@@ -678,6 +825,8 @@ github.com/labstack/echo/v4 v4.15.1 h1:S9keusg26gZpjMmPqB5hOEvNKnmd1lNmcHrbbH2ln
 github.com/labstack/echo/v4 v4.15.1/go.mod h1:xmw1clThob0BSVRX1CRQkGQ/vjwcpOMjQZSZa9fKA/c=
 github.com/labstack/gommon v0.4.2 h1:F8qTUNXgG1+6WQmqoUWnz8WiEU60mXVVw0P4ht1WRA0=
 github.com/labstack/gommon v0.4.2/go.mod h1:QlUFxVM+SNXhDL/Z7YhocGIBYOiwB0mXm1+1bAPHPyU=
+github.com/letsencrypt/boulder v0.20251110.0 h1:J8MnKICeilO91dyQ2n5eBbab24neHzUpYMUIOdOtbjc=
+github.com/letsencrypt/boulder v0.20251110.0/go.mod h1:ogKCJQwll82m7OVHWyTuf8eeFCjuzdRQlgnZcCl0V+8=
 github.com/lib/pq v1.10.9 h1:YXG7RB+JIjhP29X+OtkiDnYaXQwpS4JEWq7dtCCRUEw=
 github.com/lib/pq v1.10.9/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=
 github.com/libp2p/go-buffer-pool v0.1.0 h1:oK4mSFcQz7cTQIfqbe4MIj9gLW+mnanjyFtc6cdF0Y8=
@@ -721,8 +870,8 @@ github.com/lufia/plan9stats v0.0.0-20250317134145-8bc96cf8fc35/go.mod h1:autxFIv
 github.com/magiconair/properties v1.8.5/go.mod h1:y3VJvCyxH9uVvJTWEGAELF3aiYNyPKd5NZ3oSwXrF60=
 github.com/magiconair/properties v1.8.10 h1:s31yESBquKXCV9a/ScB3ESkOjUYYv+X0rg8SYxI99mE=
 github.com/magiconair/properties v1.8.10/go.mod h1:Dhd985XPs7jluiymwWYZ0G4Z61jb3vdS329zhj2hYo0=
-github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0=
-github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
+github.com/mailru/easyjson v0.9.0 h1:PrnmzHw7262yW8sTBwxi1PdJA3Iw/EKBa8psRf7d9a4=
+github.com/mailru/easyjson v0.9.0/go.mod h1:1+xMtQp2MRNVL/V1bOzuP3aP8VNwRW55fQUto+XFtTU=
 github.com/marcopolo/simnet v0.0.4 h1:50Kx4hS9kFGSRIbrt9xUS3NJX33EyPqHVmpXvaKLqrY=
 github.com/marcopolo/simnet v0.0.4/go.mod h1:tfQF1u2DmaB6WHODMtQaLtClEf3a296CKQLq5gAsIS0=
 github.com/marten-seemann/tcp v0.0.0-20210406111302-dfbc87cc63fd h1:br0buuQ854V8u83wA0rVZ8ttrq5CpaPZdvrK0LP2lOk=
@@ -742,8 +891,8 @@ github.com/mattn/go-runewidth v0.0.9/go.mod h1:H031xJmbD/WCDINGzjvQ9THkh0rPKHF+m
 github.com/mattn/go-runewidth v0.0.12/go.mod h1:RAqKPSqVFrSLVXbA8x7dzmKdmGzieGRCM46jaSJTDAk=
 github.com/mattn/go-runewidth v0.0.17 h1:78v8ZlW0bP43XfmAfPsdXcoNCelfMHsDmd/pkENfrjQ=
 github.com/mattn/go-runewidth v0.0.17/go.mod h1:Jdepj2loyihRzMpdS35Xk/zdY8IAYHsh153qUoGf23w=
-github.com/mattn/go-sqlite3 v1.14.24 h1:tpSp2G2KyMnnQu99ngJ47EIkWVmliIizyZBfPrBWDRM=
-github.com/mattn/go-sqlite3 v1.14.24/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
+github.com/mattn/go-sqlite3 v1.14.28 h1:ThEiQrnbtumT+QMknw63Befp/ce/nUPgBPMlRFEum7A=
+github.com/mattn/go-sqlite3 v1.14.28/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
 github.com/mdelapenya/tlscert v0.2.0 h1:7H81W6Z/4weDvZBNOfQte5GpIMo0lGYEeWbkGp5LJHI=
 github.com/mdelapenya/tlscert v0.2.0/go.mod h1:O4njj3ELLnJjGdkN7M/vIVCpZ+Cf0L6muqOG4tLSl8o=
 github.com/mfridman/tparse v0.18.0 h1:wh6dzOKaIwkUGyKgOntDW4liXSo37qg5AXbIhkMV3vE=
@@ -780,6 +929,8 @@ github.com/mitchellh/iochan v1.0.0/go.mod h1:JwYml1nuB7xOzsp52dPpHFffvOCDupsG0Qu
 github.com/mitchellh/mapstructure v0.0.0-20160808181253-ca63d7c062ee/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
 github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
 github.com/mitchellh/mapstructure v1.4.1/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo=
+github.com/mitchellh/mapstructure v1.5.0 h1:jeMsZIYE/09sWLaz43PL7Gy6RuMjD2eJVyuac5Z2hdY=
+github.com/mitchellh/mapstructure v1.5.0/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo=
 github.com/mitchellh/reflectwalk v1.0.2 h1:G2LzWKi524PWgd3mLHV8Y5k7s6XUvT0Gef6zxSIeXaQ=
 github.com/mitchellh/reflectwalk v1.0.2/go.mod h1:mSTlrgnPZtwu0c4WaC2kGObEpuNDbx0jmZXqmk4esnw=
 github.com/moby/docker-image-spec v1.3.1 h1:jMKff3w6PgbfSa69GfNg+zN/XLhfXJGnEx3Nl2EsFP0=
@@ -865,6 +1016,8 @@ github.com/multiformats/go-varint v0.1.0 h1:i2wqFp4sdl3IcIxfAonHQV9qU5OsZ4Ts9IOo
 github.com/multiformats/go-varint v0.1.0/go.mod h1:5KVAVXegtfmNQQm/lCY+ATvDzvJJhSkUlGQV9wgObdI=
 github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
 github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
+github.com/natefinch/atomic v1.0.1 h1:ZPYKxkqQOx3KZ+RsbnP/YsgvxWQPGxjC0oBt2AhwV0A=
+github.com/natefinch/atomic v1.0.1/go.mod h1:N/D/ELrljoqDyT3rZrsUmtsuzvHkeB/wWjHV22AZRbM=
 github.com/nats-io/nats.go v1.50.0 h1:5zAeQrTvyrKrWLJ0fu02W3br8ym57qf7csDzgLOpcds=
 github.com/nats-io/nats.go v1.50.0/go.mod h1:26HypzazeOkyO3/mqd1zZd53STJN0EjCYF9Uy2ZOBno=
 github.com/nats-io/nkeys v0.4.15 h1:JACV5jRVO9V856KOapQ7x+EY8Jo3qw1vJt/9Jpwzkk4=
@@ -881,6 +1034,8 @@ github.com/nwaples/rardecode v1.1.0 h1:vSxaY8vQhOcVr4mm5e8XllHWTiM4JF507A0Katqw7
 github.com/nwaples/rardecode v1.1.0/go.mod h1:5DzqNKiOdpKKBH87u8VlvAnPZMXcGRhxWkRpHbbfGS0=
 github.com/nxadm/tail v1.4.8 h1:nPr65rt6Y5JFSKQO7qToXr7pePgD6Gwiw05lkbyAQTE=
 github.com/nxadm/tail v1.4.8/go.mod h1:+ncqLTQzXmGhMZNUePPaPqPvBxHAIsmXswZKocGu+AU=
+github.com/oklog/ulid v1.3.1 h1:EGfNDEx6MqHz8B3uNV6QAib1UR2Lm97sHi3ocA6ESJ4=
+github.com/oklog/ulid v1.3.1/go.mod h1:CirwcVhetQ6Lv90oh/F+FBtV6XMibvdAFo93nm5qn4U=
 github.com/olekukonko/tablewriter v0.0.5 h1:P2Ga83D34wi1o9J6Wh1mRuqd4mF/x/lgBS7N7AbDhec=
 github.com/olekukonko/tablewriter v0.0.5/go.mod h1:hPp6KlRPjbx+hW8ykQs1w3UBbZlj6HuIJcUGPhkA7kY=
 github.com/ollama/ollama v0.20.4 h1:XXquZkzAptOoAzNHAyKQOhiShoDFMfn3Yp56C7Vfsjs=
@@ -957,6 +1112,8 @@ github.com/pion/webrtc/v4 v4.2.11 h1:QUX1QZKlNIn4O7U5JxLPGP0sV5RTncZkzu9SPR3jVNU
 github.com/pion/webrtc/v4 v4.2.11/go.mod h1:s/rAiyy77GyRFrZMx+Ls6aua26dIBPudH8/ZHYbIRWY=
 github.com/pjbgf/sha1cd v0.6.0 h1:3WJ8Wz8gvDz29quX1OcEmkAlUg9diU4GxJHqs0/XiwU=
 github.com/pjbgf/sha1cd v0.6.0/go.mod h1:lhpGlyHLpQZoxMv8HcgXvZEhcGs0PG/vsZnEJ7H0iCM=
+github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c h1:+mdjkGKdHQG3305AYmdv1U2eRNDiU2ErMBj1gwrq8eQ=
+github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c/go.mod h1:7rwL4CYBLnjLxUqIJNnCWiEdr3bn6IUYi15bNlnbCCU=
 github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
 github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
 github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
@@ -1008,23 +1165,33 @@ github.com/russross/blackfriday v1.6.0/go.mod h1:ti0ldHuxg49ri4ksnFxlkCfN+hvslNl
 github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
 github.com/ruudk/golang-pdf417 v0.0.0-20181029194003-1af4ab5afa58/go.mod h1:6lfFZQK844Gfx8o5WFuvpxWRwnSoipWe/p622j1v06w=
 github.com/ryanuber/columnize v0.0.0-20160712163229-9b3edd62028f/go.mod h1:sm1tb6uqfes/u+d4ooFouqFdy9/2g9QGwK3SQygK0Ts=
+github.com/ryanuber/go-glob v1.0.0 h1:iQh3xXAumdQ+4Ufa5b25cRpC5TYKlno6hsv6Cb3pkBk=
+github.com/ryanuber/go-glob v1.0.0/go.mod h1:807d1WSdnB0XRJzKNil9Om6lcp/3a0v4qIHxIXzX/Yc=
 github.com/rymdport/portal v0.4.2 h1:7jKRSemwlTyVHHrTGgQg7gmNPJs88xkbKcIL3NlcmSU=
 github.com/rymdport/portal v0.4.2/go.mod h1:kFF4jslnJ8pD5uCi17brj/ODlfIidOxlgUDTO5ncnC4=
 github.com/saintfish/chardet v0.0.0-20230101081208-5e3ef4b5456d h1:hrujxIzL1woJ7AwssoOcM/tq5JjjG2yYOc8odClEiXA=
 github.com/saintfish/chardet v0.0.0-20230101081208-5e3ef4b5456d/go.mod h1:uugorj2VCxiV1x+LzaIdVa9b4S4qGAcH6cbhh4qVxOU=
 github.com/sashabaranov/go-openai v1.41.2 h1:vfPRBZNMpnqu8ELsclWcAvF19lDNgh1t6TVfFFOPiSM=
 github.com/sashabaranov/go-openai v1.41.2/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
+github.com/sassoftware/relic v7.2.1+incompatible h1:Pwyh1F3I0r4clFJXkSI8bOyJINGqpgjJU3DYAZeI05A=
+github.com/sassoftware/relic v7.2.1+incompatible/go.mod h1:CWfAxv73/iLZ17rbyhIEq3K9hs5w6FpNMdUT//qR+zk=
+github.com/sassoftware/relic/v7 v7.6.2 h1:rS44Lbv9G9eXsukknS4mSjIAuuX+lMq/FnStgmZlUv4=
+github.com/sassoftware/relic/v7 v7.6.2/go.mod h1:kjmP0IBVkJZ6gXeAu35/KCEfca//+PKM6vTAsyDPY+k=
 github.com/schollz/progressbar/v3 v3.19.0 h1:Ea18xuIRQXLAUidVDox3AbwfUhD0/1IvohyTutOIFoc=
 github.com/schollz/progressbar/v3 v3.19.0/go.mod h1:IsO3lpbaGuzh8zIMzgY3+J8l4C8GjO0Y9S69eFvNsec=
 github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529/go.mod h1:DxrIzT+xaE7yg65j358z/aeFdxmN0P9QXhEzd20vsDc=
 github.com/sebdah/goldie/v2 v2.7.1 h1:PkBHymaYdtvEkZV7TmyqKxdmn5/Vcj+8TpATWZjnG5E=
 github.com/sebdah/goldie/v2 v2.7.1/go.mod h1:oZ9fp0+se1eapSRjfYbsV/0Hqhbuu3bJVvKI/NNtssI=
+github.com/secure-systems-lab/go-securesystemslib v0.9.1 h1:nZZaNz4DiERIQguNy0cL5qTdn9lR8XKHf4RUyG1Sx3g=
+github.com/secure-systems-lab/go-securesystemslib v0.9.1/go.mod h1:np53YzT0zXGMv6x4iEWc9Z59uR+x+ndLwCLqPYpLXVU=
 github.com/segmentio/asm v1.1.3 h1:WM03sfUOENvvKexOLp+pCqgb/WDjsi7EK8gIsICtzhc=
 github.com/segmentio/asm v1.1.3/go.mod h1:Ld3L4ZXGNcSLRg4JBsZ3//1+f/TjYl0Mzen/DQy1EJg=
 github.com/segmentio/encoding v0.5.4 h1:OW1VRern8Nw6ITAtwSZ7Idrl3MXCFwXHPgqESYfvNt0=
 github.com/segmentio/encoding v0.5.4/go.mod h1:HS1ZKa3kSN32ZHVZ7ZLPLXWvOVIiZtyJnO1gPH1sKt0=
 github.com/sergi/go-diff v1.4.0 h1:n/SP9D5ad1fORl+llWyN+D6qoUETXNZARKjyY2/KVCw=
 github.com/sergi/go-diff v1.4.0/go.mod h1:A0bzQcvG0E7Rwjx0REVgAGH58e96+X0MeOfepqsbeW4=
+github.com/shibumi/go-pathspec v1.3.0 h1:QUyMZhFo0Md5B8zV8x2tesohbb5kfbpTi9rBnKh5dkI=
+github.com/shibumi/go-pathspec v1.3.0/go.mod h1:Xutfslp817l2I1cZvgcfeMQJG5QnU2lh5tVaaMCl3jE=
 github.com/shirou/gopsutil/v3 v3.24.5 h1:i0t8kL+kQTvpAYToeuiVk3TgDeKOFioZO3Ztz/iZ9pI=
 github.com/shirou/gopsutil/v3 v3.24.5/go.mod h1:bsoOS1aStSs9ErQ1WWfxllSeS1K5D+U30r2NfcubMVk=
 github.com/shirou/gopsutil/v4 v4.26.3 h1:2ESdQt90yU3oXF/CdOlRCJxrP+Am1aBYubTMTfxJ1qc=
@@ -1039,6 +1206,26 @@ github.com/shurcooL/go v0.0.0-20200502201357-93f07166e636/go.mod h1:TDJrrUr11Vxr
 github.com/shurcooL/httpfs v0.0.0-20190707220628-8d4bc4ba7749/go.mod h1:ZY1cvUeJuFPAdZ/B6v7RHavJWZn2YPVFQ1OSXhCGOkg=
 github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
 github.com/shurcooL/vfsgen v0.0.0-20200824052919-0d455de96546/go.mod h1:TrYk7fJVaAttu97ZZKrO9UbRa8izdowaMIZcxYMbVaw=
+github.com/sigstore/protobuf-specs v0.5.1 h1:/5OPaNuolRJmQfeZLayJGFXMpsRJEdgC6ah1/+7Px7U=
+github.com/sigstore/protobuf-specs v0.5.1/go.mod h1:DRBzpFuE+LnvQMN10/dU6nBeKwVLGEQ6o2FovN2Rats=
+github.com/sigstore/rekor v1.4.3 h1:2+aw4Gbgumv8vYM/QVg6b+hvr4x4Cukur8stJrVPKU0=
+github.com/sigstore/rekor v1.4.3/go.mod h1:o0zgY087Q21YwohVvGwV9vK1/tliat5mfnPiVI3i75o=
+github.com/sigstore/rekor-tiles/v2 v2.0.1 h1:1Wfz15oSRNGF5Dzb0lWn5W8+lfO50ork4PGIfEKjZeo=
+github.com/sigstore/rekor-tiles/v2 v2.0.1/go.mod h1:Pjsbhzj5hc3MKY8FfVTYHBUHQEnP0ozC4huatu4x7OU=
+github.com/sigstore/sigstore v1.10.0 h1:lQrmdzqlR8p9SCfWIpFoGUqdXEzJSZT2X+lTXOMPaQI=
+github.com/sigstore/sigstore v1.10.0/go.mod h1:Ygq+L/y9Bm3YnjpJTlQrOk/gXyrjkpn3/AEJpmk1n9Y=
+github.com/sigstore/sigstore-go v1.1.4 h1:wTTsgCHOfqiEzVyBYA6mDczGtBkN7cM8mPpjJj5QvMg=
+github.com/sigstore/sigstore-go v1.1.4/go.mod h1:2U/mQOT9cjjxrtIUeKDVhL+sHBKsnWddn8URlswdBsg=
+github.com/sigstore/sigstore/pkg/signature/kms/aws v1.10.0 h1:UOHpiyezCj5RuixgIvCV3QyuxIGQT+N6nGZEXA7OTTY=
+github.com/sigstore/sigstore/pkg/signature/kms/aws v1.10.0/go.mod h1:U0CZmA2psabDa8DdiV7yXab0AHODzfKqvD2isH7Hrvw=
+github.com/sigstore/sigstore/pkg/signature/kms/azure v1.10.0 h1:fq4+8Y4YadxeF8mzhoMRPZ1mVvDYXmI3BfS0vlkPT7M=
+github.com/sigstore/sigstore/pkg/signature/kms/azure v1.10.0/go.mod h1:u05nqPWY05lmcdHhv2lPaWTH3FGUhJzO7iW2hbboK3Q=
+github.com/sigstore/sigstore/pkg/signature/kms/gcp v1.10.0 h1:iUEf5MZYOuXGnXxdF/WrarJrk0DTVHqeIOjYdtpVXtc=
+github.com/sigstore/sigstore/pkg/signature/kms/gcp v1.10.0/go.mod h1:i6vg5JfEQix46R1rhQlrKmUtJoeH91drltyYOJEk1T4=
+github.com/sigstore/sigstore/pkg/signature/kms/hashivault v1.10.0 h1:dUvPv/MP23ZPIXZUW45kvCIgC0ZRfYxEof57AB6bAtU=
+github.com/sigstore/sigstore/pkg/signature/kms/hashivault v1.10.0/go.mod h1:fR/gDdPvJWGWL70/NgBBIL1O0/3Wma6JHs3tSSYg3s4=
+github.com/sigstore/timestamp-authority/v2 v2.0.3 h1:sRyYNtdED/ttLCMdaYnwpf0zre1A9chvjTnCmWWxN8Y=
+github.com/sigstore/timestamp-authority/v2 v2.0.3/go.mod h1:mDaHxkt3HmZYoIlwYj4QWo0RUr7VjYU52aVO5f5Qb3I=
 github.com/sirupsen/logrus v1.7.0/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
 github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
 github.com/sirupsen/logrus v1.9.4 h1:TsZE7l11zFCLZnZ+teH4Umoq5BhEIfIzfRDZ1Uzql2w=
@@ -1061,11 +1248,15 @@ github.com/spaolacci/murmur3 v1.1.0 h1:7c1g84S4BPRrfL5Xrdp6fOJ206sU9y293DDHaoy0b
 github.com/spaolacci/murmur3 v1.1.0/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
 github.com/spf13/afero v1.6.0/go.mod h1:Ai8FlHk4v/PARR026UzYexafAt9roJ7LcLMAmO6Z93I=
 github.com/spf13/cast v1.3.1/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE=
-github.com/spf13/cast v1.7.0 h1:ntdiHjuueXFgm5nzDRdOS4yfT43P5Fnud6DH50rz/7w=
-github.com/spf13/cast v1.7.0/go.mod h1:ancEpBxwJDODSW/UG4rDrAqiKolqNNh2DX3mk86cAdo=
+github.com/spf13/cast v1.10.0 h1:h2x0u2shc1QuLHfxi+cTJvs30+ZAHOGRic8uyGTDWxY=
+github.com/spf13/cast v1.10.0/go.mod h1:jNfB8QC9IA6ZuY2ZjDp0KtFO2LZZlg4S/7bzP6qqeHo=
 github.com/spf13/cobra v1.2.1/go.mod h1:ExllRjgxM/piMAM+3tAZvg8fsklGAf3tPfi+i8t68Nk=
+github.com/spf13/cobra v1.10.2 h1:DMTTonx5m65Ic0GOoRY2c16WCbHxOOw6xxezuLaBpcU=
+github.com/spf13/cobra v1.10.2/go.mod h1:7C1pvHqHw5A4vrJfjNwvOdzYu0Gml16OCs2GRiTUUS4=
 github.com/spf13/jwalterweatherman v1.1.0/go.mod h1:aNWZUN0dPAAO/Ljvb5BEdw96iTZ0EXowPYD95IqWIGo=
 github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
+github.com/spf13/pflag v1.0.10 h1:4EBh2KAYBwaONj6b2Ye1GiHfwjqyROoF4RwYO+vPwFk=
+github.com/spf13/pflag v1.0.10/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
 github.com/spf13/viper v1.8.1/go.mod h1:o0Pch8wJ9BVSWGQMbra6iw0oQ5oktSIBaujf1rJH9Ns=
 github.com/srwiley/oksvg v0.0.0-20221011165216-be6e8873101c h1:km8GpoQut05eY3GiYWEedbTT0qnSxrCjsVbb7yKY1KE=
 github.com/srwiley/oksvg v0.0.0-20221011165216-be6e8873101c/go.mod h1:cNQ3dwVJtS5Hmnjxy6AgTPd0Inb3pW05ftPSX7NZO7Q=
@@ -1114,6 +1305,10 @@ github.com/testcontainers/testcontainers-go/modules/postgres v0.42.0 h1:GCbb1ndr
 github.com/testcontainers/testcontainers-go/modules/postgres v0.42.0/go.mod h1:IRPBaI8jXdrNfD0e4Zm7Fbcgaz5shKxOQv4axiL09xs=
 github.com/tetratelabs/wazero v1.11.0 h1:+gKemEuKCTevU4d7ZTzlsvgd1uaToIDtlQlmNbwqYhA=
 github.com/tetratelabs/wazero v1.11.0/go.mod h1:eV28rsN8Q+xwjogd7f4/Pp4xFxO7uOGbLcD/LzB1wiU=
+github.com/theupdateframework/go-tuf v0.7.0 h1:CqbQFrWo1ae3/I0UCblSbczevCCbS31Qvs5LdxRWqRI=
+github.com/theupdateframework/go-tuf v0.7.0/go.mod h1:uEB7WSY+7ZIugK6R1hiBMBjQftaFzn7ZCDJcp1tCUug=
+github.com/theupdateframework/go-tuf/v2 v2.3.0 h1:gt3X8xT8qu/HT4w+n1jgv+p7koi5ad8XEkLXXZqG9AA=
+github.com/theupdateframework/go-tuf/v2 v2.3.0/go.mod h1:xW8yNvgXRncmovMLvBxKwrKpsOwJZu/8x+aB0KtFcdw=
 github.com/thoj/go-ircevent v0.0.0-20210723090443-73e444401d64 h1:l/T7dYuJEQZOwVOpjIXr1180aM9PZL/d1MnMVIxefX4=
 github.com/thoj/go-ircevent v0.0.0-20210723090443-73e444401d64/go.mod h1:Q1NAJOuRdQCqN/VIWdnaaEhV8LpeO2rtlBP7/iDJNII=
 github.com/tidwall/gjson v1.14.2/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk=
@@ -1129,6 +1324,16 @@ github.com/tidwall/sjson v1.2.5 h1:kLy8mja+1c9jlljvWTlSazM7cKDRfJuR/bOJhcY5NcY=
 github.com/tidwall/sjson v1.2.5/go.mod h1:Fvgq9kS/6ociJEDnK0Fk1cpYF4FIW6ZF7LAe+6jwd28=
 github.com/timbutler/zxcvbn v1.0.4 h1:nTUa8UpLhIxhUBag42fQcwiC8AtTxNVbQMbmxyxLfXg=
 github.com/timbutler/zxcvbn v1.0.4/go.mod h1:Cl20mGFz9+SXvTRebBcwMUDqZUvCfSnb+XMznbTKo2U=
+github.com/tink-crypto/tink-go-awskms/v2 v2.1.0 h1:N9UxlsOzu5mttdjhxkDLbzwtEecuXmlxZVo/ds7JKJI=
+github.com/tink-crypto/tink-go-awskms/v2 v2.1.0/go.mod h1:PxSp9GlOkKL9rlybW804uspnHuO9nbD98V/fDX4uSis=
+github.com/tink-crypto/tink-go-gcpkms/v2 v2.2.0 h1:3B9i6XBXNTRspfkTC0asN5W0K6GhOSgcujNiECNRNb0=
+github.com/tink-crypto/tink-go-gcpkms/v2 v2.2.0/go.mod h1:jY5YN2BqD/KSCHM9SqZPIpJNG/u3zwfLXHgws4x2IRw=
+github.com/tink-crypto/tink-go-hcvault/v2 v2.3.0 h1:6nAX1aRGnkg2SEUMwO5toB2tQkP0Jd6cbmZ/K5Le1V0=
+github.com/tink-crypto/tink-go-hcvault/v2 v2.3.0/go.mod h1:HOC5NWW1wBI2Vke1FGcRBvDATkEYE7AUDiYbXqi2sBw=
+github.com/tink-crypto/tink-go/v2 v2.5.0 h1:B8KLF6AofxdBIE4UJIaFbmoj5/1ehEtt7/MmzfI4Zpw=
+github.com/tink-crypto/tink-go/v2 v2.5.0/go.mod h1:2WbBA6pfNsAfBwDCggboaHeB2X29wkU8XHtGwh2YIk8=
+github.com/titanous/rocacheck v0.0.0-20171023193734-afe73141d399 h1:e/5i7d4oYZ+C1wj2THlRK+oAhjeS/TRQwMfkIuet3w0=
+github.com/titanous/rocacheck v0.0.0-20171023193734-afe73141d399/go.mod h1:LdwHTNJT99C5fTAzDz0ud328OgXz+gierycbcIx2fRs=
 github.com/tklauser/go-sysconf v0.3.16 h1:frioLaCQSsF5Cy1jgRBrzr6t502KIIwQ0MArYICU0nA=
 github.com/tklauser/go-sysconf v0.3.16/go.mod h1:/qNL9xxDhc7tx3HSRsLWNnuzbVfh3e7gh/BmM179nYI=
 github.com/tklauser/numcpus v0.11.0 h1:nSTwhKH5e1dMNsCdVBukSZrURJRoHbSEQjdEbY+9RXw=
@@ -1137,6 +1342,10 @@ github.com/tmc/langchaingo v0.1.14 h1:o1qWBPigAIuFvrG6cjTFo0cZPFEZ47ZqpOYMjM15yZ
 github.com/tmc/langchaingo v0.1.14/go.mod h1:aKKYXYoqhIDEv7WKdpnnCLRaqXic69cX9MnDUk72378=
 github.com/traefik/yaegi v0.16.1 h1:f1De3DVJqIDKmnasUF6MwmWv1dSEEat0wcpXhD2On3E=
 github.com/traefik/yaegi v0.16.1/go.mod h1:4eVhbPb3LnD2VigQjhYbEJ69vDRFdT2HQNrXx8eEwUY=
+github.com/transparency-dev/formats v0.0.0-20251017110053-404c0d5b696c h1:5a2XDQ2LiAUV+/RjckMyq9sXudfrPSuCY4FuPC1NyAw=
+github.com/transparency-dev/formats v0.0.0-20251017110053-404c0d5b696c/go.mod h1:g85IafeFJZLxlzZCDRu4JLpfS7HKzR+Hw9qRh3bVzDI=
+github.com/transparency-dev/merkle v0.0.2 h1:Q9nBoQcZcgPamMkGn7ghV8XiTZ/kRxn1yCG81+twTK4=
+github.com/transparency-dev/merkle v0.0.2/go.mod h1:pqSy+OXefQ1EDUVmAJ8MUhHB9TXGuzVAT58PqBoHz1A=
 github.com/ulikunitz/xz v0.5.8/go.mod h1:nbz6k7qbPmH4IRqmfOplQw/tblSgqTqBwxkY0oWt/14=
 github.com/ulikunitz/xz v0.5.9/go.mod h1:nbz6k7qbPmH4IRqmfOplQw/tblSgqTqBwxkY0oWt/14=
 github.com/ulikunitz/xz v0.5.14 h1:uv/0Bq533iFdnMHZdRBTOlaNMdb1+ZxXIlHDZHIHcvg=
@@ -1184,6 +1393,8 @@ github.com/yuin/goldmark-emoji v1.0.6 h1:QWfF2FYaXwL74tfGOW5izeiZepUDroDJfWubQI9
 github.com/yuin/goldmark-emoji v1.0.6/go.mod h1:ukxJDKFpdFb5x0a5HqbdlcKtebh086iJpI31LTKmWuA=
 github.com/yusufpapurcu/wmi v1.2.4 h1:zFUKzehAFReQwLys1b/iSMl+JQGSCSjtVqQn9bBrPo0=
 github.com/yusufpapurcu/wmi v1.2.4/go.mod h1:SBZ9tNy3G9/m5Oi98Zks0QjeHVDvuK0qfxQmPyzfmi0=
+github.com/zalando/go-keyring v0.2.3 h1:v9CUu9phlABObO4LPWycf+zwMG7nlbb3t/B5wa97yms=
+github.com/zalando/go-keyring v0.2.3/go.mod h1:HL4k+OXQfJUWaMnqyuSOc0drfGPX2b51Du6K+MRgZMk=
 go.etcd.io/bbolt v1.4.3 h1:dEadXpI6G79deX5prL3QRNP6JB8UxVkqo4UPnHaNXJo=
 go.etcd.io/bbolt v1.4.3/go.mod h1:tKQlpPaYCVFctUIgFKFnAlvbmB3tpy1vkTnDWohtc0E=
 go.etcd.io/etcd/api/v3 v3.5.0/go.mod h1:cbVKeC6lCfl7j/8jBhAK6aIYO9XOjdptoxU/nLQcPvs=
@@ -1191,6 +1402,8 @@ go.etcd.io/etcd/client/pkg/v3 v3.5.0/go.mod h1:IJHfcCEKxYu1Os13ZdwCwIUTUVGYTSAM3
 go.etcd.io/etcd/client/v2 v2.305.0/go.mod h1:h9puh54ZTgAKtEbut2oe9P4L/oqKCVB6xsXlzd7alYQ=
 go.mau.fi/util v0.3.0 h1:Lt3lbRXP6ZBqTINK0EieRWor3zEwwwrDT14Z5N8RUCs=
 go.mau.fi/util v0.3.0/go.mod h1:9dGsBCCbZJstx16YgnVMVi3O2bOizELoKpugLD4FoGs=
+go.mongodb.org/mongo-driver v1.17.6 h1:87JUG1wZfWsr6rIz3ZmpH90rL5tea7O3IHuSwHUpsss=
+go.mongodb.org/mongo-driver v1.17.6/go.mod h1:Hy04i7O2kC4RS06ZrhPRqj/u4DTYkFDAAccj+rVKqgQ=
 go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU=
 go.opencensus.io v0.22.0/go.mod h1:+kGneAE2xo2IficOXnaByMWTGM9T73dGwxeWcUqIpI8=
 go.opencensus.io v0.22.2/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw=
@@ -1202,6 +1415,8 @@ go.opencensus.io v0.24.0 h1:y73uSU6J157QMP2kn2r30vwW1A2W2WFwSCGnAVxeaD0=
 go.opencensus.io v0.24.0/go.mod h1:vNK8G9p7aAivkbmorf4v+7Hgx+Zs0yY+0fOtgBfjQKo=
 go.opentelemetry.io/auto/sdk v1.2.1 h1:jXsnJ4Lmnqd11kwkBV2LgLoFMZKizbCi5fNZ/ipaZ64=
 go.opentelemetry.io/auto/sdk v1.2.1/go.mod h1:KRTj+aOaElaLi+wW1kO/DZRXwkF4C5xPbEe3ZiIhN7Y=
+go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.63.0 h1:YH4g8lQroajqUwWbq/tr2QX1JFmEXaDLgG+ew9bLMWo=
+go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.63.0/go.mod h1:fvPi2qXDqFs8M4B4fmJhE92TyQs9Ydjlg3RvfUp+NbQ=
 go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.65.0 h1:7iP2uCb7sGddAr30RRS6xjKy7AZ2JtTOPA3oolgVSw8=
 go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.65.0/go.mod h1:c7hN3ddxs/z6q9xwvfLPk+UHlWRQyaeR1LdgfL/66l0=
 go.opentelemetry.io/otel v1.43.0 h1:mYIM03dnh5zfN7HautFE4ieIig9amkNANT+xcVxAj9I=
@@ -1218,6 +1433,8 @@ go.opentelemetry.io/otel/trace v1.43.0 h1:BkNrHpup+4k4w+ZZ86CZoHHEkohws8AY+WTX09
 go.opentelemetry.io/otel/trace v1.43.0/go.mod h1:/QJhyVBUUswCphDVxq+8mld+AvhXZLhe+8WVFxiFff0=
 go.starlark.net v0.0.0-20250417143717-f57e51f710eb h1:zOg9DxxrorEmgGUr5UPdCEwKqiqG0MlZciuCuA3XiDE=
 go.starlark.net v0.0.0-20250417143717-f57e51f710eb/go.mod h1:YKMCv9b1WrfWmeqdV5MAuEHWsu5iC+fe6kYl2sQjdI8=
+go.step.sm/crypto v0.74.0 h1:/APBEv45yYR4qQFg47HA8w1nesIGcxh44pGyQNw6JRA=
+go.step.sm/crypto v0.74.0/go.mod h1:UoXqCAJjjRgzPte0Llaqen7O9P7XjPmgjgTHQGkKCDk=
 go.uber.org/atomic v1.6.0/go.mod h1:sABNBOSYdrvTF6hTgEIbc7YasKWGhgEQZyfxyTvoXHQ=
 go.uber.org/atomic v1.7.0/go.mod h1:fEN4uk6kAWBTFdckzkM89CLk9XfWZrxpCo0nPH17wJc=
 go.uber.org/dig v1.19.0 h1:BACLhebsYdpQ7IROQ1AGPjrXcP5dF80U3gKoFzbaq/4=
@@ -1594,6 +1811,8 @@ google.golang.org/api v0.40.0/go.mod h1:fYKFpnQN0DsDSKRVRcQSDQNtqWPfM9i+zNPxepjR
 google.golang.org/api v0.41.0/go.mod h1:RkxM5lITDfTzmyKFPt+wGrCJbVfniCr2ool8kTBzRTU=
 google.golang.org/api v0.43.0/go.mod h1:nQsDGjRXMo4lvh5hP0TKqF244gqhGcr/YSIykhUk/94=
 google.golang.org/api v0.44.0/go.mod h1:EBOGZqzyhtvMDoxwS97ctnh0zUmYY6CxqXsc1AvkYD8=
+google.golang.org/api v0.256.0 h1:u6Khm8+F9sxbCTYNoBHg6/Hwv0N/i+V94MvkOSor6oI=
+google.golang.org/api v0.256.0/go.mod h1:KIgPhksXADEKJlnEoRa9qAII4rXcy40vfI8HRqcU964=
 google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM=
 google.golang.org/appengine v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4=
 google.golang.org/appengine v1.5.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4=
@@ -1644,6 +1863,10 @@ google.golang.org/genproto v0.0.0-20210310155132-4ce2db91004e/go.mod h1:FWY/as6D
 google.golang.org/genproto v0.0.0-20210319143718-93e7006c17a6/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no=
 google.golang.org/genproto v0.0.0-20210402141018-6c239bbf2bb1/go.mod h1:9lPAdzaEmUacj36I+k7YKbEc5CXzPIeORRgDAUOu28A=
 google.golang.org/genproto v0.0.0-20210602131652-f16073e35f0c/go.mod h1:UODoCrxHCcBojKKwX1terBiRUaqAsFqJiF615XL43r0=
+google.golang.org/genproto v0.0.0-20250922171735-9219d122eba9 h1:LvZVVaPE0JSqL+ZWb6ErZfnEOKIqqFWUJE2D0fObSmc=
+google.golang.org/genproto v0.0.0-20250922171735-9219d122eba9/go.mod h1:QFOrLhdAe2PsTp3vQY4quuLKTi9j3XG3r6JPPaw7MSc=
+google.golang.org/genproto/googleapis/api v0.0.0-20260128011058-8636f8732409 h1:merA0rdPeUV3YIIfHHcH4qBkiQAc1nfCKSI7lB4cV2M=
+google.golang.org/genproto/googleapis/api v0.0.0-20260128011058-8636f8732409/go.mod h1:fl8J1IvUjCilwZzQowmw2b7HQB2eAuYBabMXzWurF+I=
 google.golang.org/genproto/googleapis/rpc v0.0.0-20260128011058-8636f8732409 h1:H86B94AW+VfJWDqFeEbBPhEtHzJwJfTbgE2lZa54ZAQ=
 google.golang.org/genproto/googleapis/rpc v0.0.0-20260128011058-8636f8732409/go.mod h1:j9x/tPzZkyxcgEFkiKEEGxfvyumM01BEtsW8xzOahRQ=
 google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c=
@@ -1726,6 +1949,8 @@ howett.net/plist v1.0.2-0.20250314012144-ee69052608d9 h1:eeH1AIcPvSc0Z25ThsYF+Xo
 howett.net/plist v1.0.2-0.20250314012144-ee69052608d9/go.mod h1:fyFX5Hj5tP1Mpk8obqA9MZgXT416Q5711SDT7dQLTLk=
 jaytaylor.com/html2text v0.0.0-20230321000545-74c2419ad056 h1:6YFJoB+0fUH6X3xU/G2tQqCYg+PkGtnZ5nMR5rpw72g=
 jaytaylor.com/html2text v0.0.0-20230321000545-74c2419ad056/go.mod h1:OxvTsCwKosqQ1q7B+8FwXqg4rKZ/UG9dUW+g/VL2xH4=
+k8s.io/klog/v2 v2.130.1 h1:n9Xl7H1Xvksem4KFG4PYbdQCQxqc/tTUyrgXaOhHSzk=
+k8s.io/klog/v2 v2.130.1/go.mod h1:3Jpz1GvMt720eyJH1ckRHK1EDfpxISzJ7I9OYgaDtPE=
 lukechampine.com/blake3 v1.4.1 h1:I3Smz7gso8w4/TunLKec6K2fn+kyKtDxr/xcQEN84Wg=
 lukechampine.com/blake3 v1.4.1/go.mod h1:QFosUxmjB8mnrWFSNwKmvxHpfY72bmD2tQ0kBMM3kwo=
 maunium.net/go/maulogger/v2 v2.4.1 h1:N7zSdd0mZkB2m2JtFUsiGTQQAdP0YeFWT7YMc80yAL8=
@@ -1743,3 +1968,5 @@ rsc.io/quote/v3 v3.1.0/go.mod h1:yEA65RcK8LyAZtP9Kv3t0HmxON59tX3rD+tICJqUlj0=
 rsc.io/sampler v1.3.0/go.mod h1:T1hPZKmBbMNahiBKFy5HrXp6adAjACjK9JXDnKaTXpA=
 sigs.k8s.io/yaml v1.6.0 h1:G8fkbMSAFqgEFgh4b1wmtzDnioxFCUgTZhlbj5P9QYs=
 sigs.k8s.io/yaml v1.6.0/go.mod h1:796bPqUfzR/0jLAl6XjHl3Ck7MiyVv8dbTdyT3/pMf4=
+software.sslmate.com/src/go-pkcs12 v0.4.0 h1:H2g08FrTvSFKUj+D309j1DPfk5APnIdAQAB8aEykJ5k=
+software.sslmate.com/src/go-pkcs12 v0.4.0/go.mod h1:Qiz0EyvDRJjjxGyUQa2cCNZn/wMyzrRJ/qcDXOQazLI=
--- a/pkg/downloader/huggingface.go
+++ b/pkg/downloader/huggingface.go
@@ -33,6 +33,7 @@ func HuggingFaceScan(uri URI) (*HuggingFaceScanResult, error) {
 	if err != nil {
 		return nil, err
 	}
+	defer results.Body.Close()
 	if results.StatusCode != 200 {
 		return nil, fmt.Errorf("unexpected status code during HuggingFaceScan: %d", results.StatusCode)
 	}
--- a/pkg/downloader/pinned_ref_internal_test.go
+++ b/pkg/downloader/pinned_ref_internal_test.go
@@ -0,0 +1,29 @@
+// pinnedImageRef is unexported, so its tests live in package downloader
+// (alongside the external _test package's specs — both share Ginkgo's
+// global registry, so the external suite's RunSpecs picks these up too).
+package downloader
+
+import (
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+var _ = Describe("pinnedImageRef", func() {
+	const dig = "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
+
+	DescribeTable("rewrites refs to digest form",
+		func(in, want string) {
+			Expect(pinnedImageRef(in, dig)).To(Equal(want))
+		},
+		Entry("repo:tag", "quay.io/foo/bar:latest", "quay.io/foo/bar@"+dig),
+		Entry("repo without tag", "quay.io/foo/bar", "quay.io/foo/bar@"+dig),
+		Entry("dockerhub library tag", "docker.io/library/alpine:3.20", "docker.io/library/alpine@"+dig),
+		// Registry with explicit port: the ':5000' must not be mistaken
+		// for a tag separator.
+		Entry("registry port + tag", "localhost:5000/foo:latest", "localhost:5000/foo@"+dig),
+		Entry("registry port without tag", "localhost:5000/foo", "localhost:5000/foo@"+dig),
+		// Already-digested ref: rewrite cleanly rather than appending.
+		Entry("already digested", "quay.io/foo/bar@sha256:deadbeef", "quay.io/foo/bar@"+dig),
+		Entry("tag and digest", "quay.io/foo/bar:latest@sha256:deadbeef", "quay.io/foo/bar@"+dig),
+	)
+})
--- a/pkg/downloader/uri.go
+++ b/pkg/downloader/uri.go
@@ -39,6 +39,63 @@ const (

 type URI string

+// ImageVerifier verifies the integrity of an OCI image — typically a
+// cosign signature check against a sigstore policy. The downloader runs
+// VerifyImage between fetching the image manifest and extracting its
+// layers, so verification failure prevents any tampered bytes reaching
+// disk.
+//
+// pkg/oci/cosignverify.Verifier satisfies this interface.
+type ImageVerifier interface {
+	VerifyImage(ctx context.Context, imageRef string) error
+}
+
+type downloadOptions struct {
+	verifier ImageVerifier
+}
+
+// DownloadOption configures DownloadFileWithContext / DownloadFile.
+//
+// Variadic at the end of the signature keeps the public API backward
+// compatible: existing callers that don't care about verification keep
+// compiling untouched.
+type DownloadOption func(*downloadOptions)
+
+// WithImageVerifier attaches an ImageVerifier that runs against OCI
+// downloads only. No-op for tarball / HTTP / Ollama / local downloads —
+// those paths use SHA256 integrity instead.
+func WithImageVerifier(v ImageVerifier) DownloadOption {
+	return func(o *downloadOptions) { o.verifier = v }
+}
+
+func applyDownloadOptions(opts []DownloadOption) downloadOptions {
+	var o downloadOptions
+	for _, fn := range opts {
+		fn(&o)
+	}
+	return o
+}
+
+// pinnedImageRef rewrites `repo:tag` (or `repo[@digest]`) into `repo@<digest>`
+// so callers can pass the explicit digest the downloader just resolved to
+// any tag-following client, eliminating TOCTOU between fetches.
+func pinnedImageRef(ref, digest string) string {
+	// Strip an existing @digest if present so we always emit a clean ref.
+	if at := strings.LastIndex(ref, "@"); at != -1 {
+		// Everything from the last '@' onward is treated as the digest
+		// and dropped unconditionally, leaving registry+repo[:tag]; any
+		// remaining tag is stripped below.
+		ref = ref[:at]
+	}
+	// Strip an existing :tag — find the rightmost colon after the last
+	// slash so we don't touch the registry port (e.g. localhost:5000/foo:latest).
+	slash := strings.LastIndex(ref, "/")
+	if colon := strings.LastIndex(ref, ":"); colon > slash {
+		ref = ref[:colon]
+	}
+	return ref + "@" + digest
+}
+
 // HF_ENDPOINT is the HuggingFace endpoint, can be overridden by setting the HF_ENDPOINT environment variable.
 var HF_ENDPOINT string = loadConfig()

@@ -362,11 +419,12 @@ func (u URI) ContentLength(ctx context.Context) (int64, error) {
 	return size, nil
 }

-func (uri URI) DownloadFile(filePath, sha string, fileN, total int, downloadStatus func(string, string, string, float64)) error {
-	return uri.DownloadFileWithContext(context.Background(), filePath, sha, fileN, total, downloadStatus)
+func (uri URI) DownloadFile(filePath, sha string, fileN, total int, downloadStatus func(string, string, string, float64), opts ...DownloadOption) error {
+	return uri.DownloadFileWithContext(context.Background(), filePath, sha, fileN, total, downloadStatus, opts...)
 }

-func (uri URI) DownloadFileWithContext(ctx context.Context, filePath, sha string, fileN, total int, downloadStatus func(string, string, string, float64)) error {
+func (uri URI) DownloadFileWithContext(ctx context.Context, filePath, sha string, fileN, total int, downloadStatus func(string, string, string, float64), opts ...DownloadOption) error {
+	dopts := applyDownloadOptions(opts)
 	url := uri.ResolveURL()
 	if uri.LooksLikeOCI() {

@@ -418,6 +476,23 @@ func (uri URI) DownloadFileWithContext(ctx context.Context, filePath, sha string
 			return fmt.Errorf("failed to get image %q: %v", url, err)
 		}

+		// Verify before extract so tampered bytes never reach disk. We
+		// re-pin the ref to the manifest digest we just fetched: the
+		// verifier would otherwise resolve the tag again, opening a tiny
+		// TOCTOU window in which a registry could swap the underlying
+		// manifest between the two HEADs.
+		if dopts.verifier != nil {
+			digest, derr := img.Digest()
+			if derr != nil {
+				return fmt.Errorf("resolving digest for verification of %q: %w", url, derr)
+			}
+			pinned := pinnedImageRef(url, digest.String())
+			if verr := dopts.verifier.VerifyImage(ctx, pinned); verr != nil {
+				return fmt.Errorf("image verification failed for %q: %w", url, verr)
+			}
+			xlog.Info("Image signature verified", "ref", pinned)
+		}
+
 		return oci.ExtractOCIImage(ctx, img, url, filePath, downloadStatus)
 	}

--- a/pkg/oci/cosignverify/bundle.go
+++ b/pkg/oci/cosignverify/bundle.go
@@ -0,0 +1,115 @@
+// Sigstore-bundle discovery for cosign-signed OCI images.
+//
+// Cosign 2.2+ with `--new-bundle-format --registry-referrers-mode=oci-1-1`
+// stores the signature as a standalone OCI artifact discoverable via the
+// OCI 1.1 referrers API. The artifact payload is a Sigstore protobuf
+// bundle that sigstore-go consumes natively (no manual annotation parsing).
+//
+// go-containerregistry's remote.Referrers transparently falls back to the
+// referrers-tag scheme (`<algo>-<hex>` tag) for registries that don't yet
+// implement the referrers endpoint, so the same code path covers both.
+//
+// We deliberately do not support the legacy `:sha256-<hex>.sig` cosign
+// signature attachment with per-annotation cert/sig/Rekor fields. CI is
+// expected to sign with `--new-bundle-format`; this is a fresh integration
+// and LocalAI controls both the producer (CI) and the consumer (this
+// binary), so there is no reason to carry the legacy path.
+
+package cosignverify
+
+import (
+	"errors"
+	"fmt"
+	"io"
+	"strings"
+
+	"github.com/google/go-containerregistry/pkg/name"
+	v1 "github.com/google/go-containerregistry/pkg/v1"
+	"github.com/google/go-containerregistry/pkg/v1/remote"
+
+	"github.com/sigstore/sigstore-go/pkg/bundle"
+)
+
+// sigstoreBundleMediaTypePrefix matches the `bundle.v<X.Y>+json` media
+// types (v0.3 onward), not the older `bundle+json;version=0.x` form. The
+// artifactType lives on the referrer descriptor in the referrers index.
+const sigstoreBundleMediaTypePrefix = "application/vnd.dev.sigstore.bundle."
+
+// isSigstoreBundleArtifactType reports whether the given OCI artifactType
+// identifies a Sigstore bundle blob.
+func isSigstoreBundleArtifactType(mt string) bool {
+	return strings.HasPrefix(mt, sigstoreBundleMediaTypePrefix) && strings.HasSuffix(mt, "+json")
+}
+
+// bundleFromOCISignature locates a cosign-produced Sigstore bundle for the
+// image identified by ref+imageDigest by querying the OCI 1.1 referrers
+// API and returns the parsed bundle.
+//
+// Returns the first bundle whose JSON parses successfully — verification
+// of identity, transparency log inclusion, and artifact digest is the
+// caller's responsibility (driven by the Verifier).
+func bundleFromOCISignature(ref name.Reference, imageDigest v1.Hash, opts []remote.Option) (*bundle.Bundle, error) {
+	digestRef := ref.Context().Digest(imageDigest.String())
+
+	idx, err := remote.Referrers(digestRef, opts...)
+	if err != nil {
+		return nil, fmt.Errorf("cosignverify: querying referrers for %s: %w", digestRef.Name(), err)
+	}
+	manifest, err := idx.IndexManifest()
+	if err != nil {
+		return nil, fmt.Errorf("cosignverify: reading referrers index: %w", err)
+	}
+
+	if len(manifest.Manifests) == 0 {
+		return nil, fmt.Errorf("cosignverify: no referrers found for %s", digestRef.Name())
+	}
+
+	var lastErr error
+	for _, desc := range manifest.Manifests {
+		if !isSigstoreBundleArtifactType(string(desc.ArtifactType)) {
+			continue
+		}
+		b, err := fetchBundleFromReferrer(ref, desc, opts)
+		if err != nil {
+			lastErr = err
+			continue
+		}
+		return b, nil
+	}
+	if lastErr != nil {
+		return nil, fmt.Errorf("cosignverify: no usable Sigstore bundle referrer for %s: %w", digestRef.Name(), lastErr)
+	}
+	return nil, fmt.Errorf("cosignverify: no Sigstore bundle referrer for %s (signed with --new-bundle-format?)", digestRef.Name())
+}
+
+func fetchBundleFromReferrer(ref name.Reference, desc v1.Descriptor, opts []remote.Option) (*bundle.Bundle, error) {
+	artRef := ref.Context().Digest(desc.Digest.String())
+	img, err := remote.Image(artRef, opts...)
+	if err != nil {
+		return nil, fmt.Errorf("fetching referrer image %s: %w", artRef.Name(), err)
+	}
+	layers, err := img.Layers()
+	if err != nil {
+		return nil, fmt.Errorf("reading referrer layers: %w", err)
+	}
+	if len(layers) == 0 {
+		return nil, errors.New("referrer artifact has no layers")
+	}
+
+	rc, err := layers[0].Uncompressed()
+	if err != nil {
+		return nil, fmt.Errorf("opening referrer blob: %w", err)
+	}
+	defer func() { _ = rc.Close() }()
+
+	data, err := io.ReadAll(rc)
+	if err != nil {
+		return nil, fmt.Errorf("reading referrer blob: %w", err)
+	}
+
+	b := &bundle.Bundle{}
+	if err := b.UnmarshalJSON(data); err != nil {
+		return nil, fmt.Errorf("parsing bundle JSON: %w", err)
+	}
+	return b, nil
+}
--- a/pkg/oci/cosignverify/cosignverify_suite_test.go
+++ b/pkg/oci/cosignverify/cosignverify_suite_test.go
@@ -0,0 +1,13 @@
+package cosignverify_test
+
+import (
+	"testing"
+
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+func TestCosignVerify(t *testing.T) {
+	RegisterFailHandler(Fail)
+	RunSpecs(t, "cosignverify test suite")
+}
--- a/pkg/oci/cosignverify/notbefore_internal_test.go
+++ b/pkg/oci/cosignverify/notbefore_internal_test.go
@@ -0,0 +1,58 @@
+// enforceNotBefore is unexported, so its tests live in package
+// cosignverify (alongside the external _test package's specs — both
+// share Ginkgo's global registry, so the external suite's RunSpecs
+// picks these up too).
+package cosignverify
+
+import (
+	"time"
+
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+	"github.com/sigstore/sigstore-go/pkg/verify"
+)
+
+var _ = Describe("enforceNotBefore", func() {
+	cutoff := time.Date(2026, 5, 14, 12, 0, 0, 0, time.UTC)
+
+	makeResult := func(stamps ...time.Time) *verify.VerificationResult {
+		res := &verify.VerificationResult{}
+		for _, ts := range stamps {
+			res.VerifiedTimestamps = append(res.VerifiedTimestamps, verify.TimestampVerificationResult{
+				Type:      "Tlog",
+				URI:       "https://rekor.sigstore.dev",
+				Timestamp: ts,
+			})
+		}
+		return res
+	}
+
+	It("accepts a signature newer than the cutoff", func() {
+		Expect(enforceNotBefore(makeResult(cutoff.Add(time.Hour)), cutoff)).To(Succeed())
+	})
+
+	It("accepts a signature exactly at the cutoff", func() {
+		Expect(enforceNotBefore(makeResult(cutoff), cutoff)).To(Succeed())
+	})
+
+	It("rejects a signature older than the cutoff", func() {
+		err := enforceNotBefore(makeResult(cutoff.Add(-time.Hour)), cutoff)
+		Expect(err).To(HaveOccurred())
+		Expect(err.Error()).To(ContainSubstring("before NotBefore cutoff"))
+	})
+
+	It("rejects when the earliest of several timestamps predates the cutoff", func() {
+		err := enforceNotBefore(makeResult(
+			cutoff.Add(time.Hour),
+			cutoff.Add(-time.Minute),
+			cutoff.Add(2*time.Hour),
+		), cutoff)
+		Expect(err).To(HaveOccurred())
+	})
+
+	It("treats absent timestamps as a hard error", func() {
+		err := enforceNotBefore(makeResult(), cutoff)
+		Expect(err).To(HaveOccurred())
+		Expect(err.Error()).To(ContainSubstring("no verified timestamp"))
+	})
+})
--- a/pkg/oci/cosignverify/verify.go
+++ b/pkg/oci/cosignverify/verify.go
@@ -0,0 +1,326 @@
+// Package cosignverify verifies cosign-signed OCI images using sigstore-go.
+//
+// LocalAI uses this to gate backend installs on a keyless-cosign signature
+// from a trusted GitHub Actions OIDC identity, so a registry/tag compromise
+// alone is not sufficient to ship a tampered backend image.
+//
+// Producer side: CI signs each pushed backend image with cosign 2.2+ and
+// the `--new-bundle-format --registry-referrers-mode=oci-1-1` flags. The
+// signature is then a standalone Sigstore bundle stored as an OCI 1.1
+// referrer of the image manifest.
+//
+// Consumer side (this package): bundle.go discovers the bundle via the
+// referrers API and hands it directly to sigstore-go's verifier. There is
+// no legacy-cosign-annotation fallback — we own both ends.
+package cosignverify
+
+import (
+	"context"
+	"encoding/hex"
+	"errors"
+	"fmt"
+	"net/http"
+	"sync"
+	"time"
+
+	registrytypes "github.com/docker/docker/api/types/registry"
+	"github.com/google/go-containerregistry/pkg/authn"
+	"github.com/google/go-containerregistry/pkg/name"
+	v1 "github.com/google/go-containerregistry/pkg/v1"
+	"github.com/google/go-containerregistry/pkg/v1/remote"
+	"github.com/google/go-containerregistry/pkg/v1/remote/transport"
+
+	"github.com/sigstore/sigstore-go/pkg/root"
+	"github.com/sigstore/sigstore-go/pkg/tuf"
+	"github.com/sigstore/sigstore-go/pkg/verify"
+)
+
+// Policy is the verification policy a backend image must satisfy.
+//
+// At least one of Issuer / IssuerRegex must be set, and at least one of
+// Identity / IdentityRegex. The (Issuer, Identity) pair pins which OIDC
+// principal Fulcio issued the signing cert to — for GitHub Actions keyless
+// signing this is typically:
+//
+//	Issuer:        "https://token.actions.githubusercontent.com"
+//	IdentityRegex: "^https://github.com/<org>/<repo>/\\.github/workflows/<file>@refs/.*"
+//
+// A registry compromise alone cannot satisfy this; the attacker would also
+// need to compromise the GitHub Actions OIDC identity to obtain a Fulcio
+// cert with a matching SAN.
+type Policy struct {
+	Issuer        string
+	IssuerRegex   string
+	Identity      string
+	IdentityRegex string
+
+	// TUFRootURL overrides the default sigstore public-good TUF mirror
+	// (tuf-repo-cdn.sigstore.dev). Leave empty to use the public-good instance.
+	TUFRootURL string
+
+	// TUFCachePath overrides the on-disk cache directory for the TUF
+	// metadata. Leave empty for the sigstore-go default.
+	TUFCachePath string
+
+	// RequireTLog requires an inclusion proof from the Rekor transparency
+	// log. Defaults to true; only disable for testing.
+	RequireTLog *bool
+
+	// RequireSCT requires the signing certificate to embed a Signed
+	// Certificate Timestamp from the certificate-transparency log.
+	// Defaults to true.
+	RequireSCT *bool
+
+	// NotBefore rejects signatures whose Rekor integrated time is older
+	// than this. This is the revocation lever: keyless cosign certs are
+	// ephemeral so there is no CA-side revocation, but advancing NotBefore
+	// in the gallery YAML invalidates any signature produced before a
+	// known compromise window. Zero value means no time-based cutoff.
+	NotBefore time.Time
+}
+
+func boolOrTrue(b *bool) bool {
+	if b == nil {
+		return true
+	}
+	return *b
+}
+
+// Validate returns an error if the policy is missing required fields.
+func (p Policy) Validate() error {
+	if p.Issuer == "" && p.IssuerRegex == "" {
+		return errors.New("cosignverify: policy must set Issuer or IssuerRegex")
+	}
+	if p.Identity == "" && p.IdentityRegex == "" {
+		return errors.New("cosignverify: policy must set Identity or IdentityRegex")
+	}
+	return nil
+}
+
+// Verifier verifies cosign-signed OCI images against a fixed Policy.
+//
+// Cheap to construct, safe for concurrent use. The TUF trusted root is
+// fetched once per (root URL, cache path) tuple across all Verifiers in
+// the process — installing N backends from the same gallery does one TUF
+// fetch, not N.
+type Verifier struct {
+	policy Policy
+
+	// Registry plumbing — reused from the existing pkg/oci surface so we
+	// honor the same auth / transport conventions.
+	auth      *registrytypes.AuthConfig
+	transport http.RoundTripper
+}
+
+// NewVerifier constructs a Verifier. The trusted root is not fetched yet;
+// it is loaded on the first call to VerifyImage. auth and t may be nil.
+func NewVerifier(p Policy, auth *registrytypes.AuthConfig, t http.RoundTripper) (*Verifier, error) {
+	if err := p.Validate(); err != nil {
+		return nil, err
+	}
+	return &Verifier{policy: p, auth: auth, transport: t}, nil
+}
+
+// trustedMaterialCacheKey identifies which TUF mirror + on-disk cache a
+// Verifier wants. Two Verifiers with identical keys share trusted material.
+type trustedMaterialCacheKey struct {
+	URL  string
+	Path string
+}
+
+type trustedMaterialEntry struct {
+	once     sync.Once
+	material root.TrustedMaterialCollection
+	err      error
+}
+
+var trustedMaterialCache sync.Map // map[trustedMaterialCacheKey]*trustedMaterialEntry
+
+func (v *Verifier) loadTrustedMaterial() (root.TrustedMaterialCollection, error) {
+	key := trustedMaterialCacheKey{URL: v.policy.TUFRootURL, Path: v.policy.TUFCachePath}
+	val, _ := trustedMaterialCache.LoadOrStore(key, &trustedMaterialEntry{})
+	entry := val.(*trustedMaterialEntry)
+	entry.once.Do(func() {
+		opts := tuf.DefaultOptions()
+		if v.policy.TUFRootURL != "" {
+			opts.RepositoryBaseURL = v.policy.TUFRootURL
+		}
+		if v.policy.TUFCachePath != "" {
+			opts.CachePath = v.policy.TUFCachePath
+		}
+		client, err := tuf.New(opts)
+		if err != nil {
+			entry.err = fmt.Errorf("cosignverify: initialising TUF client: %w", err)
+			return
+		}
+		trustedRootJSON, err := client.GetTarget("trusted_root.json")
+		if err != nil {
+			entry.err = fmt.Errorf("cosignverify: fetching trusted_root.json: %w", err)
+			return
+		}
+		tr, err := root.NewTrustedRootFromJSON(trustedRootJSON)
+		if err != nil {
+			entry.err = fmt.Errorf("cosignverify: parsing trusted root: %w", err)
+			return
+		}
+		entry.material = root.TrustedMaterialCollection{tr}
+	})
+	return entry.material, entry.err
+}
+
+// VerifyImage resolves imageRef to its manifest digest, fetches the cosign
+// signature attachment (the conventional `:sha256-<hex>.sig` tag), assembles
+// a Sigstore bundle from the cosign annotations, and verifies that bundle
+// against the configured Policy.
+//
+// Returns nil on the first signature in the attachment that satisfies the
+// policy. Returns an error if none do, or if any part of the fetch fails.
+func (v *Verifier) VerifyImage(ctx context.Context, imageRef string) error {
+	if err := ctx.Err(); err != nil {
+		return err
+	}
+
+	trusted, err := v.loadTrustedMaterial()
+	if err != nil {
+		return err
+	}
+
+	ref, err := name.ParseReference(imageRef)
+	if err != nil {
+		return fmt.Errorf("cosignverify: parse image ref %q: %w", imageRef, err)
+	}
+
+	opts := v.remoteOptions(ctx)
+
+	// Resolve the image to its manifest digest. With the new-bundle-format
+	// flow the cosign signature is taken over the manifest digest directly,
+	// so this is also the artifact we ask the verifier to bind against.
+	// Skip the HEAD when the ref is already digest-pinned (the typical
+	// path from pkg/downloader, which resolves the digest before calling
+	// us): name.ParseReference returns a name.Digest in that case.
+	var digest v1.Hash
+	if d, ok := ref.(name.Digest); ok {
+		h, herr := v1.NewHash(d.DigestStr())
+		if herr != nil {
+			return fmt.Errorf("cosignverify: parsing pinned digest %q: %w", d.DigestStr(), herr)
+		}
+		digest = h
+	} else {
+		desc, herr := remote.Head(ref, opts...)
+		if herr != nil {
+			return fmt.Errorf("cosignverify: resolving image descriptor: %w", herr)
+		}
+		digest = desc.Digest
+	}
+
+	bun, err := bundleFromOCISignature(ref, digest, opts)
+	if err != nil {
+		return err
+	}
+
+	verifierOpts := []verify.VerifierOption{}
+	if boolOrTrue(v.policy.RequireSCT) {
+		verifierOpts = append(verifierOpts, verify.WithSignedCertificateTimestamps(1))
+	}
+	if boolOrTrue(v.policy.RequireTLog) {
+		verifierOpts = append(verifierOpts, verify.WithTransparencyLog(1))
+		verifierOpts = append(verifierOpts, verify.WithObserverTimestamps(1))
+	}
+
+	certID, err := verify.NewShortCertificateIdentity(
+		v.policy.Issuer,
+		v.policy.IssuerRegex,
+		v.policy.Identity,
+		v.policy.IdentityRegex,
+	)
+	if err != nil {
+		return fmt.Errorf("cosignverify: building identity policy: %w", err)
+	}
+
+	sev, err := verify.NewVerifier(trusted, verifierOpts...)
+	if err != nil {
+		return fmt.Errorf("cosignverify: constructing verifier: %w", err)
+	}
+
+	artifactDigest, err := hex.DecodeString(digest.Hex)
+	if err != nil {
+		return fmt.Errorf("cosignverify: decoding image digest: %w", err)
+	}
+	artifactPolicy := verify.WithArtifactDigest(digest.Algorithm, artifactDigest)
+
+	result, err := sev.Verify(bun, verify.NewPolicy(artifactPolicy, verify.WithCertificateIdentity(certID)))
+	if err != nil {
+		return fmt.Errorf("cosignverify: verification failed for %s: %w", imageRef, err)
+	}
+
+	if !v.policy.NotBefore.IsZero() {
+		if err := enforceNotBefore(result, v.policy.NotBefore); err != nil {
+			return fmt.Errorf("cosignverify: %s: %w", imageRef, err)
+		}
+	}
+	return nil
+}
+
+// enforceNotBefore rejects a verification result whose earliest verified
+// timestamp predates cutoff. Used as a revocation lever — see Policy.NotBefore.
+func enforceNotBefore(result *verify.VerificationResult, cutoff time.Time) error {
+	if result == nil || len(result.VerifiedTimestamps) == 0 {
+		// Defensive: with RequireTLog=true (the default) sigstore-go will
+		// have already failed verification if there was no verifiable
+		// timestamp, so this branch is only reachable if a caller set
+		// RequireTLog=false. Treat as a hard error: if you opted into
+		// NotBefore, you implicitly opted into needing a timestamp.
+		return errors.New("signature has no verified timestamp; cannot enforce NotBefore")
+	}
+	earliest := result.VerifiedTimestamps[0].Timestamp
+	for _, ts := range result.VerifiedTimestamps[1:] {
+		if ts.Timestamp.Before(earliest) {
+			earliest = ts.Timestamp
+		}
+	}
+	if earliest.Before(cutoff) {
+		return fmt.Errorf("earliest verified timestamp %s is before NotBefore cutoff %s",
+			earliest.Format(time.RFC3339), cutoff.Format(time.RFC3339))
+	}
+	return nil
+}
+
+func (v *Verifier) remoteOptions(ctx context.Context) []remote.Option {
+	t := v.transport
+	if t == nil {
+		t = http.DefaultTransport
+	}
+	// Match the retry policy used elsewhere in pkg/oci so transient
+	// registry hiccups don't fail verification.
+	t = transport.NewRetry(t)
+
+	opts := []remote.Option{
+		remote.WithContext(ctx),
+		remote.WithTransport(t),
+	}
+	if v.auth != nil {
+		opts = append(opts, remote.WithAuth(staticAuth{auth: v.auth}))
+	} else {
+		opts = append(opts, remote.WithAuthFromKeychain(authn.DefaultKeychain))
+	}
+	return opts
+}
+
+// staticAuth mirrors pkg/oci's adapter so callers can pass the same
+// docker auth config they use everywhere else.
+type staticAuth struct {
+	auth *registrytypes.AuthConfig
+}
+
+func (s staticAuth) Authorization() (*authn.AuthConfig, error) {
+	if s.auth == nil {
+		return &authn.AuthConfig{}, nil // anonymous credentials rather than a nil config
+	}
+	return &authn.AuthConfig{
+		Username:      s.auth.Username,
+		Password:      s.auth.Password,
+		Auth:          s.auth.Auth,
+		IdentityToken: s.auth.IdentityToken,
+		RegistryToken: s.auth.RegistryToken,
+	}, nil
+}
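The once-per-key loading used by `loadTrustedMaterial` can be sketched in isolation. This is a minimal illustration using only the standard library, not the package's code: `cacheKey`, `cacheEntry`, `loadOnce`, and the `fetch` callback are hypothetical names, and a plain string stands in for the trusted root.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cacheKey is a hypothetical stand-in for trustedMaterialCacheKey: a
// comparable struct, so it can be used directly as a sync.Map key.
type cacheKey struct {
	URL  string
	Path string
}

// cacheEntry holds the result of a single load attempt for one key.
type cacheEntry struct {
	once  sync.Once
	value string
	err   error
}

var (
	cache sync.Map // map[cacheKey]*cacheEntry
	loads int64    // counts how many times a fetch actually ran
)

// loadOnce runs fetch at most once per key, even with concurrent callers.
// Callers that lose the LoadOrStore race get the winner's entry and block
// in once.Do until the first fetch has finished.
func loadOnce(key cacheKey, fetch func() (string, error)) (string, error) {
	val, _ := cache.LoadOrStore(key, &cacheEntry{})
	entry := val.(*cacheEntry)
	entry.once.Do(func() {
		atomic.AddInt64(&loads, 1)
		entry.value, entry.err = fetch()
	})
	return entry.value, entry.err
}

func main() {
	key := cacheKey{URL: "https://tuf.example.invalid", Path: "/tmp/tuf-cache"}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = loadOnce(key, func() (string, error) { return "trusted-root", nil })
		}()
	}
	wg.Wait()
	fmt.Println("fetches:", atomic.LoadInt64(&loads)) // prints "fetches: 1"
}
```

One consequence of this shape is that a failed load is cached as well: the stored error is returned to every later caller with the same key for the lifetime of the process.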
--- a/pkg/oci/cosignverify/verify_test.go
+++ b/pkg/oci/cosignverify/verify_test.go
@@ -0,0 +1,98 @@
+package cosignverify_test
+
+import (
+	"context"
+	"os"
+	"time"
+
+	"github.com/mudler/LocalAI/pkg/oci/cosignverify"
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+var _ = Describe("Policy", func() {
+	It("rejects an empty policy", func() {
+		_, err := cosignverify.NewVerifier(cosignverify.Policy{}, nil, nil)
+		Expect(err).To(HaveOccurred())
+	})
+
+	It("rejects a policy missing the identity", func() {
+		_, err := cosignverify.NewVerifier(cosignverify.Policy{
+			Issuer: "https://token.actions.githubusercontent.com",
+		}, nil, nil)
+		Expect(err).To(HaveOccurred())
+	})
+
+	It("rejects a policy missing the issuer", func() {
+		_, err := cosignverify.NewVerifier(cosignverify.Policy{
+			IdentityRegex: `^https://github\.com/example/.*`,
+		}, nil, nil)
+		Expect(err).To(HaveOccurred())
+	})
+
+	It("constructs a verifier given a complete policy", func() {
+		v, err := cosignverify.NewVerifier(cosignverify.Policy{
+			Issuer:        "https://token.actions.githubusercontent.com",
+			IdentityRegex: `^https://github\.com/example/.*`,
+		}, nil, nil)
+		Expect(err).NotTo(HaveOccurred())
+		Expect(v).NotTo(BeNil())
+	})
+})
+
+// Live tests hit the public Sigstore TUF mirror, the source registry, and
+// (for positive cases) the Rekor log. Too flaky for the default suite —
+// gate on LOCALAI_COSIGN_LIVE=1.
+var _ = Describe("VerifyImage", func() {
+	BeforeEach(func() {
+		if os.Getenv("LOCALAI_COSIGN_LIVE") == "" {
+			Skip("set LOCALAI_COSIGN_LIVE=1 to run live cosign verification")
+		}
+	})
+
+	It("rejects an image without a Sigstore bundle referrer", func() {
+		v, err := cosignverify.NewVerifier(cosignverify.Policy{
+			Issuer:        "https://token.actions.githubusercontent.com",
+			IdentityRegex: `^https://github\.com/example/.*`,
+		}, nil, nil)
+		Expect(err).NotTo(HaveOccurred())
+
+		ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
+		defer cancel()
+
+		// alpine:latest is unsigned; the referrers API returns an empty
+		// (or 404 → empty) index, so we should see "no referrers" or
+		// "no bundle referrer" rather than a hard parse error.
+		err = v.VerifyImage(ctx, "alpine:latest")
+		Expect(err).To(HaveOccurred())
+	})
+
+	// End-to-end positive test. Requires:
+	//   LOCALAI_COSIGN_LIVE=1
+	//   LOCALAI_COSIGN_LIVE_IMAGE=<image-ref-signed-with-new-bundle-format>
+	//   LOCALAI_COSIGN_LIVE_ISSUER=<expected OIDC issuer>
+	//   LOCALAI_COSIGN_LIVE_IDENTITY_REGEX=<expected identity SAN regex>
+	//
+	// No defaults — we don't have a stable third-party image known to be
+	// signed in the new-bundle-format yet. Once the local-ai-backends CI
+	// is signing images, plug one of those refs in here.
+	It("verifies a signed image when LOCALAI_COSIGN_LIVE_IMAGE is set", func() {
+		image := os.Getenv("LOCALAI_COSIGN_LIVE_IMAGE")
+		issuer := os.Getenv("LOCALAI_COSIGN_LIVE_ISSUER")
+		identityRegex := os.Getenv("LOCALAI_COSIGN_LIVE_IDENTITY_REGEX")
+		if image == "" || issuer == "" || identityRegex == "" {
+			Skip("set LOCALAI_COSIGN_LIVE_IMAGE / _ISSUER / _IDENTITY_REGEX to run the positive case")
+		}
+
+		v, err := cosignverify.NewVerifier(cosignverify.Policy{
+			Issuer:        issuer,
+			IdentityRegex: identityRegex,
+		}, nil, nil)
+		Expect(err).NotTo(HaveOccurred())
+
+		ctx, cancel := context.WithTimeout(context.Background(), 90*time.Second)
+		defer cancel()
+
+		Expect(v.VerifyImage(ctx, image)).To(Succeed())
+	})
+})
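The test policies above use identity regexes with and without escaped dots. The difference can be shown with Go's standard `regexp` package alone. This sketch only demonstrates `regexp` semantics; it does not claim to reproduce how the verifier evaluates `IdentityRegex`, and the `matches` helper and sample identities are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether identity satisfies pattern under Go regexp rules.
func matches(pattern, identity string) bool {
	return regexp.MustCompile(pattern).MatchString(identity)
}

func main() {
	loose := `^https://github.com/example/.*`   // "." matches any character
	strict := `^https://github\.com/example/.*` // "\." matches a literal dot

	genuine := "https://github.com/example/repo/.github/workflows/build.yml@refs/heads/main"
	lookalike := "https://githubXcom/example/repo/.github/workflows/build.yml@refs/heads/main"

	fmt.Println(matches(loose, genuine), matches(strict, genuine))     // prints "true true"
	fmt.Println(matches(loose, lookalike), matches(strict, lookalike)) // prints "true false"
}
```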
--- a/pkg/utils/untar.go
+++ b/pkg/utils/untar.go
@@ -1,9 +1,13 @@
 package utils

 import (
+	"archive/tar"
 	"fmt"
 	"os"
+	"path/filepath"
+	"strings"

+	"github.com/klauspost/compress/zip"
 	"github.com/mholt/archiver/v3"
 )

@@ -54,7 +58,15 @@ func ExtractArchive(archive, dst string) error {
 		v.Tar = mytar
 	}

+	extractRoot, err := filepath.Abs(dst)
+	if err != nil {
+		return err
+	}
+
 	err = archiver.Walk(archive, func(f archiver.File) error {
+		if err := validateArchiveMemberPath(extractRoot, archiveMemberName(f)); err != nil {
+			return err
+		}
 		if f.FileInfo.Mode()&os.ModeSymlink != 0 {
 			return fmt.Errorf("archive contains a symlink")
 		}
@@ -67,3 +79,41 @@ func ExtractArchive(archive, dst string) error {

 	return un.Unarchive(archive, dst)
 }
+
+func archiveMemberName(f archiver.File) string {
+	switch h := f.Header.(type) {
+	case tar.Header:
+		return h.Name
+	case *tar.Header:
+		return h.Name
+	case zip.FileHeader:
+		return h.Name
+	case *zip.FileHeader:
+		return h.Name
+	default:
+		return f.Name()
+	}
+}
+
+func validateArchiveMemberPath(root, name string) error {
+	if name == "" {
+		return fmt.Errorf("archive contains an empty path")
+	}
+
+	normalizedName := filepath.FromSlash(strings.ReplaceAll(name, "\\", "/"))
+	cleanedName := filepath.Clean(normalizedName)
+	if filepath.IsAbs(cleanedName) || cleanedName == ".." || strings.HasPrefix(cleanedName, ".."+string(os.PathSeparator)) {
+		return fmt.Errorf("archive contains an unsafe path: %q", name)
+	}
+
+	targetPath := filepath.Join(root, cleanedName)
+	relativePath, err := filepath.Rel(root, targetPath)
+	if err != nil {
+		return err
+	}
+	if relativePath == ".." || strings.HasPrefix(relativePath, ".."+string(os.PathSeparator)) || filepath.IsAbs(relativePath) {
+		return fmt.Errorf("archive contains an unsafe path: %q", name)
+	}
+
+	return nil
+}
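The containment check in `validateArchiveMemberPath` can be tried against a few member names with a standalone variant. This is a sketch, not the function from the diff: `safeJoin` and `errUnsafePath` are hypothetical names, and unlike the original it returns the resolved target path. The sample root assumes a Unix-style filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errUnsafePath = errors.New("unsafe archive path")

// safeJoin resolves an archive member name under root and rejects names
// that are empty, absolute, or that climb out of root via "..".
func safeJoin(root, name string) (string, error) {
	if name == "" {
		return "", errUnsafePath
	}
	// Treat backslashes as separators so Windows-style names are checked too.
	cleaned := filepath.Clean(filepath.FromSlash(strings.ReplaceAll(name, `\`, "/")))
	if filepath.IsAbs(cleaned) {
		return "", errUnsafePath
	}
	target := filepath.Join(root, cleaned)
	rel, err := filepath.Rel(root, target)
	if err != nil {
		return "", err
	}
	// A relative path that starts with ".." points outside root.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errUnsafePath
	}
	return target, nil
}

func main() {
	root := "/srv/models" // illustrative extraction root
	names := []string{
		"nested/model.yaml",
		"..data/file", // leading dots in a file name are not traversal
		"../escaped.txt",
		"a/../../b",
		`..\evil.txt`,
	}
	for _, name := range names {
		target, err := safeJoin(root, name)
		fmt.Printf("%-20q target=%q err=%v\n", name, target, err)
	}
}
```

Comparing against `".." + separator` rather than a bare `".."` prefix keeps legitimate names such as `..data/file` from being rejected.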
--- a/pkg/utils/untar_test.go
+++ b/pkg/utils/untar_test.go
@@ -0,0 +1,128 @@
+package utils_test
+
+import (
+	"archive/tar"
+	"archive/zip"
+	"os"
+	"path/filepath"
+
+	. "github.com/mudler/LocalAI/pkg/utils"
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+var _ = Describe("utils/archive tests", func() {
+	It("extracts regular nested zip members", func() {
+		tmpDir := GinkgoT().TempDir()
+		archivePath := filepath.Join(tmpDir, "model.zip")
+		extractPath := filepath.Join(tmpDir, "models")
+
+		Expect(writeZipArchive(archivePath, map[string]string{
+			"nested/model.yaml": "name: test",
+		})).To(Succeed())
+
+		Expect(ExtractArchive(archivePath, extractPath)).To(Succeed())
+
+		extracted, err := os.ReadFile(filepath.Join(extractPath, "nested", "model.yaml"))
+		Expect(err).ToNot(HaveOccurred())
+		Expect(string(extracted)).To(Equal("name: test"))
+	})
+
+	It("rejects zip members that escape the destination", func() {
+		tmpDir := GinkgoT().TempDir()
+		archivePath := filepath.Join(tmpDir, "model.zip")
+		extractPath := filepath.Join(tmpDir, "models")
+
+		Expect(writeZipArchive(archivePath, map[string]string{
+			"../escaped.txt": "escaped",
+		})).To(Succeed())
+
+		err := ExtractArchive(archivePath, extractPath)
+
+		Expect(err).To(HaveOccurred())
+		Expect(err.Error()).To(ContainSubstring("unsafe path"))
+		Expect(filepath.Join(tmpDir, "escaped.txt")).ToNot(BeAnExistingFile())
+	})
+
+	It("rejects tar members that escape the destination", func() {
+		tmpDir := GinkgoT().TempDir()
+		archivePath := filepath.Join(tmpDir, "model.tar")
+		extractPath := filepath.Join(tmpDir, "models")
+
+		Expect(writeTarArchive(archivePath, map[string]string{
+			"../escaped.txt": "escaped",
+		})).To(Succeed())
+
+		err := ExtractArchive(archivePath, extractPath)
+
+		Expect(err).To(HaveOccurred())
+		Expect(err.Error()).To(ContainSubstring("unsafe path"))
+		Expect(filepath.Join(tmpDir, "escaped.txt")).ToNot(BeAnExistingFile())
+	})
+})
+
+func writeZipArchive(path string, files map[string]string) (err error) {
+	out, err := os.Create(path)
+	if err != nil {
+		return err
+	}
+	defer func() {
+		if closeErr := out.Close(); err == nil {
+			err = closeErr
+		}
+	}()
+
+	writer := zip.NewWriter(out)
+	defer func() {
+		if closeErr := writer.Close(); err == nil {
+			err = closeErr
+		}
+	}()
+
+	for name, contents := range files {
+		fileWriter, err := writer.Create(name)
+		if err != nil {
+			return err
+		}
+		if _, err := fileWriter.Write([]byte(contents)); err != nil {
+			return err
+		}
+	}
+
+	return nil
+}
+
+func writeTarArchive(path string, files map[string]string) (err error) {
+	out, err := os.Create(path)
+	if err != nil {
+		return err
+	}
+	defer func() {
+		if closeErr := out.Close(); err == nil {
+			err = closeErr
+		}
+	}()
+
+	writer := tar.NewWriter(out)
+	defer func() {
+		if closeErr := writer.Close(); err == nil {
+			err = closeErr
+		}
+	}()
+
+	for name, contents := range files {
+		data := []byte(contents)
+		if err := writer.WriteHeader(&tar.Header{
+			Name: name,
+			Mode: 0o600,
+			Size: int64(len(data)),
+		}); err != nil {
+			return err
+		}
+		if _, err := writer.Write(data); err != nil {
+			return err
+		}
+	}
+
+	return nil
+}
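The archive helpers above use a named return value so that a deferred `Close` error is reported only when nothing failed earlier. A minimal sketch of that pattern follows, with a hypothetical `fakeFile` type and `writeAll` function standing in for the real file and archive writers.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errWrite = errors.New("write failed")
	errClose = errors.New("close failed")
)

// fakeFile is a hypothetical closer whose Close result can be chosen.
type fakeFile struct{ closeErr error }

func (f *fakeFile) Close() error { return f.closeErr }

// writeAll returns writeErr if it is non-nil; otherwise it returns whatever
// Close reports. The named result lets the deferred function overwrite it.
func writeAll(f *fakeFile, writeErr error) (err error) {
	defer func() {
		if closeErr := f.Close(); err == nil {
			err = closeErr
		}
	}()
	return writeErr
}

func main() {
	fmt.Println(writeAll(&fakeFile{}, nil))                        // prints "<nil>"
	fmt.Println(writeAll(&fakeFile{closeErr: errClose}, nil))      // prints "close failed"
	fmt.Println(writeAll(&fakeFile{closeErr: errClose}, errWrite)) // prints "write failed"
}
```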
--- a/swagger/docs.go
+++ b/swagger/docs.go
@@ -5347,6 +5347,14 @@ const docTemplate = `{
                "stream": {
                    "type": "boolean"
                },
+                "stream_options": {
+                    "description": "StreamOptions opts into OpenAI streaming extensions, e.g. include_usage.",
+                    "allOf": [
+                        {
+                            "$ref": "#/definitions/schema.StreamOptions"
+                        }
+                    ]
+                },
                "temperature": {
                    "type": "number"
                },
@@ -5412,7 +5420,12 @@ const docTemplate = `{
                    "type": "string"
                },
                "usage": {
-                    "$ref": "#/definitions/schema.OpenAIUsage"
+                    "description": "Usage is intentionally a pointer with omitempty: per the OpenAI\nchat-completion streaming spec, intermediate chunks must not carry\na ` + "`" + `usage` + "`" + ` field. Marshalling a value-typed usage would emit\n` + "`" + `\"usage\":{\"prompt_tokens\":0,...}` + "`" + ` on every chunk and break\nOpenAI-SDK consumers that filter on a truthy ` + "`" + `result.usage` + "`" + `\n(continuedev/continue, Kilo Code, Roo Code, etc.).",
+                    "allOf": [
+                        {
+                            "$ref": "#/definitions/schema.OpenAIUsage"
+                        }
+                    ]
                }
            }
        },
@@ -5578,6 +5591,14 @@ const docTemplate = `{
                }
            }
        },
+        "schema.StreamOptions": {
+            "type": "object",
+            "properties": {
+                "include_usage": {
+                    "type": "boolean"
+                }
+            }
+        },
        "schema.SysInfoModel": {
            "type": "object",
            "properties": {
--- a/swagger/swagger.json
+++ b/swagger/swagger.json
@@ -5344,6 +5344,14 @@
                "stream": {
                    "type": "boolean"
                },
+                "stream_options": {
+                    "description": "StreamOptions opts into OpenAI streaming extensions, e.g. include_usage.",
+                    "allOf": [
+                        {
+                            "$ref": "#/definitions/schema.StreamOptions"
+                        }
+                    ]
+                },
                "temperature": {
                    "type": "number"
                },
@@ -5409,7 +5417,12 @@
                    "type": "string"
                },
                "usage": {
-                    "$ref": "#/definitions/schema.OpenAIUsage"
+                    "description": "Usage is intentionally a pointer with omitempty: per the OpenAI\nchat-completion streaming spec, intermediate chunks must not carry\na `usage` field. Marshalling a value-typed usage would emit\n`\"usage\":{\"prompt_tokens\":0,...}` on every chunk and break\nOpenAI-SDK consumers that filter on a truthy `result.usage`\n(continuedev/continue, Kilo Code, Roo Code, etc.).",
+                    "allOf": [
+                        {
+                            "$ref": "#/definitions/schema.OpenAIUsage"
+                        }
+                    ]
                }
            }
        },
@@ -5575,6 +5588,14 @@
                }
            }
        },
+        "schema.StreamOptions": {
+            "type": "object",
+            "properties": {
+                "include_usage": {
+                    "type": "boolean"
+                }
+            }
+        },
        "schema.SysInfoModel": {
            "type": "object",
            "properties": {
--- a/swagger/swagger.yaml
+++ b/swagger/swagger.yaml
@@ -1650,6 +1650,10 @@ definitions:
      stop: {}
      stream:
        type: boolean
+      stream_options:
+        allOf:
+        - $ref: '#/definitions/schema.StreamOptions'
+        description: StreamOptions opts into OpenAI streaming extensions, e.g. include_usage.
      temperature:
        type: number
      tfz:
@@ -1698,7 +1702,15 @@ definitions:
      object:
        type: string
      usage:
-        $ref: '#/definitions/schema.OpenAIUsage'
+        allOf:
+        - $ref: '#/definitions/schema.OpenAIUsage'
+        description: |-
+          Usage is intentionally a pointer with omitempty: per the OpenAI
+          chat-completion streaming spec, intermediate chunks must not carry
+          a `usage` field. Marshalling a value-typed usage would emit
+          `"usage":{"prompt_tokens":0,...}` on every chunk and break
+          OpenAI-SDK consumers that filter on a truthy `result.usage`
+          (continuedev/continue, Kilo Code, Roo Code, etc.).
    type: object
  schema.OpenAIUsage:
    properties:
@@ -1813,6 +1825,11 @@ definitions:
          $ref: '#/definitions/schema.NodeData'
        type: array
    type: object
+  schema.StreamOptions:
+    properties:
+      include_usage:
+        type: boolean
+    type: object
  schema.SysInfoModel:
    properties:
      id:
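The `usage` description above turns on how `encoding/json` treats `omitempty` for struct values versus pointers. The following sketch shows the difference with hypothetical types (`usage`, `chunkByValue`, `chunkByPointer`); they are stand-ins for illustration, not the project's schema definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// chunkByValue embeds usage by value: omitempty never omits a struct value.
type chunkByValue struct {
	ID    string `json:"id"`
	Usage usage  `json:"usage,omitempty"`
}

// chunkByPointer uses a pointer: a nil pointer is omitted entirely.
type chunkByPointer struct {
	ID    string `json:"id"`
	Usage *usage `json:"usage,omitempty"`
}

func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Intermediate chunk, value-typed: a zeroed usage object is emitted.
	fmt.Println(mustJSON(chunkByValue{ID: "c1"}))
	// prints {"id":"c1","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

	// Intermediate chunk, pointer-typed: no usage field at all.
	fmt.Println(mustJSON(chunkByPointer{ID: "c1"}))
	// prints {"id":"c1"}

	// Final chunk with usage populated.
	fmt.Println(mustJSON(chunkByPointer{ID: "c2", Usage: &usage{PromptTokens: 3, CompletionTokens: 5, TotalTokens: 8}}))
	// prints {"id":"c2","usage":{"prompt_tokens":3,"completion_tokens":5,"total_tokens":8}}
}
```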