LocalAI/backend/cpp/llama-cpp-localai-paged/docs
Ettore Di Giacinto b2784ccbca docs(paged): fix EXECUTION_REARCH_SCOPE seam citations to fork 1edddc8fe
Adversarial verification against the canonical fork mudler/llama.cpp:localai-paged
HEAD 1edddc8fe found the scope doc's section-3 seam references were anchored to
the abandoned pre-trim tree 237ad9b96, which the immediately preceding commit
b529cc5420 reset away. Two classes of defect, both corrected:

- Phantom scaffolding (honesty): the doc claimed "the team has already started
  scaffolding P1 and P3" citing four commits (237ad9b96 bf16 GDN state cache,
  afc2c7030 act-quant trace, ea0875d14 LLAMA_BF16_CUBLAS_F32_OUT, 7967ad47f
  W4A16 direct-A stub) that b529cc5420 TRIMMED - none exist at 1edddc8fe (git
  cat-file: not a valid object). w4a16-policy.h, test-cuda-w4a16-policy.cpp and
  ggml_cuda_mul_mat_id_w4a16_grouped_direct_a are absent from the tree. Reworded
  P1 plank-1 and the P3 mechanism/files/effort to say these must be re-introduced
  on top of the surviving grouped W4A16 path (patch 0035), not "finished".

- Stale line numbers (additivity): every file:line was off (computed against the
  larger 237ad9b96 tree). Re-anchored to 1edddc8fe: ggml_cuda_try_fuse 4232 (was
  4661), capture loop 4908 (was 5444), moe whole-pattern matcher 4157 (was 4678),
  routed_ffn_poc moe-ffn.cu:275 (was 456), grouped W4A16 hook ggml-cuda.cu:2797
  (was 3093/3188; the direct-A hooks 3085/3171 never existed), concurrent_event
  machinery 4769 (was 5305-5318), continuous-batch budget server-context.cpp
  3083-3135 with LLAMA_MAX_BATCH_TOKENS at 3105 / prefill_budget_step at 3113
  (was 3122-3200).
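
The re-anchoring described above can be sketched as a small check that compares a
cited file:line against where the symbol actually sits in a given tree. This is a
minimal illustration only, not the procedure used for this commit: the helper
names find_symbol_lines and check_citation are hypothetical, and the sample
source text is made up for the example.

```python
from typing import List


def find_symbol_lines(source: str, symbol: str) -> List[int]:
    """Return the 1-based line numbers in `source` that contain `symbol`."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if symbol in line
    ]


def check_citation(source: str, symbol: str, cited_line: int) -> str:
    """Classify a file:line citation as 'ok', 'stale', or 'missing'.

    Hypothetical helper: 'missing' means the symbol is absent from the tree
    (the phantom-scaffolding case), 'stale' means it exists but at another line.
    """
    hits = find_symbol_lines(source, symbol)
    if not hits:
        return "missing"
    if cited_line in hits:
        return "ok"
    return f"stale (found at {hits[0]})"


if __name__ == "__main__":
    # Made-up file content standing in for a checked-out source file.
    sample = "// header\nstatic bool try_fuse() { return true; }\n// footer\n"
    print(check_citation(sample, "try_fuse", 2))
    print(check_citation(sample, "try_fuse", 7))
    print(check_citation(sample, "direct_a", 3))
```

Run against each cited file as it exists at the target commit, a check of this
shape separates citations that only need a new line number from those whose
symbol no longer exists at all.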

Numbers (attribution table, recovery arithmetic), the six P0 kill-gates, and the
unreachable-floor honesty were verified sound and left unchanged.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-07-02 11:03:07 +00:00