LocalAI/backend/cpp/llama-cpp-localai-paged/docs/final_benchmark.csv
Ettore Di Giacinto 08b754f910 chore(paged): keep patches/ patch-only; README to backend root, docs to docs/
The llama-cpp-localai-paged patches/ dir had accumulated docs, plots, a csv,
dev .cpp harnesses, and a dead FP4-MoE kernel scaffold after an earlier git-mv.
Restore the invariant that patches/ holds only the .patch series.

Moves:
- patches/paged/README.md -> README.md (canonical doc at the backend root)
- patches/paged/{PIN_SYNC_c299a92c,PAGED_BITEXACT_NOTE,LOCALAI_LLAMACPP_BACKEND_PLAN,UPSTREAM_LAYER2_SCOPE}.md,
  final_benchmark.csv, qwen36_*.png, paged-burst-bench.cpp, paged-reclaim-unit.cpp -> docs/
- patches/README.md -> docs/PATCH_MAINTENANCE.md (unique patch-regen recipe not in the canonical README)

Deletes:
- patches/BENCHMARKS.md (superseded by README section 4 + the dev-notes section)
- patches/kernel/ (dead FP4-MoE scaffold, never in the 0001-0030 apply glob, zero refs repo-wide)

Repoint every reference to the moved files: README internal links (docs/ + the
.github links drop from 5x ../ to 3x ../), .agents/llama-cpp-localai-paged-backend.md,
.github/scripts/paged-canary-apply.sh, .github/workflows/llama-cpp-paged-canary.yml,
the wrapper Makefile, backend/cpp/llama-cpp/grpc-server.cpp, backend/index.yaml,
docs/content/features/backends.md, gallery/index.yaml.

The build apply glob PAGED_PATCHES_DIR/0*.patch (PAGED_PATCHES_DIR := .../patches/paged)
is unchanged and still resolves to the 28 patches.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-27 13:20:05 +00:00

983 B

model,engine,npl,decode_agg_tps,decode_perseq_tps,prefill_tps,ttft_mean_ms,peak_gb
q36-27b-nvfp4,llama,8,82.5,9.57,507.3,6038.1,53.51
q36-27b-nvfp4,llama,32,192.6,4.79,115.0,133551.7,69.63
q36-27b-nvfp4,llama,64,277.8,3.09,95.9,321618.8,83.96
q36-27b-nvfp4,llama,128,384.6,1.86,69.7,902762.7,93.82
q36-27b-nvfp4,vllm,8,70.4,8.76,2096.2,1861.1,110.92
q36-27b-nvfp4,vllm,32,211.8,6.28,2182.6,5353.2,110.87
q36-27b-nvfp4,vllm,64,309.1,4.38,2088.9,9512.4,110.88
q36-27b-nvfp4,vllm,128,418.8,2.79,1929.1,18449.5,110.95
q36-35b-a3b-nvfp4,llama,8,211.8,24.45,1236.4,2477.1,39.66
q36-35b-a3b-nvfp4,llama,32,393.0,10.02,1213.9,8225.2,47.11
q36-35b-a3b-nvfp4,llama,64,527.0,6.15,1152.3,15849.5,57.13
q36-35b-a3b-nvfp4,llama,128,726.4,3.73,276.8,213017.2,61.51
q36-35b-a3b-nvfp4,vllm,8,256.5,31.84,5186.5,768.8,109.62
q36-35b-a3b-nvfp4,vllm,32,500.8,14.90,6223.4,1830.4,109.63
q36-35b-a3b-nvfp4,vllm,64,686.1,9.83,5926.5,3224.4,109.63
q36-35b-a3b-nvfp4,vllm,128,882.2,6.05,5300.5,6487.7,109.64
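
To compare the two engines at a given concurrency, the rows can be paired by (model, npl) and divided column by column. The snippet below is a minimal sketch, not part of the repository: the helper name `llama_vs_vllm` is hypothetical, and the embedded excerpt (the four npl=128 rows) assumes the column boundaries inferred from the flattened listing, i.e. one decimal for `decode_agg_tps`, `prefill_tps` and `ttft_mean_ms`, and two for `decode_perseq_tps` and `peak_gb`.

```python
import csv
import io

# Excerpt of docs/final_benchmark.csv (npl=128 rows only).
# Assumption: field boundaries are as inferred from the flattened listing.
SAMPLE = """\
model,engine,npl,decode_agg_tps,decode_perseq_tps,prefill_tps,ttft_mean_ms,peak_gb
q36-27b-nvfp4,llama,128,384.6,1.86,69.7,902762.7,93.82
q36-27b-nvfp4,vllm,128,418.8,2.79,1929.1,18449.5,110.95
q36-35b-a3b-nvfp4,llama,128,726.4,3.73,276.8,213017.2,61.51
q36-35b-a3b-nvfp4,vllm,128,882.2,6.05,5300.5,6487.7,109.64
"""


def llama_vs_vllm(csv_text, column="decode_agg_tps"):
    """Return {(model, npl): llama_value / vllm_value} for one numeric column.

    Hypothetical helper for illustration; only pairs that have both an
    'llama' row and a 'vllm' row are included in the result.
    """
    by_key = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["model"], int(row["npl"]))
        by_key.setdefault(key, {})[row["engine"]] = float(row[column])
    return {
        key: engines["llama"] / engines["vllm"]
        for key, engines in by_key.items()
        if "llama" in engines and "vllm" in engines
    }


ratios = llama_vs_vllm(SAMPLE)
for (model, npl), ratio in sorted(ratios.items()):
    print(f"{model} npl={npl}: llama/vllm decode_agg_tps = {ratio:.2f}")
```

Passing `column="peak_gb"` or `column="ttft_mean_ms"` gives the same pairing for memory and time to first token. To run it against the whole file, read `docs/final_benchmark.csv` into a string and pass that in place of `SAMPLE`.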