diff --git a/backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md b/backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md
index bb32b97fa..a0982e27c 100644
--- a/backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md
+++ b/backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md
@@ -16,6 +16,43 @@ Status: in progress.
 
 No baseline runs have been started yet.
 
+## Clean Build
+
+First clean build attempt:
+
+- PID: `625392`
+- Source checkout: `~/llama-paged-reopen-clean`
+- Result: failed during CMake configure.
+- Root cause: `nvcc` was not discoverable on PATH. CUDA headers were found under
+  `/usr/local/cuda/targets/sbsa-linux/include`, and the compiler exists at
+  `/usr/local/cuda-13.0/bin/nvcc`.
+- Retry plan: rebuild the clean checkout with
+  `CUDACXX=/usr/local/cuda-13.0/bin/nvcc`.
+
+Second clean build attempt:
+
+- PID: `631100`
+- Source checkout: `~/llama-paged-reopen-clean`
+- Source status: `## HEAD (no branch)`
+- Build HEAD: `51168c5eee2e35348d9006f0b2fab3dc6e7c01cc`
+- CUDA compiler: `/usr/local/cuda-13.0/bin/nvcc`
+- Result: succeeded.
+- Binary mtimes:
+  - `build-cuda/bin/llama-server 2026-06-30 22:14:34.091312112 +0200`
+  - `build-cuda/bin/llama-batched-bench 2026-06-30 22:14:35.156287566 +0200`
+  - `build-cuda/bin/llama-completion 2026-06-30 22:14:37.095750242 +0200`
+  - `build-cuda/bin/test-backend-ops 2026-06-30 22:14:47.360078186 +0200`
+
+## Canonical Gates
+
+- MoE greedy md5: `8cb0ce23777bf55f92f63d0292c756b0` (matched expected)
+- Dense greedy md5: `5951a5b4d624ce891e22ab5fca9bc439` (matched expected)
+- Artifacts:
+  - `~/bench/reopen_phase0/gate_moe.txt`
+  - `~/bench/reopen_phase0/gate_moe.md5`
+  - `~/bench/reopen_phase0/gate_dense.txt`
+  - `~/bench/reopen_phase0/gate_dense.md5`
+
 ## Source Provenance
 
 - Local llama.cpp fork: `/home/mudler/_git/llama.cpp`
diff --git a/docs/superpowers/plans/2026-06-30-gb10-parity-reopen.md b/docs/superpowers/plans/2026-06-30-gb10-parity-reopen.md
index 08eb76fed..04044f7ef 100644
--- a/docs/superpowers/plans/2026-06-30-gb10-parity-reopen.md
+++ b/docs/superpowers/plans/2026-06-30-gb10-parity-reopen.md
@@ -36,7 +36,7 @@
 **Files:**
 - Create: `backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md`
 
-- [ ] **Step 1: Confirm the current worktree state**
+- [x] **Step 1: Confirm the current worktree state**
 
 Run:
 
@@ -52,7 +52,7 @@ Expected:
 ?? .claude/
 ```
 
-- [ ] **Step 2: Run DGX preflight without starting workloads**
+- [x] **Step 2: Run DGX preflight without starting workloads**
 
 Run:
 
@@ -80,7 +80,7 @@ gpu lock is FREE or NO_OWNER
 DGX source states are recorded, even if dirty
 ```
 
-- [ ] **Step 3: Create the Phase 0 artifact directory on DGX**
+- [x] **Step 3: Create the Phase 0 artifact directory on DGX**
 
 Run:
 
@@ -101,7 +101,7 @@ Expected:
 ~/bench/reopen_phase0 exists and contains created_utc.txt, hostname.txt, docker_ps.txt, compute_apps.txt, gpu_lock_owner.txt
 ```
 
-- [ ] **Step 4: Write the initial Phase 0 results document from captured values**
+- [x] **Step 4: Write the initial Phase 0 results document from captured values**
 
 Run:
 
@@ -139,7 +139,7 @@ No baseline runs have been started yet.
 EOF
 ```
 
-- [ ] **Step 5: Commit Task 1**
+- [x] **Step 5: Commit Task 1**
 
 Run:
 
@@ -161,7 +161,7 @@ Commit succeeds with only GB10_PARITY_PHASE0_RESULTS.md staged.
 **Files:**
 - Modify: `backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md`
 
-- [ ] **Step 1: Record local source truth**
+- [x] **Step 1: Record local source truth**
 
 Run:
 
@@ -180,7 +180,7 @@ HEAD is 51168c5eee2e35348d9006f0b2fab3dc6e7c01cc
 merge-base is 0ed235ea2c17a19fc8238668653946721ed136fd
 ```
 
-- [ ] **Step 2: Verify LocalAI patch mirror invariant**
+- [x] **Step 2: Verify LocalAI patch mirror invariant**
 
 Run:
 
@@ -201,7 +201,7 @@ Expected:
 Exit code 0.
 ```
 
-- [ ] **Step 3: Write clean source provenance into Phase 0 results**
+- [x] **Step 3: Write clean source provenance into Phase 0 results**
 
 Update `GB10_PARITY_PHASE0_RESULTS.md`:
 
@@ -215,7 +215,7 @@ Update `GB10_PARITY_PHASE0_RESULTS.md`:
 - LocalAI patch mirror: applies cleanly and tree-matches fork HEAD.
 ```
 
-- [ ] **Step 4: Commit Task 2**
+- [x] **Step 4: Commit Task 2**
 
 Run:
 
@@ -237,7 +237,7 @@ Commit succeeds.
 **Files:**
 - Modify: `backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md`
 
-- [ ] **Step 1: Extract the current artifact-backed numbers**
+- [x] **Step 1: Extract the current artifact-backed numbers**
 
 Run:
 
@@ -261,7 +261,7 @@ Expected:
 existing_artifact_extract.txt is created and shows CDEF, paged highN, and vLLM highN evidence.
 ```
 
-- [ ] **Step 2: Update Phase 0 results with artifact gaps**
+- [x] **Step 2: Update Phase 0 results with artifact gaps**
 
 Add:
 
@@ -276,7 +276,7 @@ Add:
   `51168c5ee`; this must be separated from current production-source baselines.
 ```
 
-- [ ] **Step 3: Commit Task 3**
+- [x] **Step 3: Commit Task 3**
 
 Run:
 
@@ -298,7 +298,7 @@ Commit succeeds.
 **Files:**
 - Modify: `backend/cpp/llama-cpp-localai-paged/docs/GB10_PARITY_PHASE0_RESULTS.md`
 
-- [ ] **Step 1: Re-run DGX preflight immediately before build**
+- [x] **Step 1: Re-run DGX preflight immediately before build**
 
 Run:
 
@@ -316,7 +316,7 @@ Expected:
 Exit code 0.
 ```
 
-- [ ] **Step 2: Start a detached clean build**
+- [x] **Step 2: Start a detached clean build**
 
 Run:
 
@@ -355,7 +355,13 @@ Expected:
 Command returns quickly and writes build_clean.pid.
 ```
 
-- [ ] **Step 3: Poll build completion**
+- [x] **Step 3: Poll build completion**
+
+Note: first build attempt started as PID `625392` and failed during CMake
+configure because `nvcc` was not on PATH. DGX has
+`/usr/local/cuda-13.0/bin/nvcc`; retry uses explicit `CUDACXX`.
+
+Retry build attempt started as PID `631100` and completed successfully.
 
 Run:
 
@@ -384,7 +390,7 @@ Expected:
 DONE
 ```
 
-- [ ] **Step 4: Run canonical md5 gates**
+- [x] **Step 4: Run canonical md5 gates**
 
 Run:
 
@@ -406,7 +412,7 @@ MoE md5 is 8cb0ce23777bf55f92f63d0292c756b0
 Dense md5 is 5951a5b4d624ce891e22ab5fca9bc439
 ```
 
-- [ ] **Step 5: Update Phase 0 results and commit**
+- [x] **Step 5: Update Phase 0 results and commit**
 
 Add build SHA, binary mtimes, gate md5s, and whether they matched expectations.
 