mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-11 18:27:32 -04:00
spiritbuun/buun-llama-cpp is a fork of TheTom/llama-cpp-turboquant that adds two independent features on top: DFlash block-diffusion speculative decoding (via a dedicated DFlashDraftModel GGUF arch) and two extra TCQ KV-cache variants (turbo2_tcq, turbo3_tcq) on top of TurboQuant's turbo2/turbo3/turbo4.

Follows the turboquant thin-wrapper pattern: reuses backend/cpp/llama-cpp grpc-server sources verbatim, patches only the build copy to extend the KV allow-list and wire up buun-exclusive tree_budget / draft_topk options.

DraftModel is already wired end-to-end (proto field 39 → params.speculative), so DFlash activation only needs the existing options passthrough (spec_type:dflash) plus the drafter path in draft_model.

CacheTypeOptions now surfaces the five turbo* values so the React UI dropdown shows them, which benefits turboquant too (previously users had to type them in YAML manually).

Assisted-by: Claude:Opus-4.7 [Read] [Edit] [Bash] [WebFetch]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
51 lines
1.3 KiB
Bash
Executable File
#!/bin/bash
# Apply the buun-llama-cpp patch series to a cloned buun-llama-cpp checkout.
#
# buun-llama-cpp is a fork-of-a-fork that branched off upstream llama.cpp
# before some API changes the shared backend/cpp/llama-cpp/grpc-server.cpp
# depends on. We carry those upstream commits as patch files under
# backend/cpp/buun-llama-cpp/patches/ and apply them here so the reused
# grpc-server source compiles against the fork unmodified.
#
# Drop the corresponding patch from patches/ whenever the fork catches up with
# upstream: the build will fail fast if a patch stops applying, which is the
# signal to retire it.

set -euo pipefail

if [[ $# -ne 2 ]]; then
    echo "usage: $0 <llama.cpp-src-dir> <patches-dir>" >&2
    exit 2
fi

SRC_DIR=$1
PATCHES_DIR=$2

if [[ ! -d "$SRC_DIR" ]]; then
    echo "source dir does not exist: $SRC_DIR" >&2
    exit 2
fi

if [[ ! -d "$PATCHES_DIR" ]]; then
    echo "no patches dir at $PATCHES_DIR, nothing to apply"
    exit 0
fi

shopt -s nullglob
patches=("$PATCHES_DIR"/*.patch)
shopt -u nullglob

if [[ ${#patches[@]} -eq 0 ]]; then
    echo "no .patch files in $PATCHES_DIR, nothing to apply"
    exit 0
fi

cd "$SRC_DIR"

for patch in "${patches[@]}"; do
    echo "==> applying $patch"
    git apply --verbose "$patch"
done

echo "all buun-llama-cpp patches applied successfully"
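
The script changes into the source directory before calling `git apply`, so a patches directory given as a relative path would no longer resolve at that point. The following is a minimal sketch of one possible hardening, assuming callers might pass relative paths (the real build may always pass absolute ones). The helper names `abs_dir` and `check_patches` are hypothetical and not part of the original script; `check_patches` uses `git apply --check` to verify that patches would apply without modifying the checkout.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not in the original script): print the absolute path
# of an existing directory so it stays valid after a later `cd`.
abs_dir() {
    (cd "$1" && pwd)
}

# Hypothetical helper: succeed only if every *.patch file in directory $2
# would apply cleanly to the checkout in directory $1. Nothing is modified,
# because `git apply --check` only tests applicability.
check_patches() {
    local src_dir patches_dir patch
    src_dir=$(abs_dir "$1")
    patches_dir=$(abs_dir "$2")

    # Same nullglob pattern as the script: an empty directory yields an
    # empty array instead of a literal "*.patch" entry.
    shopt -s nullglob
    local patches=("$patches_dir"/*.patch)
    shopt -u nullglob

    if [[ ${#patches[@]} -eq 0 ]]; then
        return 0
    fi

    for patch in "${patches[@]}"; do
        git -C "$src_dir" apply --check "$patch" || return 1
    done
}
```

A dry run before the real apply step could then look like `check_patches ./llama.cpp ./patches && echo "patches apply cleanly"`, where both paths are placeholders for the actual checkout and patches directory.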