From ed648b3b4e1be1258e0ae8ba69235619412cad4a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 23 Apr 2026 14:59:39 +0200 Subject: [PATCH] fix(llama-cpp): include server-chat.cpp in grpc-server translation unit (#9511) * fix(llama-cpp): include server-chat.cpp in grpc-server translation unit Upstream llama.cpp refactor (ggml-org/llama.cpp#20690) moved the OAI/Anthropic/Responses and transcription conversion helpers out of server-common.cpp into a new server-chat.cpp, and server-task.cpp and server-context.cpp now call those symbols (convert_transcriptions_to_chatcmpl, server_chat_convert_responses_to_chatcmpl, server_chat_convert_anthropic_to_oai, server_chat_msg_diff_to_json_oaicompat) via server-chat.h. grpc-server.cpp builds as a single translation unit by #include-ing the upstream .cpp files directly. Without including server-chat.cpp, the declarations are satisfied at compile time via server-chat.h but the link step fails with undefined references once LLAMA_VERSION crosses the refactor commit (134d6e54). Guard the include with __has_include so the same source stays buildable on older LLAMA_VERSION pins that predate the refactor (where prepare.sh won't copy server-chat.cpp into tools/grpc-server/). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto * chore(llama-cpp): bump LLAMA_VERSION to 0d0764dfd Bump to ggml-org/llama.cpp@0d0764dfd257c0ae862525c05778207f87b99b1c. Paired with the preceding grpc-server server-chat.cpp include so the refactor at 134d6e54 links cleanly. Supersedes PR #9494. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- backend/cpp/llama-cpp/Makefile | 2 +- backend/cpp/llama-cpp/grpc-server.cpp | 8 ++++++++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/backend/cpp/llama-cpp/Makefile b/backend/cpp/llama-cpp/Makefile index 599ced868..021c65657 100644 --- a/backend/cpp/llama-cpp/Makefile +++ b/backend/cpp/llama-cpp/Makefile @@ -1,5 +1,5 @@ -LLAMA_VERSION?=5a4cd6741fc33227cdacb329f355ab21f8481de2 +LLAMA_VERSION?=0d0764dfd257c0ae862525c05778207f87b99b1c LLAMA_REPO?=https://github.com/ggerganov/llama.cpp CMAKE_ARGS?= diff --git a/backend/cpp/llama-cpp/grpc-server.cpp b/backend/cpp/llama-cpp/grpc-server.cpp index a0ef198e0..9869ef2cc 100644 --- a/backend/cpp/llama-cpp/grpc-server.cpp +++ b/backend/cpp/llama-cpp/grpc-server.cpp @@ -10,6 +10,14 @@ #include "server-task.cpp" #include "server-queue.cpp" #include "server-common.cpp" +// server-chat.cpp exists only in llama.cpp after the upstream refactor that +// split OAI/Anthropic/Responses/transcription conversion helpers out of +// server-common.cpp. When present, server-context.cpp and server-task.cpp +// above call into it, so we must pull its definitions into this TU or the +// link fails. __has_include keeps the source compatible with older pins. +#if __has_include("server-chat.cpp") +#include "server-chat.cpp" +#endif #include "server-context.cpp" // LocalAI
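
Note: the __has_include guard above is resolved at preprocessing time, so the same grpc-server.cpp keeps building against LLAMA_VERSION pins from both before and after the upstream refactor. A minimal standalone sketch of that pattern follows; the file name "optional-helpers.cpp" is illustrative only and is not one of the real upstream sources.

// single_tu_demo.cpp -- illustrative sketch, not part of the patch.
// Mirrors the guard added to grpc-server.cpp: a single-translation-unit
// build that conditionally pulls in a .cpp file which exists only in
// newer upstream checkouts. Requires C++17 for __has_include.
#include <cstdio>

#if __has_include("optional-helpers.cpp")
// Newer checkout: the helper source was copied next to this file, so its
// definitions are compiled into this translation unit and later code in
// the same TU links against them.
#include "optional-helpers.cpp"
#define HAVE_OPTIONAL_HELPERS 1
#else
// Older checkout: the file was never copied in; skip it and stay buildable.
#define HAVE_OPTIONAL_HELPERS 0
#endif

int main() {
    // Reports which branch the preprocessor took for this build.
    std::printf("optional helpers compiled in: %d\n", HAVE_OPTIONAL_HELPERS);
    return 0;
}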