mirror of
https://github.com/mudler/LocalAI.git
synced 2026-04-30 20:15:35 -04:00
fix(vision): propagate mtmd media marker from backend via ModelMetadata (#9412)
Upstream llama.cpp (PR #21962) switched the server-side mtmd media marker to a random per-server string and removed the legacy "<__media__>" backward-compat replacement in mtmd_tokenizer. The Go layer still emitted the hardcoded "<__media__>", so on the non-tokenizer-template path the prompt arrived with a marker mtmd did not recognize and tokenization failed with "number of bitmaps (1) does not match number of markers (0)". Report the active media marker via ModelMetadataResponse.media_marker and substitute the sentinel "<__media__>" with it right before the gRPC call, after the backend has been loaded and probed. Also skip the Go-side multimodal templating entirely when UseTokenizerTemplate is true — llama.cpp's oaicompat_chat_params_parse already injects its own marker and StringContent is unused in that path. Backends that do not expose the field keep the legacy "<__media__>" behavior.
This commit is contained in:
committed by
GitHub
parent
ad742738cb
commit
7809c5f5d0
@@ -2814,6 +2814,13 @@ public:
|
||||
return grpc::Status(grpc::StatusCode::FAILED_PRECONDITION, "Model not loaded");
|
||||
}
|
||||
|
||||
// Report the active multimodal media marker so the Go layer can emit the
|
||||
// same string when rendering prompts outside the tokenizer-template path.
|
||||
// Only meaningful when an mtmd context was initialized (vision/audio models).
|
||||
if (ctx_server.impl->mctx != nullptr) {
|
||||
response->set_media_marker(get_media_marker());
|
||||
}
|
||||
|
||||
// Check if chat templates are initialized
|
||||
if (ctx_server.impl->chat_params.tmpls == nullptr) {
|
||||
// If templates are not initialized, we can't detect thinking support
|
||||
|
||||
Reference in New Issue
Block a user