The IK_LLAMA_VERSION bump to f96eaddba8bed6a9a5e628bbf6a566775c70b49c pulls in
upstream commit "Prune examples/llava", which deletes examples/llava (clip.* /
llava.*). The ik-llama backend's grpc-server.cpp built a local `myclip` library
from those files and called the removed clip/llava C API, so the bump no longer
builds.

ik_llama keeps its multimodal stack in the surviving `mtmd` library
(examples/mtmd/, public headers mtmd.h + mtmd-helper.h). This ports the backend's
multimodal path onto the high-level mtmd_* / mtmd_helper_* API in place, leaving
the text path (which still uses ik_llama's retained old common API) untouched:

- Makefile: bump IK_LLAMA_VERSION to f96eaddb.
- prepare.sh: drop the clip/llava source copy + sed block; mtmd is a library
  target, so no source copy is needed.
- CMakeLists.txt: remove the `myclip` target; link `mtmd` and add its include
  dir; build grpc-server as C++17 (mtmd headers require it).
- patches: drop 0002 (targeted the deleted examples/llava/clip.cpp; the mtmd
  clip.cpp never calls ggml_quantize_chunk, so the fix is unneeded). Keep 0001
  (verified still applies).
- grpc-server.cpp / utils.hpp: replace clip_model_load +
  clip_image_load_from_bytes + llava_image_embed_make_with_clip_img + the
  manual [img-N] prefix splitting and per-image llava_embd_batch decode loop
  with mtmd_init_from_file (moved after the model load, which it requires),
  mtmd_helper_bitmap_init_from_buf, mtmd_tokenize and mtmd_helper_eval_chunks.
  Legacy [img-N] tags are translated, in order, into mtmd media markers
  (mtmd_default_marker()); the post-image suffix text stays on the normal token
  path so the sampling loop is unchanged.
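
For illustration only, the [img-N] to media-marker translation described above
could look roughly like the sketch below. It is not the code from this change:
`translate_img_tags` and `translated_prompt` are hypothetical names, and the
marker is passed in as a plain string (in the backend it would come from
mtmd_default_marker()) so the sketch has no mtmd dependency. It assumes a tag is
exactly "[img-" followed by decimal digits and "]", and that malformed tags are
left in the prompt as-is.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: the rewritten prompt plus the tag ids in the
// order they appeared, so the caller can line up one bitmap per marker.
struct translated_prompt {
    std::string      text; // prompt with each well-formed [img-N] replaced
    std::vector<int> ids;  // N of each replaced tag, in order of appearance
};

// Hypothetical helper: replace every well-formed "[img-N]" with `marker`.
static translated_prompt translate_img_tags(const std::string & prompt,
                                            const std::string & marker) {
    static const std::string open = "[img-";
    translated_prompt out;
    std::size_t pos = 0;
    while (pos < prompt.size()) {
        const std::size_t start = prompt.find(open, pos);
        if (start == std::string::npos) {
            out.text.append(prompt, pos, std::string::npos); // no more tags
            break;
        }
        const std::size_t digits_begin = start + open.size();
        std::size_t i = digits_begin;
        while (i < prompt.size() &&
               std::isdigit(static_cast<unsigned char>(prompt[i]))) {
            ++i;
        }
        const bool is_tag =
            i > digits_begin && i < prompt.size() && prompt[i] == ']';
        if (!is_tag) {
            // Malformed tag: copy through the '[' and keep scanning after it.
            out.text.append(prompt, pos, start + 1 - pos);
            pos = start + 1;
            continue;
        }
        out.text.append(prompt, pos, start - pos); // text before the tag
        out.text += marker;
        out.ids.push_back(
            std::stoi(prompt.substr(digits_begin, i - digits_begin)));
        pos = i + 1; // continue after the closing ']'
    }
    return out;
}
```

Under these assumptions, the rewritten text would be what gets handed to
mtmd_tokenize, with the recorded ids used to order the bitmaps so that the
first marker pairs with the first referenced image.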
Supersedes #10534.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]