Commit Graph

3 Commits

Author SHA1 Message Date
Ettore Di Giacinto
c67b511ac0 fix(ik-llama): port multimodal path to mtmd API and bump to f96eaddb (#10534)
The IK_LLAMA_VERSION bump to f96eaddba8bed6a9a5e628bbf6a566775c70b49c pulls in
upstream commit "Prune examples/llava", which deletes examples/llava (clip.* /
llava.*). The ik-llama backend's grpc-server.cpp built a local `myclip` library
from those files and called the removed clip/llava C API, so the bump no longer
builds.

ik_llama keeps its multimodal stack in the surviving `mtmd` library
(examples/mtmd/, public headers mtmd.h + mtmd-helper.h). This ports the backend's
multimodal path onto the high-level mtmd_* / mtmd_helper_* API in place, leaving
the text path (which still uses ik_llama's retained old common API) untouched:

- Makefile: bump IK_LLAMA_VERSION to f96eaddb.
- prepare.sh: drop the clip/llava source copy + sed block; mtmd is a library
  target, no source copy needed.
- CMakeLists.txt: remove the `myclip` target; link `mtmd` and add its include
  dir; build grpc-server as C++17 (mtmd headers require it).
- patches: drop 0002 (targeted the deleted examples/llava/clip.cpp; the mtmd
  clip.cpp never calls ggml_quantize_chunk, so the fix is unneeded). Keep 0001
  (verified still applies).
- grpc-server.cpp / utils.hpp: replace clip_model_load + clip_image_load_from_bytes
  + llava_image_embed_make_with_clip_img + the manual [img-N] prefix splitting and
  per-image llava_embd_batch decode loop with mtmd_init_from_file (moved after the
  model load, which it requires), mtmd_helper_bitmap_init_from_buf, mtmd_tokenize
  and mtmd_helper_eval_chunks. Legacy [img-N] tags are translated, in order, into
  mtmd media markers (mtmd_default_marker()); the post-image suffix text stays on
  the normal token path so the sampling loop is unchanged.

Supersedes #10534.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
2026-06-27 22:57:30 +00:00
Ettore Di Giacinto
c0920f3273 fix(ik-llama-cpp): patch clip.cpp for new ggml_quantize_chunk signature (#9531)
Bumps ik_llama.cpp pin to 16996aeab7. Upstream 286ce32...16996ae adds a
trailing `const struct quantize_user_data *` parameter to
`ggml_quantize_chunk` (PR ikawrakow/ik_llama.cpp#1677) but leaves
`examples/llava/clip.cpp` unchanged because their build has moved to
`examples/mtmd/`. LocalAI's prepare.sh still copies from
`examples/llava/`, so the dead 7-arg call reaches the grpc-server
compile and fails. Patch the call site to pass `nullptr` for the new
param.

Assisted-by: Claude:Opus-4.7 [Read] [Edit] [Bash]
2026-04-24 13:07:26 +02:00
Ettore Di Giacinto
9ca03cf9cc feat(backends): add ik-llama-cpp (#9326)
* feat(backends): add ik-llama-cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add grpc e2e suite, hook to CI, update README

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-04-12 13:51:28 +02:00