mirror of
https://github.com/ollama/ollama.git
synced 2026-02-23 02:40:16 -05:00
This change adds a new MLX based runner which includes: * Method-based MLX bindings * Subprocess-based MLX runner (x/mlxrunner) * KV cache with tree management * A basic sampler The GLM4-MoE-Lite model has been ported to use the new bindings. --------- Co-authored-by: Michael Yang <git@mxy.ng>
2.3 KiB
2.3 KiB