Mirror of https://github.com/ollama/ollama.git, synced 2026-06-02 13:24:53 -04:00
* llama-server followups

Misc fixes for #16031
- Add back dropped ROCm build flag for multi-GPU support on windows
- Fix amdhip64_*.dll version detection for "latest" selection
- Fix embeddings API for consistent normalize behavior with prior versions

* ci: set up for automated llama.cpp update testing

* reduce batch for fa-disabled, and constrained vram

* mlx: fix v3 load bug on m5

Imagegen was incorrectly loading v3 first. This DRYs out the loading
code so imagegen gets the same new v4/v3 selection logic.

* fix reload bug on embedding models

* bump version

* steer user how to enable iGPU when disabled