Mirror of https://github.com/mudler/LocalAI.git (synced 2026-05-29 19:19:19 -04:00)
Some Gemma 3 GGUF files distributed via the Ollama registry omit the
`gemma3.attention.layer_norm_rms_epsilon` metadata key. Both llama.cpp
and ik_llama.cpp treat that key as required and abort the load with:

    error loading model hyperparameters:
    key not found in model: gemma3.attention.layer_norm_rms_epsilon

In the same situation, Ollama's loader silently falls back to ~1e-6,
the canonical Gemma 3 default (per google/gemma_pytorch config.py and
the Hugging Face Gemma3Config), and the model loads correctly.
Add small build-time patches to both backends that pre-seed
`hparams.f_norm_rms_eps` with 1e-6 and mark the metadata lookup as
optional. GGUFs that already carry the key continue to use the embedded
value unchanged.
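
The pre-seed-then-optional-lookup pattern can be sketched roughly as
follows. This is a self-contained illustration, not the actual patch:
`Metadata`, `HParams`, `get_key`, and `load_gemma3_hparams` are
hypothetical stand-ins for the backends' loader types, and only the
key name and the 1e-6 default are taken from the description above.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for GGUF metadata: key name -> float value.
using Metadata = std::map<std::string, float>;

// Hypothetical stand-in for the loader's hyperparameter struct.
struct HParams {
    float f_norm_rms_eps = 0.0f;
};

// Looks up `key` and writes to `out` only when the key is present.
// If the key is missing: throws when `required` is true, otherwise
// leaves `out` untouched and returns false.
bool get_key(const Metadata & meta, const std::string & key,
             float & out, bool required = true) {
    auto it = meta.find(key);
    if (it == meta.end()) {
        if (required) {
            throw std::runtime_error("key not found in model: " + key);
        }
        return false;
    }
    out = it->second;
    return true;
}

HParams load_gemma3_hparams(const Metadata & meta) {
    HParams hp;
    // Pre-seed with the Gemma 3 default so a missing key is harmless.
    hp.f_norm_rms_eps = 1e-6f;
    // Optional lookup: an embedded value, if any, overrides the default.
    get_key(meta, "gemma3.attention.layer_norm_rms_epsilon",
            hp.f_norm_rms_eps, /*required=*/false);
    return hp;
}
```

Because the lookup writes only on success, seeding the field first is
enough to give files without the key the default while leaving files
that carry the key unaffected.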
Closes #9414