The llama.cpp backend already accepts a free-form options: array in the
model config that maps to common_params fields, but a coverage audit
against upstream pin 7f3f843c flagged 12 user-visible knobs that were
neither set via the typed proto fields nor reachable via options:.
Wire them up under the existing if/else chain in params_parse, before
the speculative section. Each new option follows the file's prevailing
patterns (try/catch around numeric parses, the same true/1/yes/on bool
form used elsewhere, hardware_concurrency() fallback for thread counts,
mirror of draft_override_tensor for override_tensor).
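The parsing patterns named above can be sketched roughly as follows. This
is a minimal illustration, not the code in params_parse: the helper names
(parse_bool_option, parse_int_option, parse_thread_count) are hypothetical,
and treating a non-positive thread count as "use the fallback" is an
assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <exception>
#include <string>
#include <thread>

// Hypothetical helper: accepts the true/1/yes/on forms, case-insensitively.
static bool parse_bool_option(std::string value) {
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value == "true" || value == "1" || value == "yes" || value == "on";
}

// Hypothetical helper: numeric parse wrapped in try/catch; an unparsable
// value leaves the caller's fallback in place instead of throwing.
static int parse_int_option(const std::string & value, int fallback) {
    try {
        return std::stoi(value);
    } catch (const std::exception &) {
        return fallback;
    }
}

// Hypothetical helper: thread count with a hardware_concurrency() fallback.
// Assumption: values <= 0 (or unparsable) mean "pick a default".
static int parse_thread_count(const std::string & value) {
    int n = parse_int_option(value, 0);
    if (n <= 0) {
        const unsigned hw = std::thread::hardware_concurrency();
        n = hw > 0 ? static_cast<int>(hw) : 1;  // hardware_concurrency() may return 0
    }
    return n;
}
```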
Top-level / batching / IO:
- n_ubatch (alias ubatch) -- physical batch size; was previously
force-aliased to n_batch at line 482, blocking embedding/rerank
workloads that need independent control
- threads_batch (alias n_threads_batch) -- main-model batch threads;
mirrors the existing draft_threads_batch
- direct_io (alias use_direct_io) -- O_DIRECT model loads
- verbosity -- llama.cpp log threshold (line 479 had this commented
out)
- override_tensor (alias tensor_buft_overrides) -- per-tensor buffer
overrides for the main model; mirrors draft_override_tensor
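As a rough sketch of how an override_tensor value might be split, assuming
it uses the comma-separated pattern=buffer-type form accepted by llama.cpp's
--override-tensor flag. The function name split_tensor_overrides is
hypothetical, and resolving the buffer-type name to a real backend buffer
type is left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Splits "pattern=buft,pattern=buft" into (pattern, buffer-type-name) pairs.
// Entries without '=' or with an empty side are skipped (an assumption; the
// real parser may reject them instead).
static std::vector<std::pair<std::string, std::string>>
split_tensor_overrides(const std::string & spec) {
    std::vector<std::pair<std::string, std::string>> out;
    std::size_t start = 0;
    while (start <= spec.size()) {
        std::size_t end = spec.find(',', start);
        if (end == std::string::npos) {
            end = spec.size();
        }
        const std::string item = spec.substr(start, end - start);
        const std::size_t eq = item.find('=');
        if (eq != std::string::npos && eq > 0 && eq + 1 < item.size()) {
            out.emplace_back(item.substr(0, eq), item.substr(eq + 1));
        }
        start = end + 1;
    }
    return out;
}
```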
Embedding / multimodal:
- pooling_type (alias pooling) -- mean/cls/last/rank/none; previously
only auto-flipped to RANK for rerankers
- embd_normalize (alias embedding_normalize) -- the embedding handler
now reads params_base.embd_normalize instead of the literal 2 that
was previously hardcoded in Embedding()
- mmproj_use_gpu (alias mmproj_offload) -- mmproj on CPU vs GPU
- image_min_tokens / image_max_tokens -- per-image vision token budget
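The pooling_type strings listed above could be mapped along these lines.
This is a sketch only: PoolingKind is a hypothetical stand-in for the
upstream pooling enum, and returning an empty optional for unknown strings
is an assumption about how bad input is handled.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the upstream pooling enum.
enum class PoolingKind { None, Mean, Cls, Last, Rank };

// Maps the mean/cls/last/rank/none strings to the enum; unknown strings
// yield std::nullopt so the caller can keep its existing default.
static std::optional<PoolingKind> parse_pooling(const std::string & value) {
    if (value == "none") return PoolingKind::None;
    if (value == "mean") return PoolingKind::Mean;
    if (value == "cls")  return PoolingKind::Cls;
    if (value == "last") return PoolingKind::Last;
    if (value == "rank") return PoolingKind::Rank;
    return std::nullopt;
}
```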
Reasoning surface (the audit-focus three; LocalAI's existing
ReasoningConfig.DisableReasoning only feeds the per-request
chat_template_kwargs.enable_thinking and does not touch any of these):
- reasoning_format -- none/auto/deepseek/deepseek-legacy parser
- enable_reasoning (alias reasoning_budget) -- -1/0/>0 thinking budget
- prefill_assistant -- trailing-assistant-message prefill toggle
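The -1/0/>0 ranges of the thinking budget can be read as three modes. The
sketch below is illustrative only: ThinkingMode and
classify_reasoning_budget are hypothetical names, and treating every
negative value like -1 is an assumption.

```cpp
// Hypothetical labels for the three value ranges of the thinking budget.
enum class ThinkingMode { Unrestricted, Disabled, Limited };

static ThinkingMode classify_reasoning_budget(int budget) {
    if (budget < 0) {
        return ThinkingMode::Unrestricted;  // -1: no cap on thinking
    }
    if (budget == 0) {
        return ThinkingMode::Disabled;      // 0: thinking turned off
    }
    return ThinkingMode::Limited;           // >0: positive budget
}
```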
All 12 referenced fields exist on both the upstream pin and the
turboquant fork's common.h, so no LOCALAI_LEGACY_LLAMA_CPP_SPEC guard
is needed.
Docs: extend model-configuration.md with new "Reasoning Models",
"Multimodal Backend Options", "Embedding & Reranking Backend Options",
and "Other Backend Tuning Options" subsections; also refresh the
Speculative Type Values table to show the new dash-separated canonical
names alongside the underscore aliases LocalAI still accepts.
Assisted-by: claude-code:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
LocalAI website
LocalAI documentation website
Requirement
In this project, the Docsy theme component is pulled in as a Hugo module, together with other module dependencies:
$ hugo mod graph
hugo: collected modules in 566 ms
hugo: collected modules in 578 ms
github.com/google/docsy-example github.com/google/docsy@v0.5.1-0.20221017155306-99eacb09ffb0
github.com/google/docsy-example github.com/google/docsy/dependencies@v0.5.1-0.20221014161617-be5da07ecff1
github.com/google/docsy/dependencies@v0.5.1-0.20221014161617-be5da07ecff1 github.com/twbs/bootstrap@v4.6.2+incompatible
github.com/google/docsy/dependencies@v0.5.1-0.20221014161617-be5da07ecff1 github.com/FortAwesome/Font-Awesome@v0.0.0-20220831210243-d3a7818c253f
If you want to make SCSS edits and publish them, you need to install PostCSS:
npm install
Running the website locally
Building and running the site locally requires a recent extended version of Hugo.
You can find out more about how to install Hugo for your environment in our
Getting started guide.
Once you've made your working copy of the site repo, from the repo root folder, run:
hugo server
Running a container locally
You can run docsy-example inside a Docker container; the container runs
with a volume bound to the docsy-example folder. This approach doesn't
require you to install any dependencies other than Docker Desktop on
Windows and Mac, and Docker Compose on Linux.
- Build the docker image:
  docker-compose build
- Run the built image:
  docker-compose up
  NOTE: You can run both commands at once with docker-compose up --build.
- Verify that the service is working.
  Open your web browser and type http://localhost:1313 in your navigation bar. This opens a local instance of the docsy-example homepage. You can now make changes to the docsy example and those changes will immediately show up in your browser after you save.
Cleanup
To stop Docker Compose, on your terminal window, press Ctrl + C.
To remove the produced images run:
docker-compose rm
For more information see the Docker Compose documentation.
Troubleshooting
As you run the website locally, you may run into the following error:
➜ hugo server
INFO 2021/01/21 21:07:55 Using config file:
Building sites … INFO 2021/01/21 21:07:55 syncing static files to /
Built in 288 ms
Error: Error building site: TOCSS: failed to transform "scss/main.scss" (text/x-scss): resource "scss/scss/main.scss_9fadf33d895a46083cdd64396b57ef68" not found in file cache
This error occurs if you have not installed the extended version of Hugo. See this section of the user guide for instructions on how to install Hugo.
Or you may encounter the following error:
➜ hugo server
Error: failed to download modules: binary with name "go" not found
This error occurs if you have not installed the Go programming language on your system.
See this section of the user guide for instructions on how to install Go.