Ettore Di Giacinto 4cd90bfae9 paged: drop bf16-tau (patch 0026), subsumed by decode fusions (tau=100000 flat, zero speed benefit)
The opt-in hybrid per-head bf16 SSM-state lever (ssm_bf16_tau, patch 0026) is
removed from the llama-cpp-localai-paged patch series. Clean re-measurement after
the decode fusions (0028 recurrent-state gather-fusion + 0029 block-table cache)
landed shows it buys nothing: forcing ALL gated-DeltaNet heads to bf16
(tau=100000, the most aggressive setting) gives flat decode throughput, 780.6 vs
780.0 t/s. The mode engages but adds zero speed because it is subsumed by the
fusions. The earlier "+12%" was measured before the fusions completed. bf16-tau
was a precision trade (not bit-exact, ~91% same-top-p) plus extra bug surface and
extra CUDA template-instantiation compile cost with no offsetting benefit.

Dependency check: no later patch (0028/0029/0030) depends on 0026. 0030's only
mention is a description comment; its code keys off fused_gdn_ar/ch/auto_fgdn,
which originate in 0018/0019/0021 (before 0026). The remaining series (0001-0025,
0028-0030) applies clean with git apply --check against the pin
0ed235ea2c17a19fc8238668653946721ed136fd. The Makefile applies the series by glob
(patches/paged/0*.patch); the resulting gap at 0026 is tolerated (0005/0027 are
already absent).

Removed:
- patches/paged/0026-qwen35-hybrid-perhead-ssm-state.patch
- the dead ssm_bf16_tau / ssm_hybrid_tau option handler in the shared
  grpc-server.cpp (it only set LLAMA_SSM_BF16_TAU, now a no-op the library no
  longer reads)
- the patched+bf16-tau benchmark columns and llama-patched-bf16tau rows
  (README + final_benchmark.csv), the ssm_bf16_tau option text in backend
  index.yaml, the gallery NOTE block, and the docs/features/backends.md mention.

The rejected-lever lesson is kept (why it was dropped: subsumed, tau=100000 flat)
in the backend README section 5, the paged-backend agent guide, and the
vLLM-parity methodology, so it is not re-tried.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-28 16:06:06 +00:00

LocalAI website

LocalAI documentation website

Requirements

In this project, the Docsy theme component is pulled in as a Hugo module, together with other module dependencies:

$ hugo mod graph
hugo: collected modules in 566 ms
hugo: collected modules in 578 ms
github.com/google/docsy-example github.com/google/docsy@v0.5.1-0.20221017155306-99eacb09ffb0
github.com/google/docsy-example github.com/google/docsy/dependencies@v0.5.1-0.20221014161617-be5da07ecff1
github.com/google/docsy/dependencies@v0.5.1-0.20221014161617-be5da07ecff1 github.com/twbs/bootstrap@v4.6.2+incompatible
github.com/google/docsy/dependencies@v0.5.1-0.20221014161617-be5da07ecff1 github.com/FortAwesome/Font-Awesome@v0.0.0-20220831210243-d3a7818c253f

If you want to edit the SCSS and publish your changes, you need to install PostCSS:

npm install

Running the website locally

Building and running the site locally requires a recent extended version of Hugo. You can find out more about how to install Hugo for your environment in our Getting started guide.

Once you've made your working copy of the site repo, from the repo root folder, run:

hugo server

Running a container locally

You can run docsy-example inside a Docker container. The container runs with a volume bound to the docsy-example folder, so this approach doesn't require you to install any dependencies other than Docker Desktop on Windows and Mac, and Docker Compose on Linux.

  1. Build the docker image

    docker-compose build
    
  2. Run the built image

    docker-compose up
    

    NOTE: You can run both commands at once with docker-compose up --build.

  3. Verify that the service is working.

    Open your web browser and enter http://localhost:1313 in the address bar. This opens a local instance of the docsy-example homepage. You can now make changes to the docsy-example site, and those changes show up in your browser as soon as you save.
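The browser check in step 3 can also be scripted. The following is a minimal sketch, assuming curl is installed; wait_for_site is a hypothetical helper name, and the retry count and delay are illustrative defaults rather than values taken from this project.

```shell
# Sketch: poll a URL until it answers successfully or the attempts run out.
# wait_for_site is a hypothetical helper, not part of Hugo or Docker.
# Usage: wait_for_site URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_site() {
  url=$1
  attempts=${2:-10}   # assumed default: 10 tries
  delay=${3:-1}       # assumed default: 1 second between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: quiet, -o /dev/null: discard the body
    if curl -fs --max-time 2 -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (run after `docker-compose up` has started):
#   wait_for_site http://localhost:1313 && echo "site is up"
```

The function returns 0 as soon as one request succeeds and 1 if every attempt fails, so it can gate later steps in a script.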

Cleanup

To stop Docker Compose, press Ctrl + C in your terminal window.

To remove the stopped service containers, run:

docker-compose rm

For more information see the Docker Compose documentation.

Troubleshooting

As you run the website locally, you may run into the following error:

➜ hugo server

INFO 2021/01/21 21:07:55 Using config file: 
Building sites … INFO 2021/01/21 21:07:55 syncing static files to /
Built in 288 ms
Error: Error building site: TOCSS: failed to transform "scss/main.scss" (text/x-scss): resource "scss/scss/main.scss_9fadf33d895a46083cdd64396b57ef68" not found in file cache

This error occurs if you have installed the standard version of Hugo instead of the extended version. See this section of the user guide for instructions on how to install Hugo.
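To confirm which build is installed, you can inspect the output of hugo version, which includes the word "extended" for extended builds. A minimal sketch follows; is_extended_hugo is a hypothetical helper name, and the check relies only on that substring.

```shell
# Sketch: decide whether a `hugo version` string describes the extended build.
# is_extended_hugo is a hypothetical helper, not part of Hugo.
is_extended_hugo() {
  case "$1" in
    *extended*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the hugo binary on PATH, if there is one.
if command -v hugo >/dev/null 2>&1; then
  if is_extended_hugo "$(hugo version)"; then
    echo "hugo: extended build found"
  else
    echo "hugo: standard build found; install the extended build"
  fi
else
  echo "hugo: not found on PATH"
fi
```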

Or you may encounter the following error:

➜ hugo server

Error: failed to download modules: binary with name "go" not found

This error occurs if you have not installed the Go programming language on your system. Hugo needs the go binary on your PATH to download modules. See this section of the user guide for instructions on how to install Go.
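Both errors above come down to a missing or unsuitable tool, so it can help to check the prerequisites before running hugo server. The sketch below assumes the tools named in this README (go, hugo, and npm for PostCSS); require_tool is a hypothetical helper name.

```shell
# Sketch: report whether each required tool is available on PATH.
# require_tool is a hypothetical helper, not part of any of these tools.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found on PATH"
    return 1
  fi
}

missing=0
for tool in go hugo npm; do
  require_tool "$tool" || missing=1
done

if [ "$missing" -eq 1 ]; then
  echo "Install the missing tools before running: hugo server"
fi
```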