mirror of https://github.com/mudler/LocalAI.git
synced 2026-05-23 16:20:01 -04:00
* chore: ignore local .worktrees directory

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(openai): stream usage non-zero when tools are enabled

The streaming chat-completions worker for tool-bearing requests (processTools in core/http/endpoints/openai/chat.go) never forwarded the cumulative TokenUsage from ComputeChoices to the chunks it placed on the responses channel. The outer streaming loop's running usage tracker therefore stayed at the zero value, and the include_usage trailer reported {prompt_tokens:0, completion_tokens:0, total_tokens:0} whenever the request carried a `tools` array. Without tools, the alternative `process` path stamps Usage on every chunk, so that path was unaffected.

Forward the final TokenUsage via a usage-only sentinel chunk (empty Choices, populated Usage) emitted right before close(responses). The outer loop's per-chunk Usage capture moves above the empty-Choices skip so the sentinel updates the tracker without ever reaching the wire, keeping the existing OpenAI spec contract (intermediate chunks carry no `usage` field, and the deferred-final-chunk helpers remain Usage-free per the regression test for issue #8546).

Adds streamUsageFromTokenUsage, usageSentinelChunk, and applyChunkToUsage helpers with focused Ginkgo coverage plus a flow-level test that mirrors the outer-loop sequence.

Fixes #9927

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:opus-4-7 [Claude Code]

* refactor(openai): return final TokenUsage from stream workers

Replace the usage-only sentinel SSE chunk introduced in the previous commit with a plain return value. The streaming workers process and processTools (now extracted as package-level processStream and processStreamWithTools) return (backend.TokenUsage, error); the outer ChatEndpoint loop reads the cumulative counts off the existing `ended` channel (now carrying streamWorkerResult{usage, err}) and builds the include_usage trailer from a normal Go value after the LOOP exits.

This drops the empty-Choices "skip but capture Usage" rule from the outer loop and removes the usageSentinelChunk / applyChunkToUsage helpers entirely. The SSE responses channel is back to a single purpose: wire chunks only.

processStream and processStreamWithTools move into chat_stream_workers.go so they can be exercised directly from tests. The chat_stream_usage_test.go suite now drives the workers with a mocked backend.ModelInferenceFunc and asserts on the returned TokenUsage. The regression coverage for issue #9927 is therefore behavioral: reverting the fix (discarding ComputeChoices' usage return) makes the assertions fail with concrete count mismatches.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:opus-4-7 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
83 lines
1.3 KiB
Plaintext
# go-llama build artifacts
/sources/
__pycache__/
*.a
*.o
get-sources
prepare-sources
/backend/cpp/llama-cpp/grpc-server
/backend/cpp/llama-cpp/llama.cpp
/backend/cpp/llama-*
!backend/cpp/llama-cpp
/backends
/backend-images
/result.yaml
protoc

*.log

go-ggml-transformers
go-gpt2
whisper.cpp
/bloomz
go-bert

# LocalAI build binary
LocalAI
/local-ai
/local-ai-launcher
# prevent above rules from omitting the helm chart
!charts/*
# prevent above rules from omitting the api/localai folder
!api/localai
!core/**/localai

# Ignore models
models/*
test-models/
test-dir/
tests/e2e-aio/backends
mock-backend

release/

# just in case
.DS_Store
.idea

# Generated during build
backend-assets/*
!backend-assets/.keep
prepare
/ggml-metal.metal
docs/static/gallery.html

# Protobuf generated files
*.pb.go
*pb2.py
*pb2_grpc.py

# SonarQube
.scannerwork

# backend virtual environments
**/venv

# per-developer customization files for the development container
.devcontainer/customization/*

# React UI build artifacts (keep placeholder dist/index.html)
core/http/react-ui/node_modules/
core/http/react-ui/dist

# Extracted backend binaries for container-based testing
local-backends/

# UI E2E test artifacts
tests/e2e-ui/ui-test-server
core/http/react-ui/playwright-report/
core/http/react-ui/test-results/

# Local worktrees
.worktrees/