Also test for regressions in the HTTP GET API-key-exempted endpoints, because
this list can get out of sync with the UI routes.
Also fix support for proxying under a different prefix, on both the server
and the client side.
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(realtime): WebRTC support
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(tracing): Show full LLM opts and deltas
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Otherwise, when using collections with PostgreSQL, we create a deadlock, as
we need embeddings to be up.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: include model name in mmproj file path to prevent model isolation issues
This fix addresses issue #8937 where different models with mmproj files
having the same filename (e.g., mmproj-F32.gguf) would overwrite each other.
By including the model name in the path (llama-cpp/mmproj/<model-name>/<filename>),
each model's mmproj files are now stored in separate directories, preventing
the collision that caused conversations to fail when switching between models.
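For illustration, a minimal sketch of the per-model layout (the helper name
and the example model are made up; only the
llama-cpp/mmproj/<model-name>/<filename> structure comes from this change):

    package main

    import (
        "fmt"
        "path/filepath"
    )

    // buildMmprojPath is a hypothetical helper showing the new layout: two
    // models that both ship a file named mmproj-F32.gguf now land in
    // separate per-model directories instead of overwriting each other.
    func buildMmprojPath(base, modelName, fileName string) string {
        return filepath.Join(base, "llama-cpp", "mmproj", modelName, fileName)
    }

    func main() {
        fmt.Println(buildMmprojPath("/models", "example-vl-model", "mmproj-F32.gguf"))
        // /models/llama-cpp/mmproj/example-vl-model/mmproj-F32.gguf
    }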
Fixes #8937
Signed-off-by: LocalAI Bot <localai-bot@example.com>
* test: update test expectations for model name in mmproj path
The test file had hardcoded expectations for the old mmproj path format.
Updated the test expectations to include the model name subdirectory
to match the new path structure introduced in the fix.
Fixes CI failures on tests-apple and tests-linux
* fix: add model name to model path for consistency with mmproj path
This change makes the model path consistent with the mmproj path by
including the model name subdirectory in both paths:
- mmproj: llama-cpp/mmproj/<model-name>/<filename>
- model: llama-cpp/models/<model-name>/<filename>
This addresses the reviewer's feedback that the model config generation
needs to correctly reference the mmproj file path.
Fixes the issue where the model path didn't include the model name
subdirectory while the mmproj path did.
Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
---------
Signed-off-by: LocalAI Bot <localai-bot@example.com>
Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Co-authored-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
* fix: add missing bufio.Flush in processImageFile
The processImageFile function writes decoded image data (from base64
or URL download) through a bufio.NewWriter but never calls Flush()
before closing the underlying file. Since bufio's default buffer is
4096 bytes, small images produce 0-byte files and large images are
truncated — causing PIL to fail with "cannot identify image file".
This breaks all image input paths: file, files, and ref_images
parameters in /v1/images/generations, making img2img, inpainting,
and reference image features non-functional.
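A minimal sketch of the corrected write path, with a simplified signature
(the real processImageFile also handles base64 decoding and URL downloads):

    package main

    import (
        "bufio"
        "os"
    )

    // writeImage buffers the decoded image bytes and, crucially, flushes the
    // buffer before the file is closed. Without the Flush, anything still
    // sitting in bufio's 4096-byte buffer is silently dropped.
    func writeImage(path string, data []byte) error {
        f, err := os.Create(path)
        if err != nil {
            return err
        }
        defer f.Close()

        w := bufio.NewWriter(f)
        if _, err := w.Write(data); err != nil {
            return err
        }
        return w.Flush() // the previously missing call
    }

    func main() {
        if err := writeImage("/tmp/example.bin", []byte("demo bytes")); err != nil {
            panic(err)
        }
    }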
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* fix: merge options into kwargs in diffusers GenerateImage
The GenerateImage method builds a local `options` dict containing the
source image (PIL), negative_prompt, and num_inference_steps, but
never merges it into `kwargs` before calling self.pipe(**kwargs).
This causes img2img to fail with "Input is in incorrect format"
because the pipeline never receives the image parameter.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: add unit test for processImageFile base64 decoding
Verifies that a base64-encoded PNG survives the write path
(encode → decode → bufio.Write → Flush → file on disk) with
byte-for-byte fidelity. The test image is small enough to fit
entirely in bufio's 4096-byte buffer, which is the exact scenario
where the missing Flush() produced a 0-byte file.
Also tests that invalid base64 input is handled gracefully.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: verify GenerateImage merges options into pipeline kwargs
Mocks the diffusers pipeline and calls GenerateImage with a source
image and negative prompt. Asserts that the pipeline receives the
image, negative_prompt, and num_inference_steps via kwargs — the
exact parameters that were silently dropped before the fix.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* fix: move kwargs.update(options) earlier in GenerateImage
Move the options merge right after self.options merge (L742) so that
image, negative_prompt, and num_inference_steps are available to all
downstream code paths including img2vid and txt2vid.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: convert processImageFile tests to ginkgo
Replace standard testing with ginkgo/gomega to be consistent with
the rest of the test suites in the project.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
---------
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
feat: standardize CLI flag naming to kebab-case with backwards compatibility
- Rename --p2ptoken to --p2p-token for consistency
- Add deprecation alias for old --p2ptoken flag
- Fix broken name tag in config check command
- Add runtime deprecation warning system (core/cli/deprecations.go)
- Document kebab-case naming convention in code comments
- Maintain full backwards compatibility via kong aliases
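A toy sketch of the aliasing approach, assuming kong's aliases struct tag for
flags (field names and help text are illustrative, not the real run command):

    package main

    import (
        "fmt"

        "github.com/alecthomas/kong"
    )

    // CLI models the rename: the canonical flag is the kebab-case
    // --p2p-token, while the old --p2ptoken spelling keeps working as an
    // alias so existing scripts do not break.
    type CLI struct {
        P2PToken string `name:"p2p-token" aliases:"p2ptoken" help:"Token for P2P mode"`
    }

    func main() {
        var cli CLI
        kong.Parse(&cli)
        fmt.Printf("p2p token: %q\n", cli.P2PToken)
    }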
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
feat: redesign explorer and models pages with react-ui theme
- Updated logo and branding to match LocalAI's current design
- Applied react-ui color scheme and CSS variables throughout
- Added grid/list view toggle for models page
- Implemented enhanced filter chips with active state highlighting
- Added sort options and improved pagination
- Redesigned explorer page cards and token display
- Modernized navbar styling with sticky positioning
- Improved modal design with inline actions
- Ensured mobile-responsive design maintained
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
* feat(mlx-distributed): add new MLX-distributed backend
Add new MLX distributed backend with support for both TCP and RDMA for
model sharding.
This implementation ties in the discovery implementation already in
place, and re-uses the same P2P mechanism for the TCP MLX-distributed
inferencing.
The auto-parallel implementation is inspired by Exo's (the Exo authors have
been added to the acknowledgements for their great work!)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose a CLI to facilitate backend starting
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: make manual rank0 configurable via model configs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing features from mlx backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
feat: add --data-path CLI flag for persistent data separation
- Add LOCALAI_DATA_PATH environment variable and --data-path CLI flag
- Default data path: /data (separate from configuration directory)
- Automatic migration on startup: moves agent_tasks.json, agent_jobs.json, collections/, and assets/ from old config dir to new data path
- Backward compatible: preserves old behavior if LOCALAI_DATA_PATH is not set
- Agent state and job directories now use DataPath with proper fallback chain
- Update documentation with new flag and docker-compose example
This separates mutable persistent data (collectiondb, agents, assets, skills) from configuration files, enabling better volume mounting and data persistence in containerized deployments.
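An illustrative sketch of the startup migration (not the exact LocalAI code;
a real implementation would also need a copy-and-remove fallback when the two
paths live on different filesystems):

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
    )

    // migrateDataPath moves known mutable state from the old configuration
    // directory into the new data path, skipping anything that is missing
    // or already migrated.
    func migrateDataPath(configDir, dataDir string) error {
        if err := os.MkdirAll(dataDir, 0o755); err != nil {
            return err
        }
        for _, name := range []string{"agent_tasks.json", "agent_jobs.json", "collections", "assets"} {
            src := filepath.Join(configDir, name)
            dst := filepath.Join(dataDir, name)
            if _, err := os.Stat(src); err != nil {
                continue // nothing to migrate
            }
            if _, err := os.Stat(dst); err == nil {
                continue // already migrated
            }
            if err := os.Rename(src, dst); err != nil {
                return err
            }
        }
        return nil
    }

    func main() {
        // Paths here are only an example of the old-config-dir / new-data-path pair.
        if err := migrateDataPath("/etc/localai", "/data"); err != nil {
            fmt.Println("migration failed:", err)
        }
    }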
Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
feat: add tabs to System view for Models and Backends
- Split System view into two tabs: Models and Backends
- Use URL search params and localStorage for tab state persistence
- Optimize API calls to only fetch data for active tab
- Add tab counts in labels showing number of items
- Use existing tab CSS patterns from the codebase
- Maintain all existing functionality with improved UX
Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
* feat(functions): add peg-based parsing
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: support returning toolcalls directly from backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: run PEG parsing only if the backend didn't send deltas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
- Add 'Events' column header between 'Status' and 'Actions'
- Fetch observable counts for each agent using /api/agents/<name>/observables
- Display events count as clickable link navigating to agent status page
- Events count updates every 5 seconds with agent refresh interval
- Shows '0' if API call fails for an agent
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
feat: add shell completion support for bash, zsh, and fish
- Add core/cli/completion.go with dynamic completion script generation
- Add core/cli/completion_test.go with unit tests
- Modify cmd/local-ai/main.go to support completion command
- Modify core/cli/cli.go to add Completion subcommand
- Add docs/content/features/shell-completion.md with installation instructions
The completion scripts are generated dynamically from the Kong CLI model,
so they automatically include all commands, subcommands, and flags.
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
* feat: add standalone and agentic functionalities
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose agents via responses api
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: Add LOCALAI_DISABLE_MCP environment variable to disable MCP support
- Added DisableMCP field to RunCMD struct in core/cli/run.go
- Added LOCALAI_DISABLE_MCP environment variable support
- Added DisableMCP field to ApplicationConfig struct
- Added DisableMCP AppOption function
- Updated MCP endpoint routing to check appConfig.DisableMCP
- When LOCALAI_DISABLE_MCP is set to true/1/yes, MCP endpoints are not registered
When set, all MCP functionality is disabled and appropriate error messages
are returned to users.
Use Cases:
- Security-conscious deployments where MCP is not needed
- Reducing attack surface
- Compliance requirements that prohibit certain protocol support
Environment variable: LOCALAI_DISABLE_MCP=true
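A small sketch of the gate (illustrative only; the real check goes through
the CLI flag, ApplicationConfig, and the endpoint routing):

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    // mcpDisabled reports whether LOCALAI_DISABLE_MCP is set to a truthy
    // value; when it is, the MCP routes are simply never registered.
    func mcpDisabled() bool {
        switch strings.ToLower(os.Getenv("LOCALAI_DISABLE_MCP")) {
        case "true", "1", "yes":
            return true
        }
        return false
    }

    func main() {
        if mcpDisabled() {
            fmt.Println("MCP support disabled, skipping MCP endpoint registration")
            return
        }
        fmt.Println("registering MCP endpoints")
    }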
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
* docs: Add documentation for LOCALAI_DISABLE_MCP environment variable
- Add section explaining how to disable MCP support using environment variable
- Document use cases for disabling MCP
- Provide examples for CLI and Docker usage
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
---------
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
* fix: Add timeout-based wait for model deletion completion
- Replace simple polling loop with context-based timeout (5 minutes)
- Use select statement for cleaner timeout handling
- Added proper logging for timeout case
- This addresses the code review comment about using a context with timeout instead of the dangerous polling approach (see the sketch below)
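The pattern in question, as a self-contained sketch (deleted stands in for
whatever check the real deletion code performs):

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    // waitForDeletion polls on a ticker but bounds the wait with a context
    // timeout instead of looping forever.
    func waitForDeletion(deleted func() bool) error {
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
        defer cancel()

        ticker := time.NewTicker(time.Second)
        defer ticker.Stop()

        for {
            select {
            case <-ctx.Done():
                return fmt.Errorf("timed out waiting for model deletion: %w", ctx.Err())
            case <-ticker.C:
                if deleted() {
                    return nil
                }
            }
        }
    }

    func main() {
        start := time.Now()
        err := waitForDeletion(func() bool { return time.Since(start) > 2*time.Second })
        fmt.Println("done:", err)
    }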
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* fix: replace goto statements with break in model deletion loop (fixes CI compilation error)
Signed-off-by: LocalAI [bot] <localai-bot@noreply.github.com>
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: LocalAI [bot] <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: LocalAI [bot] <localai-bot@noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat: Rename 'Whisper' model type to 'STT' in UI
- Updated models.html: Changed 'Whisper' filter button to 'STT'
- Updated talk.html: Changed 'Whisper Model' to 'STT Model'
- Updated backends.html: Changed 'Whisper' to 'STT'
- Updated talk.js: Renamed getWhisperModel() to getSTTModel(),
sendAudioToWhisper() to sendAudioToSTT(), and whisperModelSelect to sttModelSelect
This change makes the UI more consistent with the model category naming,
where all speech-to-text models (including Whisper, Parakeet, Moonshine,
WhisperX, etc.) are grouped under the 'STT' (Speech-to-Text) category.
Fixes #8776
Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
* Rename whisperModelSelect to sttModelSelect in talk.html
As requested by maintainer mudler in PR review, replacing all
whisperModelSelect occurrences with sttModelSelect since the
model type was renamed from Whisper to STT.
Signed-off-by: LocalAI [bot] <localai-bot@users.noreply.github.com>
---------
Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Signed-off-by: LocalAI [bot] <localai-bot@users.noreply.github.com>
Co-authored-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Co-authored-by: LocalAI [bot] <localai-bot@users.noreply.github.com>
fix: Add vllm-omni backend to video generation model detection
- Include vllm-omni in the list of backends that support FLAG_VIDEO
- This allows models like vllm-omni-wan2.2-t2v to appear in the video model selector UI
- Fixes issue #8659 where video generation models using vllm-omni backend were not showing in the dropdown
Co-authored-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
fix: return full embedding dimensions instead of truncating trailing zeros
- Remove the logic that strips trailing zeros from embeddings
- Trailing zeros may be valid values in some embedding models
- This fixes the issue where embeddings like jina-v3 returned
only 1/4 of their native dimensions (256 instead of 1024)
- The truncation was causing vector database dimension mismatch errors
- Fixes issue #8721
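The removed behaviour can be reproduced with a small sketch (dimensions are
illustrative; the point is that legitimately zero trailing values shrink the
vector):

    package main

    import "fmt"

    // trimTrailingZeros reproduces the removed (buggy) logic: cutting an
    // embedding off at its last non-zero element. A model whose tail
    // dimensions happen to be zero gets silently shrunk, which breaks
    // fixed-dimension vector stores.
    func trimTrailingZeros(v []float32) []float32 {
        end := len(v)
        for end > 0 && v[end-1] == 0 {
            end--
        }
        return v[:end]
    }

    func main() {
        emb := make([]float32, 8)
        emb[0], emb[1] = 0.3, -0.1 // the remaining dimensions are exactly 0
        fmt.Println(len(trimTrailingZeros(emb)), "!=", len(emb)) // 2 != 8
    }

The fix is simply to return the vector at its full native length.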
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
When a model is configured with 'known_usecases: [rerank]' in the YAML
config, the reranking endpoint was not being matched because:
1. The GuessUsecases function only checked for backend == 'rerankers'
2. The syncKnownUsecasesFromString() was not being called when loading
configs via yaml.Unmarshal in readModelConfigsFromFile
This fix:
1. Updates GuessUsecases to also check if Reranking is explicitly set to
true in the model config (in addition to checking backend type)
2. Adds syncKnownUsecasesFromString() calls after yaml.Unmarshal in
readModelConfigsFromFile to ensure known_usecases are properly parsed
Fixes #8658
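A reduced sketch of the matching rule after the fix (struct and function
names are illustrative, not the real config types):

    package main

    import "fmt"

    // modelConfig stands in for the relevant bits of the YAML model config.
    type modelConfig struct {
        Backend   string
        Reranking bool
    }

    // supportsRerank treats a model as a reranker either when it uses the
    // rerankers backend or when Reranking is explicitly enabled in its
    // config, matching the two checks described above.
    func supportsRerank(c modelConfig) bool {
        return c.Backend == "rerankers" || c.Reranking
    }

    func main() {
        fmt.Println(supportsRerank(modelConfig{Backend: "llama-cpp", Reranking: true})) // true
    }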
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
fix: Implement responsive line wrapping for model names on home page
- Changed model name display from truncate to break-words
- Increased max-width from 100px to 200px to allow more text
- This fixes issue #8209 for responsive text wrapping on smaller screens
Fixes: #8209
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
* debug
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* retry instead of re-computing a response
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Add model storage size display and RAM warning in Models tab
- Backend (ui_api.go):
- Added getDirectorySize() helper function to calculate total size of model files
- Added storageSize, ramTotal, ramUsed, ramUsagePercent to /api/models endpoint response
- Uses xsysinfo.GetSystemRAMInfo() for RAM information
- Frontend (models.html):
- Added storageSize, ramTotal, ramUsed, ramUsagePercent to Alpine.js data object
- Added formatBytes() helper for human-readable byte formatting
- Display storage size in hero header with blue indicator
- Show warning banner when storage exceeds RAM (model too large for system)
Addresses: https://github.com/mudler/LocalAI/issues/6251
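A sketch of the directory-size helper (simplified relative to whatever the
real ui_api.go implementation does with errors and symlinks):

    package main

    import (
        "fmt"
        "io/fs"
        "path/filepath"
    )

    // getDirectorySize walks a model directory and sums the size of every
    // regular file it contains.
    func getDirectorySize(dir string) (int64, error) {
        var total int64
        err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
            if err != nil {
                return err
            }
            if d.IsDir() {
                return nil
            }
            info, err := d.Info()
            if err != nil {
                return err
            }
            total += info.Size()
            return nil
        })
        return total, err
    }

    func main() {
        size, err := getDirectorySize(".")
        fmt.Println(size, err)
    }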
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
fix(video): initialize model selection dropdown with current model value
The Alpine.js link variable was starting empty, causing the dropdown
selection to not reflect the currently selected model. This fix initializes
the link variable with the current model value from the template (e.g.,
video/{{.Model}}), following the same pattern used in image.html.
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
When a backend download fails (e.g., on Mac OS with port conflicts causing
connection issues), the backend directory is left with partial files.
This causes subsequent installation attempts to fail with 'run file not
found' because the sanity check runs on an empty/partial directory.
This fix cleans up the backend directory when the initial download fails
before attempting fallback URIs or mirrors. This ensures a clean state
for retry attempts.
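A rough sketch of the retry-friendly flow (download stands in for the real
fetch logic, and error wrapping is simplified):

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
    )

    // installBackend removes the partially written backend directory when
    // the initial download fails, so fallback URIs and mirrors start from a
    // clean state.
    func installBackend(backendDir string, download func(dir string) error) error {
        if err := download(backendDir); err != nil {
            if rmErr := os.RemoveAll(backendDir); rmErr != nil {
                return fmt.Errorf("download failed (%v) and cleanup failed: %w", err, rmErr)
            }
            return fmt.Errorf("download failed, directory cleaned for retry: %w", err)
        }
        return nil
    }

    func main() {
        dir := filepath.Join(os.TempDir(), "example-backend")
        err := installBackend(dir, func(string) error { return fmt.Errorf("connection reset") })
        fmt.Println(err)
    }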
Fixes: #8016
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
* fix(gallery): add fallback URI resolution for backend installation
When a backend installation fails (e.g., due to missing 'latest-' tag),
try fallback URIs in order:
1. Replace 'latest-' with 'master-' in the URI
2. If that fails, append '-development' to the backend name
This fixes the issue where backend index entries don't match the
repository tags. For example, installing 'ace-step' tries to download
'latest-gpu-nvidia-cuda-13-ace-step' but only 'master-gpu-nvidia-cuda-13-ace-step'
exists in the quay.io registry.
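The retry order can be sketched as below (the exact URI and tag handling in
the gallery code may differ; this only mirrors the two fallbacks listed
above):

    package main

    import (
        "fmt"
        "strings"
    )

    // fallbackURIs returns the candidates to try after the original URI
    // fails: first swap the "latest-" prefix for "master-", then append
    // "-development" to the backend name.
    func fallbackURIs(uri, backendName string) []string {
        var out []string
        if strings.Contains(uri, "latest-") {
            out = append(out, strings.Replace(uri, "latest-", "master-", 1))
        }
        out = append(out, strings.Replace(uri, backendName, backendName+"-development", 1))
        return out
    }

    func main() {
        for _, u := range fallbackURIs("latest-gpu-nvidia-cuda-13-ace-step", "ace-step") {
            fmt.Println(u)
        }
    }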
Fixes: #8437
Signed-off-by: localai-bot <139863280+localai-bot@users.noreply.github.com>
* chore(gallery): make fallback URI patterns configurable via env vars
---------
Signed-off-by: localai-bot <139863280+localai-bot@users.noreply.github.com>
Closes #8119
When installing models from the gallery, files are created with 0600
permissions (owner read/write only), making them unreadable by the
LocalAI server when running as a different user.
This fix changes the permissions to 0644 (owner read/write, group/others
read), allowing the server to read model files regardless of the user
it runs as.
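For reference, the difference boils down to the mode used when creating the
downloaded file, for example (path is illustrative):

    package main

    import (
        "log"
        "os"
        "path/filepath"
    )

    func main() {
        path := filepath.Join(os.TempDir(), "example-model.gguf")
        // 0644 instead of 0600: owner read/write, group and others read,
        // so a server running as a different user can still open the file.
        f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY, 0o644)
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()
    }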
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
fix: reload model configuration after editing (issue #8647)
- Add *model.ModelLoader parameter to EditModelEndpoint
- Call ml.ShutdownModel() after saving config to unload the running model
- Model will be reloaded on next inference request with new settings (e.g., context_size)
- Update route registration to pass ml to EditModelEndpoint
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>