* feat(realtime): WebRTC support
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(tracing): Show full LLM opts and deltas
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat(mlx-distributed): add new MLX-distributed backend
Add new MLX distributed backend with support for both TCP and RDMA for
model sharding.
This implementation ties in the discovery implementation already in
place, and re-uses the same P2P mechanism for the TCP MLX-distributed
inferencing.
The Auto-parallel implementation is inspired by Exo's
ones (who have been added to acknowledgement for the great work!)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose a CLI to facilitate backend starting
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: make manual rank0 configurable via model configs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing features from mlx backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat: add standalone and agentic functionalities
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose agents via responses api
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: Add LOCALAI_DISABLE_MCP environment variable to disable MCP support
- Added DisableMCP field to RunCMD struct in core/cli/run.go
- Added LOCALAI_DISABLE_MCP environment variable support
- Added DisableMCP field to ApplicationConfig struct
- Added DisableMCP AppOption function
- Updated MCP endpoint routing to check appConfig.DisableMCP
- When LOCALAI_DISABLE_MCP is set to true/1/yes, MCP endpoints are not registered
When set, all MCP functionality is disabled and appropriate error messages
are returned to users.
Use Cases:
- Security-conscious deployments where MCP is not needed
- Reducing attack surface
- Compliance requirements that prohibit certain protocol support
Environment variable: LOCALAI_DISABLE_MCP=true
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
* docs: Add documentation for LOCALAI_DISABLE_MCP environment variable
- Add section explaining how to disable MCP support using environment variable
- Document use cases for disabling MCP
- Provide examples for CLI and Docker usage
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
---------
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
* fix: Add timeout-based wait for model deletion completion
- Replace simple polling loop with context-based timeout (5 minutes)
- Use select statement for cleaner timeout handling
- Added proper logging for timeout case
- This addresses the code review comment about using context with timeout instead of dangerous polling approach
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* fix: replace goto statements with break in model deletion loop (fixes CI compilation error)
Signed-off-by: LocalAI [bot] <localai-bot@noreply.github.com>
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: LocalAI [bot] <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: LocalAI [bot] <localai-bot@noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Add model storage size display and RAM warning in Models tab
- Backend (ui_api.go):
- Added getDirectorySize() helper function to calculate total size of model files
- Added storageSize, ramTotal, ramUsed, ramUsagePercent to /api/models endpoint response
- Uses xsysinfo.GetSystemRAMInfo() for RAM information
- Frontend (models.html):
- Added storageSize, ramTotal, ramUsed, ramUsagePercent to Alpine.js data object
- Added formatBytes() helper for human-readable byte formatting
- Display storage size in hero header with blue indicator
- Show warning banner when storage exceeds RAM (model too large for system)
Addresses: https://github.com/mudler/LocalAI/issues/6251
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
fix: reload model configuration after editing (issue #8647)
- Add *model.ModelLoader parameter to EditModelEndpoint
- Call ml.ShutdownModel() after saving config to unload the running model
- Model will be reloaded on next inference request with new settings (e.g., context_size)
- Update route registration to pass ml to EditModelEndpoint
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
* feat(musicgen): add ace-step and UI interface
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Correctly handle model dir
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop auto-download
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add to models, fixup UIs icons
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* l4t13 is incompatbile
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* avoid pinning version for cuda12
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop l4t12
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop openai video endpoint (is not complete)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add download button
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: resolve duplicate MCP route registration causing 50% failure rate
Fixes#7772
The issue was caused by duplicate registration of the MCP endpoint
/mcp/v1/chat/completions in both openai.go and localai.go, leading
to a race condition where requests would randomly hit different
handlers with incompatible behaviors.
Changes:
- Removed duplicate MCP route registration from openai.go
- Kept the localai.MCPStreamEndpoint as the canonical handler
- Added all three MCP route patterns for backward compatibility:
* /v1/mcp/chat/completions
* /mcp/v1/chat/completions
* /mcp/chat/completions
- Added comments to clarify route ownership and prevent future conflicts
- Fixed formatting in ui_api.go
The localai.MCPStreamEndpoint handler is more feature-complete as it
supports both streaming and non-streaming modes, while the removed
openai.MCPCompletionEndpoint only supported synchronous requests.
This eliminates the ~50% failure rate where the cogito library would
receive "Invalid http method" errors when internal HTTP requests were
routed to the wrong handler.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: majiayu000 <1835304752@qq.com>
* Address feedback from review
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: majiayu000 <1835304752@qq.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
* feat: allow to install backends from URL in the WebUI and API
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* trace backends installations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): improve table view and let items to be sorted
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: add tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: use constants
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This removes any ambiguity from how paths are handled, and at the same
time it uniforms the ui paths with the other paths that don't have a
trailing slash
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(agent): agent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Multiple webhooks, simplify
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Do not use cron with seconds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Create separate pages for details
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Detect if no models have MCP configuration, show wizard
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make services test to run
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): add watchdog settings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Do not re-read env
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Some refactor, move other settings to runtime (p2p)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add API Keys handling
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Allow to disable runtime settings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Documentation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* show MCP toggle in index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop context default
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move management to separate section
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make index to redirect to chat
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use logo in index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* work out the wizard in the front-page
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(mcp): add LocalAI endpoint to stream live results of the agent
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wip
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Refactoring
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* MCP UX integration
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Enhance UX
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Support also non-SSE
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): allow to cancel ops
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Improve progress text
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Cancel queued ops, don't show up message cancellation always
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: fixup displaying of total progress over multiple files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: initial hook to install elements directly
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP: ui changes
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move HF api client to pkg
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add simple importer for gguf files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add opcache
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wire importers to CLI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add omitempty to config fields
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add MLX importer
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small refactors to star to use HF for discovery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Common preferences
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add support to bare HF repos
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(importer/llama.cpp): add support for mmproj files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add mmproj quants to common preferences
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix vlm usage in tokenizer mode with llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): show stats in chat, improve style
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Markdown, small improvements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Display token/sec into stats
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Minor enhancement
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Revert "Fixups"
This reverts commit ab1b3d6da9.
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(ui): use Alpine.js and drop HTMX
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Display pending ops
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Show in progress ops
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* more stable sorting
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* minor fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix clipboard copy
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Cleanup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP - add endpoint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Rename
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Wire the Completion API
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to make it functional
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Almost functional
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Bump golang versions used in tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add description of the tool
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make it working
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small optimizations
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Cleanup/refactor
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>