Compare commits

...

17 Commits

Author SHA1 Message Date
LocalAI [bot]
db6ba4ef07 chore: remove install.sh script and documentation references (#8643)
* chore: remove install.sh script and documentation references

- Delete docs/static/install.sh (broken installer causing issues)
- Remove One-Line Installer section from linux.md documentation
- Remove install.sh references from installation/_index.en.md
- Remove install.sh warning and commands from README.md

Closes #8032

* fix: add missing closing braces to notice shortcode
2026-02-24 08:36:25 +01:00
dependabot[bot]
d19dcac863 chore(deps): bump github.com/mudler/cogito from 0.9.1-0.20260217143801-bb7f986ed2c7 to 0.9.1 (#8632)
chore(deps): bump github.com/mudler/cogito

Bumps [github.com/mudler/cogito](https://github.com/mudler/cogito) from 0.9.1-0.20260217143801-bb7f986ed2c7 to 0.9.1.
- [Release notes](https://github.com/mudler/cogito/releases)
- [Commits](https://github.com/mudler/cogito/commits/v0.9.1)

---
updated-dependencies:
- dependency-name: github.com/mudler/cogito
  dependency-version: 0.9.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-24 01:29:47 +00:00
dependabot[bot]
fd42675bec chore(deps): bump goreleaser/goreleaser-action from 6 to 7 (#8634)
Bumps [goreleaser/goreleaser-action](https://github.com/goreleaser/goreleaser-action) from 6 to 7.
- [Release notes](https://github.com/goreleaser/goreleaser-action/releases)
- [Commits](https://github.com/goreleaser/goreleaser-action/compare/v6...v7)

---
updated-dependencies:
- dependency-name: goreleaser/goreleaser-action
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-23 23:27:49 +01:00
dependabot[bot]
3391538806 chore(deps): bump actions/stale from 10.1.1 to 10.2.0 (#8633)
Bumps [actions/stale](https://github.com/actions/stale) from 10.1.1 to 10.2.0.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](997185467f...b5d41d4e1d)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-version: 10.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-23 23:27:20 +01:00
dependabot[bot]
c4f879c4ea chore(deps): bump github.com/gpustack/gguf-parser-go from 0.23.1 to 0.24.0 (#8631)
chore(deps): bump github.com/gpustack/gguf-parser-go

Bumps [github.com/gpustack/gguf-parser-go](https://github.com/gpustack/gguf-parser-go) from 0.23.1 to 0.24.0.
- [Release notes](https://github.com/gpustack/gguf-parser-go/releases)
- [Commits](https://github.com/gpustack/gguf-parser-go/compare/v0.23.1...v0.24.0)

---
updated-dependencies:
- dependency-name: github.com/gpustack/gguf-parser-go
  dependency-version: 0.24.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-23 23:26:41 +01:00
dependabot[bot]
b7e0de54fe chore(deps): bump github.com/anthropics/anthropic-sdk-go from 1.22.0 to 1.26.0 (#8630)
chore(deps): bump github.com/anthropics/anthropic-sdk-go

Bumps [github.com/anthropics/anthropic-sdk-go](https://github.com/anthropics/anthropic-sdk-go) from 1.22.0 to 1.26.0.
- [Release notes](https://github.com/anthropics/anthropic-sdk-go/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-go/compare/v1.22.0...v1.26.0)

---
updated-dependencies:
- dependency-name: github.com/anthropics/anthropic-sdk-go
  dependency-version: 1.26.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-23 23:26:15 +01:00
dependabot[bot]
f0868acdf3 chore(deps): bump fyne.io/fyne/v2 from 2.7.2 to 2.7.3 (#8629)
Bumps [fyne.io/fyne/v2](https://github.com/fyne-io/fyne) from 2.7.2 to 2.7.3.
- [Release notes](https://github.com/fyne-io/fyne/releases)
- [Changelog](https://github.com/fyne-io/fyne/blob/v2.7.3/CHANGELOG.md)
- [Commits](https://github.com/fyne-io/fyne/compare/v2.7.2...v2.7.3)

---
updated-dependencies:
- dependency-name: fyne.io/fyne/v2
  dependency-version: 2.7.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-23 23:25:46 +01:00
LocalAI [bot]
9a5b5ee8a9 chore: ⬆️ Update ggml-org/llama.cpp to b68a83e641b3ebe6465970b34e99f3f0e0a0b21a (#8628)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-23 22:02:40 +00:00
Lukas Schaefer
ed0bfb8732 fix: rename json_verbose to verbose_json (#8627)
Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>
2026-02-23 17:57:06 +00:00
Richard Palethorpe
be84b1d258 feat(traces): Use accordian instead of pop-ups (#8626)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-23 13:07:41 +01:00
Andres
cbedcc9091 fix(api): Downgrade health/readiness check to debug (#8625)
Downgrade health/readiness check to debug

Signed-off-by: Andres Smith <andressmithdev@pm.me>
2026-02-23 11:58:04 +01:00
Andres
e45d63c86e fix(cli): Fix watchdog running constantly and spamming logs (#8624)
* Fix watchdog running constantly and spamming logs

Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Update docs

Signed-off-by: Andres Smith <andressmithdev@pm.me>

---------

Signed-off-by: Andres Smith <andressmithdev@pm.me>
2026-02-23 11:57:28 +01:00
LocalAI [bot]
f40c8dd0ce chore: ⬆️ Update ggml-org/llama.cpp to 2b6dfe824de8600c061ef91ce5cc5c307f97112c (#8622)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-23 09:30:58 +00:00
LocalAI [bot]
559ab99890 docs: update diffusers multi-GPU documentation to mention tensor_parallel_size configuration (#8621)
* docs: update diffusers multi-GPU documentation to mention tensor_parallel_size configuration

* chore: revert backend/python/diffusers/README.md to original content

---------

Co-authored-by: Your Name <you@example.com>
2026-02-22 18:17:23 +01:00
LocalAI [bot]
91f2dd5820 chore: ⬆️ Update ggml-org/llama.cpp to f75c4e8bf52ea480ece07fd3d9a292f1d7f04bc5 (#8619)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-22 13:20:08 +01:00
LocalAI [bot]
8250815763 docs: ⬆️ update docs version mudler/LocalAI (#8618)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-21 21:18:40 +00:00
Richard Palethorpe
b1b67b973e fix(realtime): Add functions to conversation history (#8616)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-21 19:03:49 +01:00
21 changed files with 266 additions and 1270 deletions

View File

@@ -18,7 +18,7 @@ jobs:
with: with:
go-version: 1.23 go-version: 1.23
- name: Run GoReleaser - name: Run GoReleaser
uses: goreleaser/goreleaser-action@v6 uses: goreleaser/goreleaser-action@v7
with: with:
version: v2.11.0 version: v2.11.0
args: release --clean args: release --clean

View File

@@ -11,7 +11,7 @@ jobs:
if: github.repository == 'mudler/LocalAI' if: github.repository == 'mudler/LocalAI'
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/stale@997185467fa4f803885201cee163a9f38240193d # v9 - uses: actions/stale@b5d41d4e1d5dceea10e7104786b73624c18a190f # v9
with: with:
stale-issue-message: 'This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.' stale-issue-message: 'This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.'
stale-pr-message: 'This PR is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 10 days.' stale-pr-message: 'This PR is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 10 days.'

1
CLAUDE.md Symbolic link
View File

@@ -0,0 +1 @@
AGENTS.md

View File

@@ -93,16 +93,7 @@ Liking LocalAI? LocalAI is part of an integrated suite of AI infrastructure tool
## 💻 Quickstart ## 💻 Quickstart
> ⚠️ **Note:** The `install.sh` script is currently experiencing issues due to the heavy changes currently undergoing in LocalAI and may produce broken or misconfigured installations. Please use Docker installation (see below) or manual binary installation until [issue #8032](https://github.com/mudler/LocalAI/issues/8032) is resolved.
Run the installer script:
```bash
# Basic installation
curl https://localai.io/install.sh | sh
```
For more installation options, see [Installer Options](https://localai.io/installation/).
### macOS Download: ### macOS Download:

View File

@@ -1,5 +1,5 @@
LLAMA_VERSION?=ba3b9c8844aca35ecb40d31886686326f22d2214 LLAMA_VERSION?=b68a83e641b3ebe6465970b34e99f3f0e0a0b21a
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
CMAKE_ARGS?= CMAKE_ARGS?=

View File

@@ -71,6 +71,7 @@ type RunCMD struct {
WatchdogIdleTimeout string `env:"LOCALAI_WATCHDOG_IDLE_TIMEOUT,WATCHDOG_IDLE_TIMEOUT" default:"15m" help:"Threshold beyond which an idle backend should be stopped" group:"backends"` WatchdogIdleTimeout string `env:"LOCALAI_WATCHDOG_IDLE_TIMEOUT,WATCHDOG_IDLE_TIMEOUT" default:"15m" help:"Threshold beyond which an idle backend should be stopped" group:"backends"`
EnableWatchdogBusy bool `env:"LOCALAI_WATCHDOG_BUSY,WATCHDOG_BUSY" default:"false" help:"Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout" group:"backends"` EnableWatchdogBusy bool `env:"LOCALAI_WATCHDOG_BUSY,WATCHDOG_BUSY" default:"false" help:"Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout" group:"backends"`
WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"` WatchdogBusyTimeout string `env:"LOCALAI_WATCHDOG_BUSY_TIMEOUT,WATCHDOG_BUSY_TIMEOUT" default:"5m" help:"Threshold beyond which a busy backend should be stopped" group:"backends"`
WatchdogInterval string `env:"LOCALAI_WATCHDOG_INTERVAL,WATCHDOG_INTERVAL" default:"500ms" help:"Interval between watchdog checks (e.g., 500ms, 5s, 1m) (default: 500ms)" group:"backends"`
EnableMemoryReclaimer bool `env:"LOCALAI_MEMORY_RECLAIMER,MEMORY_RECLAIMER,LOCALAI_GPU_RECLAIMER,GPU_RECLAIMER" default:"false" help:"Enable memory threshold monitoring to auto-evict backends when memory usage exceeds threshold (uses GPU VRAM if available, otherwise RAM)" group:"backends"` EnableMemoryReclaimer bool `env:"LOCALAI_MEMORY_RECLAIMER,MEMORY_RECLAIMER,LOCALAI_GPU_RECLAIMER,GPU_RECLAIMER" default:"false" help:"Enable memory threshold monitoring to auto-evict backends when memory usage exceeds threshold (uses GPU VRAM if available, otherwise RAM)" group:"backends"`
MemoryReclaimerThreshold float64 `env:"LOCALAI_MEMORY_RECLAIMER_THRESHOLD,MEMORY_RECLAIMER_THRESHOLD,LOCALAI_GPU_RECLAIMER_THRESHOLD,GPU_RECLAIMER_THRESHOLD" default:"0.95" help:"Memory usage threshold (0.0-1.0) that triggers backend eviction (default 0.95 = 95%%)" group:"backends"` MemoryReclaimerThreshold float64 `env:"LOCALAI_MEMORY_RECLAIMER_THRESHOLD,MEMORY_RECLAIMER_THRESHOLD,LOCALAI_GPU_RECLAIMER_THRESHOLD,GPU_RECLAIMER_THRESHOLD" default:"0.95" help:"Memory usage threshold (0.0-1.0) that triggers backend eviction (default 0.95 = 95%%)" group:"backends"`
ForceEvictionWhenBusy bool `env:"LOCALAI_FORCE_EVICTION_WHEN_BUSY,FORCE_EVICTION_WHEN_BUSY" default:"false" help:"Force eviction even when models have active API calls (default: false for safety)" group:"backends"` ForceEvictionWhenBusy bool `env:"LOCALAI_FORCE_EVICTION_WHEN_BUSY,FORCE_EVICTION_WHEN_BUSY" default:"false" help:"Force eviction even when models have active API calls (default: false for safety)" group:"backends"`
@@ -215,6 +216,13 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
} }
opts = append(opts, config.SetWatchDogBusyTimeout(dur)) opts = append(opts, config.SetWatchDogBusyTimeout(dur))
} }
if r.WatchdogInterval != "" {
dur, err := time.ParseDuration(r.WatchdogInterval)
if err != nil {
return err
}
opts = append(opts, config.SetWatchDogInterval(dur))
}
} }
// Handle memory reclaimer (uses GPU VRAM if available, otherwise RAM) // Handle memory reclaimer (uses GPU VRAM if available, otherwise RAM)

View File

@@ -31,8 +31,8 @@ type TranscriptCMD struct {
ModelsPath string `env:"LOCALAI_MODELS_PATH,MODELS_PATH" type:"path" default:"${basepath}/models" help:"Path containing models used for inferencing" group:"storage"` ModelsPath string `env:"LOCALAI_MODELS_PATH,MODELS_PATH" type:"path" default:"${basepath}/models" help:"Path containing models used for inferencing" group:"storage"`
BackendGalleries string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"` BackendGalleries string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"`
Prompt string `short:"p" help:"Previous transcribed text or words that hint at what the model should expect"` Prompt string `short:"p" help:"Previous transcribed text or words that hint at what the model should expect"`
ResponseFormat schema.TranscriptionResponseFormatType `short:"f" default:"" help:"Response format for Whisper models, can be one of (txt, lrc, srt, vtt, json, json_verbose)"` ResponseFormat schema.TranscriptionResponseFormatType `short:"f" default:"" help:"Response format for Whisper models, can be one of (txt, lrc, srt, vtt, json, verbose_json)"`
PrettyPrint bool `help:"Used with response_format json or json_verbose for pretty printing"` PrettyPrint bool `help:"Used with response_format json or verbose_json for pretty printing"`
} }
func (t *TranscriptCMD) Run(ctx *cliContext.Context) error { func (t *TranscriptCMD) Run(ctx *cliContext.Context) error {

View File

@@ -98,10 +98,11 @@ func NewApplicationConfig(o ...AppOption) *ApplicationConfig {
Context: context.Background(), Context: context.Background(),
UploadLimitMB: 15, UploadLimitMB: 15,
Debug: true, Debug: true,
AgentJobRetentionDays: 30, // Default: 30 days AgentJobRetentionDays: 30, // Default: 30 days
LRUEvictionMaxRetries: 30, // Default: 30 retries LRUEvictionMaxRetries: 30, // Default: 30 retries
LRUEvictionRetryInterval: 1 * time.Second, // Default: 1 second LRUEvictionRetryInterval: 1 * time.Second, // Default: 1 second
TracingMaxItems: 1024, WatchDogInterval: 500 * time.Millisecond, // Default: 500ms
TracingMaxItems: 1024,
PathWithoutAuth: []string{ PathWithoutAuth: []string{
"/static/", "/static/",
"/generated-audio/", "/generated-audio/",
@@ -208,6 +209,12 @@ func SetWatchDogIdleTimeout(t time.Duration) AppOption {
} }
} }
func SetWatchDogInterval(t time.Duration) AppOption {
return func(o *ApplicationConfig) {
o.WatchDogInterval = t
}
}
// EnableMemoryReclaimer enables memory threshold monitoring. // EnableMemoryReclaimer enables memory threshold monitoring.
// When enabled, the watchdog will evict backends if memory usage exceeds the threshold. // When enabled, the watchdog will evict backends if memory usage exceeds the threshold.
// Works with GPU VRAM if available, otherwise uses system RAM. // Works with GPU VRAM if available, otherwise uses system RAM.
@@ -642,7 +649,7 @@ func (o *ApplicationConfig) ToRuntimeSettings() RuntimeSettings {
AutoloadBackendGalleries: &autoloadBackendGalleries, AutoloadBackendGalleries: &autoloadBackendGalleries,
ApiKeys: &apiKeys, ApiKeys: &apiKeys,
AgentJobRetentionDays: &agentJobRetentionDays, AgentJobRetentionDays: &agentJobRetentionDays,
OpenResponsesStoreTTL: &openResponsesStoreTTL, OpenResponsesStoreTTL: &openResponsesStoreTTL,
} }
} }

View File

@@ -29,6 +29,8 @@ import (
//go:embed static/* //go:embed static/*
var embedDirStatic embed.FS var embedDirStatic embed.FS
var quietPaths = []string{"/api/operations", "/api/resources", "/healthz", "/readyz"}
// @title LocalAI API // @title LocalAI API
// @version 2.0.0 // @version 2.0.0
// @description The LocalAI Rest API. // @description The LocalAI Rest API.
@@ -109,10 +111,17 @@ func API(application *application.Application) (*echo.Echo, error) {
res := c.Response() res := c.Response()
err := next(c) err := next(c)
// Fix for #7989: Reduce log verbosity of Web UI polling and resources API // Fix for #7989: Reduce log verbosity of Web UI polling, resources API, and health checks
// If the path is /api/operations or /api/resources and the request was successful (200), // These paths are logged at DEBUG level (hidden by default) instead of INFO.
// we log it at DEBUG level (hidden by default) instead of INFO. isQuietPath := false
if (req.URL.Path == "/api/operations" || req.URL.Path == "/api/resources") && res.Status == 200 { for _, path := range quietPaths {
if req.URL.Path == path {
isQuietPath = true
break
}
}
if isQuietPath && res.Status == 200 {
xlog.Debug("HTTP request", "method", req.Method, "path", req.URL.Path, "status", res.Status) xlog.Debug("HTTP request", "method", req.Method, "path", req.URL.Path, "status", res.Status)
} else { } else {
xlog.Info("HTTP request", "method", req.Method, "path", req.URL.Path, "status", res.Status) xlog.Info("HTTP request", "method", req.Method, "path", req.URL.Path, "status", res.Status)

View File

@@ -1046,6 +1046,27 @@ func triggerResponse(session *Session, conv *Conversation, c *LockedWebsocket, o
Content: content.Text, Content: content.Text,
}) })
} }
} else if item.FunctionCall != nil {
conversationHistory = append(conversationHistory, schema.Message{
Role: string(types.MessageRoleAssistant),
ToolCalls: []schema.ToolCall{
{
ID: item.FunctionCall.CallID,
Type: "function",
FunctionCall: schema.FunctionCall{
Name: item.FunctionCall.Name,
Arguments: item.FunctionCall.Arguments,
},
},
},
})
} else if item.FunctionCallOutput != nil {
conversationHistory = append(conversationHistory, schema.Message{
Role: "tool",
Name: item.FunctionCallOutput.CallID,
Content: item.FunctionCallOutput.Output,
StringContent: item.FunctionCallOutput.Output,
})
} }
} }
conv.Lock.Unlock() conv.Lock.Unlock()

View File

@@ -135,26 +135,44 @@
<table class="w-full border-collapse"> <table class="w-full border-collapse">
<thead> <thead>
<tr class="border-b border-[var(--color-bg-secondary)]"> <tr class="border-b border-[var(--color-bg-secondary)]">
<th class="w-8 p-2"></th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Method</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Method</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Path</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Path</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Status</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Status</th>
<th class="text-right p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Actions</th>
</tr> </tr>
</thead> </thead>
<tbody> <template x-for="(trace, index) in traces" :key="index">
<template x-for="(trace, index) in traces" :key="index"> <tbody>
<tr class="hover:bg-[var(--color-bg-secondary)]/50 border-b border-[var(--color-bg-secondary)] transition-colors"> <tr @click="toggleTrace(index)"
class="cursor-pointer hover:bg-[var(--color-bg-secondary)]/50 border-b border-[var(--color-bg-secondary)] transition-colors">
<td class="p-2 w-8 text-center">
<i class="fas fa-chevron-right text-xs text-[var(--color-text-secondary)] transition-transform duration-200"
:class="expandedTraces[index] ? 'rotate-90' : ''"></i>
</td>
<td class="p-2" x-text="trace.request.method"></td> <td class="p-2" x-text="trace.request.method"></td>
<td class="p-2" x-text="trace.request.path"></td> <td class="p-2" x-text="trace.request.path"></td>
<td class="p-2" x-text="trace.response.status"></td> <td class="p-2" x-text="trace.response.status"></td>
<td class="p-2 text-right"> </tr>
<button @click="showDetails(index)" class="text-[var(--color-primary)]/60 hover:text-[var(--color-primary)] hover:bg-[var(--color-primary)]/10 rounded p-1 transition-colors"> <tr x-show="expandedTraces[index]">
<i class="fas fa-eye text-xs"></i> <td colspan="4" class="p-0">
</button> <div class="p-4 bg-[var(--color-bg-secondary)]/30 border-b border-[var(--color-bg-secondary)]">
<div class="grid grid-cols-1 lg:grid-cols-2 gap-4">
<div>
<h4 class="text-sm font-semibold text-[var(--color-text-primary)] mb-2">Request Body</h4>
<pre class="overflow-auto max-h-[70vh] p-3 rounded-lg bg-[var(--color-bg-primary)] border border-[var(--color-border-subtle)] text-xs font-mono text-[var(--color-text-secondary)] whitespace-pre-wrap break-words"
x-text="formatTraceBody(trace.request.body)"></pre>
</div>
<div>
<h4 class="text-sm font-semibold text-[var(--color-text-primary)] mb-2">Response Body</h4>
<pre class="overflow-auto max-h-[70vh] p-3 rounded-lg bg-[var(--color-bg-primary)] border border-[var(--color-border-subtle)] text-xs font-mono text-[var(--color-text-secondary)] whitespace-pre-wrap break-words"
x-text="formatTraceBody(trace.response.body)"></pre>
</div>
</div>
</div>
</td> </td>
</tr> </tr>
</template> </tbody>
</tbody> </template>
</table> </table>
<div x-show="traces.length === 0" class="text-center py-8 text-[var(--color-text-secondary)] text-sm"> <div x-show="traces.length === 0" class="text-center py-8 text-[var(--color-text-secondary)] text-sm">
No API traces recorded yet. No API traces recorded yet.
@@ -168,18 +186,23 @@
<table class="w-full border-collapse"> <table class="w-full border-collapse">
<thead> <thead>
<tr class="border-b border-[var(--color-bg-secondary)]"> <tr class="border-b border-[var(--color-bg-secondary)]">
<th class="w-8 p-2"></th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Type</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Type</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Timestamp</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Timestamp</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Model</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Model</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Summary</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Summary</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Duration</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Duration</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Status</th> <th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Status</th>
<th class="text-right p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Actions</th>
</tr> </tr>
</thead> </thead>
<tbody> <template x-for="(trace, index) in backendTraces" :key="index">
<template x-for="(trace, index) in backendTraces" :key="index"> <tbody>
<tr class="hover:bg-[var(--color-bg-secondary)]/50 border-b border-[var(--color-bg-secondary)] transition-colors"> <tr @click="toggleBackendTrace(index)"
class="cursor-pointer hover:bg-[var(--color-bg-secondary)]/50 border-b border-[var(--color-bg-secondary)] transition-colors">
<td class="p-2 w-8 text-center">
<i class="fas fa-chevron-right text-xs text-[var(--color-text-secondary)] transition-transform duration-200"
:class="expandedBackendTraces[index] ? 'rotate-90' : ''"></i>
</td>
<td class="p-2"> <td class="p-2">
<span class="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium" <span class="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium"
:class="getTypeClass(trace.type)" :class="getTypeClass(trace.type)"
@@ -197,14 +220,82 @@
<i class="fas fa-times-circle text-red-500 text-xs" :title="trace.error"></i> <i class="fas fa-times-circle text-red-500 text-xs" :title="trace.error"></i>
</template> </template>
</td> </td>
<td class="p-2 text-right"> </tr>
<button @click="showBackendDetails(index)" class="text-[var(--color-primary)]/60 hover:text-[var(--color-primary)] hover:bg-[var(--color-primary)]/10 rounded p-1 transition-colors"> <tr x-show="expandedBackendTraces[index]">
<i class="fas fa-eye text-xs"></i> <td colspan="7" class="p-0">
</button> <div class="p-4 bg-[var(--color-bg-secondary)]/30 border-b border-[var(--color-bg-secondary)]">
<!-- Header info -->
<div class="grid grid-cols-2 md:grid-cols-4 gap-3 mb-4">
<div class="bg-[var(--color-bg-primary)] rounded-lg p-3 border border-[var(--color-border-subtle)]">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Type</div>
<span class="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium"
:class="getTypeClass(trace.type)"
x-text="trace.type"></span>
</div>
<div class="bg-[var(--color-bg-primary)] rounded-lg p-3 border border-[var(--color-border-subtle)]">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Model</div>
<div class="text-sm font-medium" x-text="trace.model_name || '-'"></div>
</div>
<div class="bg-[var(--color-bg-primary)] rounded-lg p-3 border border-[var(--color-border-subtle)]">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Backend</div>
<div class="text-sm font-medium" x-text="trace.backend || '-'"></div>
</div>
<div class="bg-[var(--color-bg-primary)] rounded-lg p-3 border border-[var(--color-border-subtle)]">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Duration</div>
<div class="text-sm font-medium" x-text="formatDuration(trace.duration)"></div>
</div>
</div>
<!-- Error banner -->
<div x-show="trace.error" class="bg-red-500/10 border border-red-500/30 rounded-lg p-3 mb-4">
<div class="flex items-center gap-2">
<i class="fas fa-exclamation-triangle text-red-500 text-sm"></i>
<span class="text-sm text-red-400" x-text="trace.error"></span>
</div>
</div>
<!-- Data fields as nested accordions -->
<template x-if="trace.data && Object.keys(trace.data).length > 0">
<div>
<h4 class="text-sm font-semibold text-[var(--color-text-primary)] mb-2">Data Fields</h4>
<div class="border border-[var(--color-border-subtle)] rounded-lg overflow-hidden">
<template x-for="[key, value] in Object.entries(trace.data)" :key="key">
<div class="border-b border-[var(--color-border-subtle)] last:border-b-0">
<!-- Field header row -->
<div @click="isLargeValue(value) && toggleBackendField(index, key)"
class="flex items-center gap-2 px-3 py-2 hover:bg-[var(--color-bg-primary)]/50 transition-colors"
:class="isLargeValue(value) ? 'cursor-pointer' : ''">
<template x-if="isLargeValue(value)">
<i class="fas fa-chevron-right text-[10px] text-[var(--color-text-secondary)] transition-transform duration-200 w-3 flex-shrink-0"
:class="isBackendFieldExpanded(index, key) ? 'rotate-90' : ''"></i>
</template>
<template x-if="!isLargeValue(value)">
<span class="w-3 flex-shrink-0"></span>
</template>
<span class="text-sm font-mono text-[var(--color-primary)] flex-shrink-0" x-text="key"></span>
<template x-if="!isLargeValue(value)">
<span class="font-mono text-xs text-[var(--color-text-secondary)]" x-text="formatValue(value)"></span>
</template>
<template x-if="isLargeValue(value) && !isBackendFieldExpanded(index, key)">
<span class="text-xs text-[var(--color-text-secondary)] truncate" x-text="truncateValue(value, 120)"></span>
</template>
</div>
<!-- Expanded field value -->
<div x-show="isLargeValue(value) && isBackendFieldExpanded(index, key)"
class="px-3 pb-3">
<pre class="overflow-auto max-h-[70vh] p-3 rounded-lg bg-[var(--color-bg-primary)] border border-[var(--color-border-subtle)] text-xs font-mono text-[var(--color-text-secondary)] whitespace-pre-wrap break-words"
x-text="formatLargeValue(value)"></pre>
</div>
</div>
</template>
</div>
</div>
</template>
</div>
</td> </td>
</tr> </tr>
</template> </tbody>
</tbody> </template>
</table> </table>
<div x-show="backendTraces.length === 0" class="text-center py-8 text-[var(--color-text-secondary)] text-sm"> <div x-show="backendTraces.length === 0" class="text-center py-8 text-[var(--color-text-secondary)] text-sm">
No backend traces recorded yet. No backend traces recorded yet.
@@ -212,149 +303,20 @@
</div> </div>
</div> </div>
<!-- API Trace Details Modal -->
<div x-show="selectedTrace !== null" class="fixed inset-0 bg-black/50 flex items-center justify-center z-50" @click="selectedTrace = null">
<div class="bg-[var(--color-bg-secondary)] rounded-lg p-6 max-w-4xl w-full max-h-[90vh] overflow-auto" @click.stop>
<div class="flex justify-between mb-4">
<h2 class="h3">API Trace Details</h2>
<button @click="selectedTrace = null" class="text-[var(--color-text-secondary)] hover:text-[var(--color-text-primary)]">
<i class="fas fa-times"></i>
</button>
</div>
<div class="grid grid-cols-2 gap-4">
<div>
<h3 class="text-lg font-semibold mb-2">Request Body</h3>
<div id="requestEditor" class="h-96 border border-[var(--color-primary-border)]/20"></div>
</div>
<div>
<h3 class="text-lg font-semibold mb-2">Response Body</h3>
<div id="responseEditor" class="h-96 border border-[var(--color-primary-border)]/20"></div>
</div>
</div>
</div>
</div>
<!-- Backend Trace Details Modal -->
<div x-show="selectedBackendTrace !== null" class="fixed inset-0 bg-black/50 flex items-center justify-center z-50" @click="selectedBackendTrace = null; detailKey = null; detailValue = null;">
<div class="bg-[var(--color-bg-secondary)] rounded-lg p-6 max-w-4xl w-full max-h-[90vh] overflow-auto" @click.stop>
<template x-if="selectedBackendTrace !== null">
<div>
<div class="flex justify-between mb-4">
<h2 class="h3">Backend Trace Details</h2>
<button @click="selectedBackendTrace = null; detailKey = null; detailValue = null;" class="text-[var(--color-text-secondary)] hover:text-[var(--color-text-primary)]">
<i class="fas fa-times"></i>
</button>
</div>
<!-- Header info -->
<div class="grid grid-cols-4 gap-4 mb-4">
<div class="bg-[var(--color-bg-primary)] rounded p-3">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Type</div>
<span class="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium"
:class="getTypeClass(backendTraces[selectedBackendTrace].type)"
x-text="backendTraces[selectedBackendTrace].type"></span>
</div>
<div class="bg-[var(--color-bg-primary)] rounded p-3">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Model</div>
<div class="text-sm font-medium" x-text="backendTraces[selectedBackendTrace].model_name || '-'"></div>
</div>
<div class="bg-[var(--color-bg-primary)] rounded p-3">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Backend</div>
<div class="text-sm font-medium" x-text="backendTraces[selectedBackendTrace].backend || '-'"></div>
</div>
<div class="bg-[var(--color-bg-primary)] rounded p-3">
<div class="text-xs text-[var(--color-text-secondary)] mb-1">Duration</div>
<div class="text-sm font-medium" x-text="formatDuration(backendTraces[selectedBackendTrace].duration)"></div>
</div>
</div>
<!-- Error banner -->
<div x-show="backendTraces[selectedBackendTrace].error" class="bg-red-500/10 border border-red-500/30 rounded-lg p-3 mb-4">
<div class="flex items-center gap-2">
<i class="fas fa-exclamation-triangle text-red-500 text-sm"></i>
<span class="text-sm text-red-400" x-text="backendTraces[selectedBackendTrace].error"></span>
</div>
</div>
<!-- Data fields table -->
<div class="overflow-x-auto">
<table class="w-full border-collapse">
<thead>
<tr class="border-b border-[var(--color-bg-primary)]">
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)] w-1/4">Field</th>
<th class="text-left p-2 text-xs font-semibold text-[var(--color-text-secondary)]">Value</th>
</tr>
</thead>
<tbody>
<template x-for="[key, value] in getDataEntries(selectedBackendTrace)" :key="key">
<tr class="border-b border-[var(--color-bg-primary)] hover:bg-[var(--color-bg-primary)]/50 transition-colors">
<td class="p-2 text-sm font-mono text-[var(--color-primary)]" x-text="key"></td>
<td class="p-2 text-sm">
<template x-if="isLargeValue(value)">
<button @click="showValueDetail(key, value)"
class="text-left max-w-full">
<span class="block truncate max-w-lg text-[var(--color-text-secondary)]" x-text="truncateValue(value, 120)"></span>
<span class="text-xs text-[var(--color-primary)] hover:underline mt-0.5 inline-block">View full value</span>
</button>
</template>
<template x-if="!isLargeValue(value)">
<span class="font-mono text-xs" x-text="formatValue(value)"></span>
</template>
</td>
</tr>
</template>
</tbody>
</table>
</div>
</div>
</template>
</div>
</div>
<!-- Value Detail Modal -->
<div x-show="detailValue !== null" class="fixed inset-0 bg-black/50 flex items-center justify-center z-[60]" @click="detailValue = null; detailKey = null;">
<div class="bg-[var(--color-bg-secondary)] rounded-lg p-6 max-w-4xl w-full max-h-[90vh] overflow-auto" @click.stop>
<div class="flex justify-between mb-4">
<h2 class="h3 font-mono" x-text="detailKey"></h2>
<button @click="detailValue = null; detailKey = null;" class="text-[var(--color-text-secondary)] hover:text-[var(--color-text-primary)]">
<i class="fas fa-times"></i>
</button>
</div>
<div id="detailEditor" class="h-[70vh] border border-[var(--color-primary-border)]/20"></div>
</div>
</div>
</div> </div>
</div> </div>
<!-- CodeMirror -->
<link rel="stylesheet" href="static/assets/codemirror.min.css">
<script src="static/assets/codemirror.min.js"></script>
<script src="static/assets/javascript.min.js"></script>
<!-- Styles from model-editor -->
<style>
.CodeMirror {
height: 100% !important;
font-family: monospace;
}
</style>
<script> <script>
function tracesApp() { function tracesApp() {
return { return {
activeTab: 'api', activeTab: 'api',
traces: [], traces: [],
backendTraces: [], backendTraces: [],
selectedTrace: null, expandedTraces: {},
selectedBackendTrace: null, expandedBackendTraces: {},
detailKey: null, expandedBackendFields: {},
detailValue: null,
requestEditor: null,
responseEditor: null,
detailEditor: null,
notifications: [], notifications: [],
settings: { settings: {
enable_tracing: false, enable_tracing: false,
@@ -474,6 +436,7 @@ function tracesApp() {
if (confirm('Clear all API traces?')) { if (confirm('Clear all API traces?')) {
await fetch('/api/traces/clear', { method: 'POST' }); await fetch('/api/traces/clear', { method: 'POST' });
this.traces = []; this.traces = [];
this.expandedTraces = {};
} }
}, },
@@ -481,101 +444,67 @@ function tracesApp() {
if (confirm('Clear all backend traces?')) { if (confirm('Clear all backend traces?')) {
await fetch('/api/backend-traces/clear', { method: 'POST' }); await fetch('/api/backend-traces/clear', { method: 'POST' });
this.backendTraces = []; this.backendTraces = [];
this.expandedBackendTraces = {};
this.expandedBackendFields = {};
} }
}, },
showDetails(index) { toggleTrace(index) {
this.selectedTrace = index; this.expandedTraces = {
this.$nextTick(() => { ...this.expandedTraces,
const trace = this.traces[index]; [index]: !this.expandedTraces[index]
};
const decodeBase64 = (base64) => {
const binaryString = atob(base64);
const bytes = new Uint8Array(binaryString.length);
for (let i = 0; i < binaryString.length; i++) {
bytes[i] = binaryString.charCodeAt(i);
}
return new TextDecoder().decode(bytes);
};
const formatBody = (bodyText) => {
try {
const json = JSON.parse(bodyText);
return JSON.stringify(json, null, 2);
} catch {
return bodyText;
}
};
const reqBody = formatBody(decodeBase64(trace.request.body));
const resBody = formatBody(decodeBase64(trace.response.body));
if (!this.requestEditor) {
this.requestEditor = CodeMirror(document.getElementById('requestEditor'), {
value: reqBody,
mode: 'javascript',
json: true,
theme: 'default',
lineNumbers: true,
readOnly: true,
lineWrapping: true
});
} else {
this.requestEditor.setValue(reqBody);
}
if (!this.responseEditor) {
this.responseEditor = CodeMirror(document.getElementById('responseEditor'), {
value: resBody,
mode: 'javascript',
json: true,
theme: 'default',
lineNumbers: true,
readOnly: true,
lineWrapping: true
});
} else {
this.responseEditor.setValue(resBody);
}
});
}, },
showBackendDetails(index) { toggleBackendTrace(index) {
this.selectedBackendTrace = index; this.expandedBackendTraces = {
...this.expandedBackendTraces,
[index]: !this.expandedBackendTraces[index]
};
}, },
showValueDetail(key, value) { toggleBackendField(index, key) {
this.detailKey = key; const fieldKey = index + '-' + key;
let formatted = ''; this.expandedBackendFields = {
...this.expandedBackendFields,
[fieldKey]: !this.expandedBackendFields[fieldKey]
};
},
isBackendFieldExpanded(index, key) {
return !!this.expandedBackendFields[index + '-' + key];
},
formatTraceBody(body) {
try {
const binaryString = atob(body);
const bytes = new Uint8Array(binaryString.length);
for (let i = 0; i < binaryString.length; i++) {
bytes[i] = binaryString.charCodeAt(i);
}
const text = new TextDecoder().decode(bytes);
try {
return JSON.stringify(JSON.parse(text), null, 2);
} catch {
return text;
}
} catch {
return body || '';
}
},
formatLargeValue(value) {
if (typeof value === 'string') { if (typeof value === 'string') {
try { try {
const parsed = JSON.parse(value); return JSON.stringify(JSON.parse(value), null, 2);
formatted = JSON.stringify(parsed, null, 2);
} catch { } catch {
formatted = value; return value;
} }
} else if (typeof value === 'object') {
formatted = JSON.stringify(value, null, 2);
} else {
formatted = String(value);
} }
this.detailValue = formatted; if (typeof value === 'object') {
return JSON.stringify(value, null, 2);
this.$nextTick(() => { }
const el = document.getElementById('detailEditor'); return String(value);
if (el) {
el.innerHTML = '';
this.detailEditor = CodeMirror(el, {
value: formatted,
mode: 'javascript',
json: true,
theme: 'default',
lineNumbers: true,
readOnly: true,
lineWrapping: true
});
}
});
}, },
formatTimestamp(ts) { formatTimestamp(ts) {
@@ -623,12 +552,6 @@ function tracesApp() {
if (typeof value === 'boolean') return value ? 'true' : 'false'; if (typeof value === 'boolean') return value ? 'true' : 'false';
if (typeof value === 'object') return JSON.stringify(value); if (typeof value === 'object') return JSON.stringify(value);
return String(value); return String(value);
},
getDataEntries(index) {
const trace = this.backendTraces[index];
if (!trace || !trace.data) return [];
return Object.entries(trace.data);
} }
} }
} }

View File

@@ -115,7 +115,7 @@ const (
TranscriptionResponseFormatVtt = TranscriptionResponseFormatType("vtt") TranscriptionResponseFormatVtt = TranscriptionResponseFormatType("vtt")
TranscriptionResponseFormatLrc = TranscriptionResponseFormatType("lrc") TranscriptionResponseFormatLrc = TranscriptionResponseFormatType("lrc")
TranscriptionResponseFormatJson = TranscriptionResponseFormatType("json") TranscriptionResponseFormatJson = TranscriptionResponseFormatType("json")
TranscriptionResponseFormatJsonVerbose = TranscriptionResponseFormatType("json_verbose") TranscriptionResponseFormatJsonVerbose = TranscriptionResponseFormatType("verbose_json")
) )
type ChatCompletionResponseFormat struct { type ChatCompletionResponseFormat struct {

View File

@@ -60,6 +60,22 @@ diffusers:
scheduler_type: "k_dpmpp_sde" scheduler_type: "k_dpmpp_sde"
``` ```
### Multi-GPU Support
For multi-GPU support with diffusers, you need to configure the model with `tensor_parallel_size` set to the number of GPUs you want to use.
```yaml
name: stable-diffusion-multigpu
model: stabilityai/stable-diffusion-xl-base-1.0
backend: diffusers
parameters:
tensor_parallel_size: 2 # Number of GPUs to use
```
The `tensor_parallel_size` parameter is set in the gRPC proto configuration (in `ModelOptions` message, field 55). When this is set to a value greater than 1, the diffusers backend automatically enables `device_map="auto"` to distribute the model across multiple GPUs.
When using diffusers with multiple GPUs, ensure you have sufficient GPU memory across all devices. The model will be automatically distributed across available GPUs. For optimal performance, use GPUs of the same type and memory capacity.
## CUDA(NVIDIA) acceleration ## CUDA(NVIDIA) acceleration
### Requirements ### Requirements

View File

@@ -57,7 +57,7 @@ Result:
--- ---
You can also specify the `response_format` parameter to be one of `lrc`, `srt`, `vtt`, `text`, `json` or `json_verbose` (default): You can also specify the `response_format` parameter to be one of `lrc`, `srt`, `vtt`, `text`, `json` or `verbose_json` (default):
```bash ```bash
## Send the example audio file to the transcriptions endpoint ## Send the example audio file to the transcriptions endpoint
curl http://localhost:8080/v1/audio/transcriptions -H "Content-Type: multipart/form-data" -F file="@$PWD/gb1.ogg" -F model="whisper-1" -F response_format="srt" curl http://localhost:8080/v1/audio/transcriptions -H "Content-Type: multipart/form-data" -F file="@$PWD/gb1.ogg" -F model="whisper-1" -F response_format="srt"

View File

@@ -20,7 +20,7 @@ Choose the installation method that best suits your needs:
1. **[Docker](docker/)** ⭐ **Recommended** - Works on all platforms, easiest setup 1. **[Docker](docker/)** ⭐ **Recommended** - Works on all platforms, easiest setup
2. **[macOS](macos/)** - Download and install the DMG application 2. **[macOS](macos/)** - Download and install the DMG application
3. **[Linux](linux/)** - Install on Linux using binaries (install.sh script currently has issues - see [issue #8032](https://github.com/mudler/LocalAI/issues/8032)) 3. **[Linux](linux/)** - Install on Linux using binaries
4. **[Kubernetes](kubernetes/)** - Deploy LocalAI on Kubernetes clusters 4. **[Kubernetes](kubernetes/)** - Deploy LocalAI on Kubernetes clusters
5. **[Build from Source](build/)** - Build LocalAI from source code 5. **[Build from Source](build/)** - Build LocalAI from source code
@@ -36,6 +36,6 @@ This will start LocalAI. The API will be available at `http://localhost:8080`. F
For other platforms: For other platforms:
- **macOS**: Download the [DMG](macos/) - **macOS**: Download the [DMG](macos/)
- **Linux**: See the [Linux installation guide](linux/) for installation options. **Note:** The `install.sh` script is currently experiencing issues - see [issue #8032](https://github.com/mudler/LocalAI/issues/8032) for details. - **Linux**: See the [Linux installation guide](linux/) for binary installation.
For detailed instructions, see the [Docker installation guide](docker/). For detailed instructions, see the [Docker installation guide](docker/).

View File

@@ -1,69 +1,10 @@
--- ---
title: "Linux Installation" title: "Linux Installation"
description: "Install LocalAI on Linux using the installer script or binaries" description: "Install LocalAI on Linux using binaries"
weight: 3 weight: 3
url: '/installation/linux/' url: '/installation/linux/'
--- ---
## One-Line Installer
{{% notice warning %}}
**The `install.sh` script is currently experiencing issues and may produce broken or misconfigured installations. Please use alternative installation methods (Docker or manual binary installation) until [issue #8032](https://github.com/mudler/LocalAI/issues/8032) is resolved.**
{{% /notice %}}
The fastest way to install LocalAI on Linux is with the installation script:
```bash
curl https://localai.io/install.sh | sh
```
This script will:
- Detect your system architecture
- Download the appropriate LocalAI binary
- Set up the necessary configuration
- Start LocalAI automatically
### Installer Configuration Options
The installer can be configured using environment variables:
```bash
curl https://localai.io/install.sh | VAR=value sh
```
#### Environment Variables
| Environment Variable | Description |
|----------------------|-------------|
| **DOCKER_INSTALL** | Set to `"true"` to enable the installation of Docker images |
| **USE_AIO** | Set to `"true"` to use the all-in-one LocalAI Docker image |
| **USE_VULKAN** | Set to `"true"` to use Vulkan GPU support |
| **API_KEY** | Specify an API key for accessing LocalAI, if required |
| **PORT** | Specifies the port on which LocalAI will run (default is 8080) |
| **THREADS** | Number of processor threads the application should use. Defaults to the number of logical cores minus one |
| **VERSION** | Specifies the version of LocalAI to install. Defaults to the latest available version |
| **MODELS_PATH** | Directory path where LocalAI models are stored (default is `/var/lib/local-ai/models`) |
| **P2P_TOKEN** | Token to use for the federation or for starting workers. See [distributed inferencing documentation]({{%relref "features/distributed_inferencing" %}}) |
| **WORKER** | Set to `"true"` to make the instance a worker (p2p token is required) |
| **FEDERATED** | Set to `"true"` to share the instance with the federation (p2p token is required) |
| **FEDERATED_SERVER** | Set to `"true"` to run the instance as a federation server which forwards requests to the federation (p2p token is required) |
#### Image Selection
The installer will automatically detect your GPU and select the appropriate image. By default, it uses the standard images without extra Python dependencies. You can customize the image selection:
- `USE_AIO=true`: Use all-in-one images that include all dependencies
- `USE_VULKAN=true`: Use Vulkan GPU support instead of vendor-specific GPU support
#### Uninstallation
To uninstall LocalAI installed via the script:
```bash
curl https://localai.io/install.sh | sh -s -- --uninstall
```
## Manual Installation ## Manual Installation
### Download Binary ### Download Binary

View File

@@ -46,6 +46,7 @@ Complete reference for all LocalAI command-line interface (CLI) parameters and e
| `--watchdog-idle-timeout` | `15m` | Threshold beyond which an idle backend should be stopped | `$LOCALAI_WATCHDOG_IDLE_TIMEOUT`, `$WATCHDOG_IDLE_TIMEOUT` | | `--watchdog-idle-timeout` | `15m` | Threshold beyond which an idle backend should be stopped | `$LOCALAI_WATCHDOG_IDLE_TIMEOUT`, `$WATCHDOG_IDLE_TIMEOUT` |
| `--enable-watchdog-busy` | `false` | Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout | `$LOCALAI_WATCHDOG_BUSY`, `$WATCHDOG_BUSY` | | `--enable-watchdog-busy` | `false` | Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout | `$LOCALAI_WATCHDOG_BUSY`, `$WATCHDOG_BUSY` |
| `--watchdog-busy-timeout` | `5m` | Threshold beyond which a busy backend should be stopped | `$LOCALAI_WATCHDOG_BUSY_TIMEOUT`, `$WATCHDOG_BUSY_TIMEOUT` | | `--watchdog-busy-timeout` | `5m` | Threshold beyond which a busy backend should be stopped | `$LOCALAI_WATCHDOG_BUSY_TIMEOUT`, `$WATCHDOG_BUSY_TIMEOUT` |
| `--watchdog-interval` | `500ms` | Interval between watchdog checks (e.g., `500ms`, `5s`, `1m`) | `$LOCALAI_WATCHDOG_INTERVAL`, `$WATCHDOG_INTERVAL` |
| `--force-eviction-when-busy` | `false` | Force eviction even when models have active API calls (default: false for safety). **Warning:** Enabling this can interrupt active requests | `$LOCALAI_FORCE_EVICTION_WHEN_BUSY`, `$FORCE_EVICTION_WHEN_BUSY` | | `--force-eviction-when-busy` | `false` | Force eviction even when models have active API calls (default: false for safety). **Warning:** Enabling this can interrupt active requests | `$LOCALAI_FORCE_EVICTION_WHEN_BUSY`, `$FORCE_EVICTION_WHEN_BUSY` |
| `--lru-eviction-max-retries` | `30` | Maximum number of retries when waiting for busy models to become idle before eviction | `$LOCALAI_LRU_EVICTION_MAX_RETRIES`, `$LRU_EVICTION_MAX_RETRIES` | | `--lru-eviction-max-retries` | `30` | Maximum number of retries when waiting for busy models to become idle before eviction | `$LOCALAI_LRU_EVICTION_MAX_RETRIES`, `$LRU_EVICTION_MAX_RETRIES` |
| `--lru-eviction-retry-interval` | `1s` | Interval between retries when waiting for busy models to become idle (e.g., `1s`, `2s`) | `$LOCALAI_LRU_EVICTION_RETRY_INTERVAL`, `$LRU_EVICTION_RETRY_INTERVAL` | | `--lru-eviction-retry-interval` | `1s` | Interval between retries when waiting for busy models to become idle (e.g., `1s`, `2s`) | `$LOCALAI_LRU_EVICTION_RETRY_INTERVAL`, `$LRU_EVICTION_RETRY_INTERVAL` |

View File

@@ -1,3 +1,3 @@
{ {
"version": "v3.12.0" "version": "v3.12.1"
} }

922
docs/static/install.sh vendored
View File

@@ -1,922 +0,0 @@
#!/bin/sh
# LocalAI Installer Script
# This script installs LocalAI on Linux and macOS systems.
# It automatically detects the system architecture and installs the appropriate version.
# Usage:
# Basic installation:
# curl https://localai.io/install.sh | sh
#
# With environment variables:
# DOCKER_INSTALL=true USE_AIO=true API_KEY=your-key PORT=8080 THREADS=4 curl https://localai.io/install.sh | sh
#
# To uninstall:
# curl https://localai.io/install.sh | sh -s -- --uninstall
#
# Environment Variables:
# DOCKER_INSTALL - Set to "true" to install Docker images (default: auto-detected)
# USE_AIO - Set to "true" to use the all-in-one LocalAI image (default: false)
# USE_VULKAN - Set to "true" to use Vulkan GPU support (default: false)
# API_KEY - API key for securing LocalAI access (default: none)
# PORT - Port to run LocalAI on (default: 8080)
# THREADS - Number of CPU threads to use (default: auto-detected)
# MODELS_PATH - Path to store models (default: /var/lib/local-ai/models)
# CORE_IMAGES - Set to "true" to download core LocalAI images (default: false)
# P2P_TOKEN - Token for P2P federation/worker mode (default: none)
# WORKER - Set to "true" to run as a worker node (default: false)
# FEDERATED - Set to "true" to enable federation mode (default: false)
# FEDERATED_SERVER - Set to "true" to run as a federation server (default: false)
set -e
set -o noglob
#set -x
# --- helper functions for logs ---
# ANSI escape codes
LIGHT_BLUE='\033[38;5;117m'
ORANGE='\033[38;5;214m'
RED='\033[38;5;196m'
BOLD='\033[1m'
RESET='\033[0m'
ECHO=`which echo || true`
if [ -z "$ECHO" ]; then
ECHO=echo
else
ECHO="$ECHO -e"
fi
info()
{
${ECHO} "${BOLD}${LIGHT_BLUE}" '[INFO] ' "$@" "${RESET}"
}
warn()
{
${ECHO} "${BOLD}${ORANGE}" '[WARN] ' "$@" "${RESET}" >&2
}
fatal()
{
${ECHO} "${BOLD}${RED}" '[ERROR] ' "$@" "${RESET}" >&2
exit 1
}
# --- custom choice functions ---
# like the logging functions, but with the -n flag to prevent the new line and keep the cursor in line for choices inputs like y/n
choice_info()
{
${ECHO} -n "${BOLD}${LIGHT_BLUE}" '[INFO] ' "$@" "${RESET}"
}
choice_warn()
{
${ECHO} -n "${BOLD}${ORANGE}" '[WARN] ' "$@" "${RESET}" >&2
}
choice_fatal()
{
${ECHO} -n "${BOLD}${RED}" '[ERROR] ' "$@" "${RESET}" >&2
exit 1
}
# --- fatal if no systemd or openrc ---
verify_system() {
if [ -x /sbin/openrc-run ]; then
HAS_OPENRC=true
return
fi
if [ -x /bin/systemctl ] || type systemctl > /dev/null 2>&1; then
HAS_SYSTEMD=true
return
fi
fatal 'Can not find systemd or openrc to use as a process supervisor for local-ai.'
}
TEMP_DIR=$(mktemp -d)
cleanup() { rm -rf $TEMP_DIR; }
trap cleanup EXIT
available() { command -v $1 >/dev/null; }
require() {
local MISSING=''
for TOOL in $*; do
if ! available $TOOL; then
MISSING="$MISSING $TOOL"
fi
done
echo $MISSING
}
# Function to uninstall LocalAI
uninstall_localai() {
info "Starting LocalAI uninstallation..."
# Stop and remove Docker container if it exists
if available docker && $SUDO docker ps -a --format '{{.Names}}' | grep -q local-ai; then
info "Stopping and removing LocalAI Docker container..."
$SUDO docker stop local-ai || true
$SUDO docker rm local-ai || true
$SUDO docker volume rm local-ai-data || true
fi
# Remove systemd service if it exists
if [ -f "/etc/systemd/system/local-ai.service" ]; then
info "Removing systemd service..."
$SUDO systemctl stop local-ai || true
$SUDO systemctl disable local-ai || true
$SUDO rm -f /etc/systemd/system/local-ai.service
$SUDO systemctl daemon-reload
fi
# Remove environment file
if [ -f "/etc/localai.env" ]; then
info "Removing environment file..."
$SUDO rm -f /etc/localai.env
fi
# Remove binary
for BINDIR in /usr/local/bin /usr/bin /bin; do
if [ -f "$BINDIR/local-ai" ]; then
info "Removing binary from $BINDIR..."
$SUDO rm -f "$BINDIR/local-ai"
fi
done
# Remove local-ai user and all its data if it exists
if id local-ai >/dev/null 2>&1; then
info "Removing local-ai user and all its data..."
$SUDO gpasswd -d $(whoami) local-ai
$SUDO userdel -r local-ai || true
fi
info "LocalAI has been successfully uninstalled."
exit 0
}
## VARIABLES
# DOCKER_INSTALL - set to "true" to install Docker images
# USE_AIO - set to "true" to install the all-in-one LocalAI image
# USE_VULKAN - set to "true" to use Vulkan GPU support
PORT=${PORT:-8080}
docker_found=false
if available docker ; then
info "Docker detected."
docker_found=true
if [ -z $DOCKER_INSTALL ]; then
info "Docker detected and no installation method specified. Using Docker."
fi
fi
DOCKER_INSTALL=${DOCKER_INSTALL:-$docker_found}
USE_AIO=${USE_AIO:-false}
USE_VULKAN=${USE_VULKAN:-false}
API_KEY=${API_KEY:-}
CORE_IMAGES=${CORE_IMAGES:-false}
P2P_TOKEN=${P2P_TOKEN:-}
WORKER=${WORKER:-false}
FEDERATED=${FEDERATED:-false}
FEDERATED_SERVER=${FEDERATED_SERVER:-false}
# nprocs -1
if available nproc; then
procs=$(nproc)
else
procs=1
fi
THREADS=${THREADS:-$procs}
LATEST_VERSION=$(curl -s "https://api.github.com/repos/mudler/LocalAI/releases/latest" | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
LOCALAI_VERSION="${LOCALAI_VERSION:-$LATEST_VERSION}" #changed due to VERSION beign already defined in Fedora 42 Cloud Edition
MODELS_PATH=${MODELS_PATH:-/var/lib/local-ai/models}
check_gpu() {
# Look for devices based on vendor ID for NVIDIA and AMD
case $1 in
lspci)
case $2 in
nvidia) available lspci && lspci -d '10de:' | grep -q 'NVIDIA' || return 1 ;;
amdgpu) available lspci && lspci -d '1002:' | grep -q 'AMD' || return 1 ;;
intel) available lspci && lspci | grep -E 'VGA|3D' | grep -iq intel | return 1 ;;
esac ;;
lshw)
case $2 in
nvidia) available lshw && $SUDO lshw -c display -numeric | grep -q 'vendor: .* \[10DE\]' || return 1 ;;
amdgpu) available lshw && $SUDO lshw -c display -numeric | grep -q 'vendor: .* \[1002\]' || return 1 ;;
intel) available lshw && $SUDO lshw -c display -numeric | grep -q 'vendor: .* \[8086\]' || return 1 ;;
esac ;;
nvidia-smi) available nvidia-smi || return 1 ;;
esac
}
install_success() {
info "The LocalAI API is now available at 127.0.0.1:$PORT."
if [ "$DOCKER_INSTALL" = "true" ]; then
info "The LocalAI Docker container is now running."
else
info 'Install complete. Run "local-ai" from the command line.'
fi
}
aborted() {
warn 'Installation aborted.'
exit 1
}
trap aborted INT
configure_systemd() {
if ! id local-ai >/dev/null 2>&1; then
info "Creating local-ai user..."
$SUDO useradd -r -s /bin/false -U -M -d /var/lib/local-ai local-ai
$SUDO mkdir -p /var/lib/local-ai
$SUDO chmod 0755 /var/lib/local-ai
$SUDO chown local-ai:local-ai /var/lib/local-ai
fi
info "Adding current user to local-ai group..."
$SUDO usermod -a -G local-ai $(whoami)
info "Creating local-ai systemd service..."
cat <<EOF | $SUDO tee /etc/systemd/system/local-ai.service >/dev/null
[Unit]
Description=LocalAI Service
After=network-online.target
[Service]
ExecStart=$BINDIR/local-ai $STARTCOMMAND
User=local-ai
Group=local-ai
Restart=always
EnvironmentFile=/etc/localai.env
RestartSec=3
Environment="PATH=$PATH"
WorkingDirectory=/var/lib/local-ai
[Install]
WantedBy=default.target
EOF
$SUDO touch /etc/localai.env
$SUDO echo "ADDRESS=0.0.0.0:$PORT" | $SUDO tee /etc/localai.env >/dev/null
$SUDO echo "API_KEY=$API_KEY" | $SUDO tee -a /etc/localai.env >/dev/null
$SUDO echo "THREADS=$THREADS" | $SUDO tee -a /etc/localai.env >/dev/null
$SUDO echo "MODELS_PATH=$MODELS_PATH" | $SUDO tee -a /etc/localai.env >/dev/null
if [ -n "$P2P_TOKEN" ]; then
$SUDO echo "LOCALAI_P2P_TOKEN=$P2P_TOKEN" | $SUDO tee -a /etc/localai.env >/dev/null
$SUDO echo "LOCALAI_P2P=true" | $SUDO tee -a /etc/localai.env >/dev/null
fi
if [ "$LOCALAI_P2P_DISABLE_DHT" = true ]; then
$SUDO echo "LOCALAI_P2P_DISABLE_DHT=true" | $SUDO tee -a /etc/localai.env >/dev/null
fi
SYSTEMCTL_RUNNING="$(systemctl is-system-running || true)"
case $SYSTEMCTL_RUNNING in
running|degraded)
info "Enabling and starting local-ai service..."
$SUDO systemctl daemon-reload
$SUDO systemctl enable local-ai
start_service() { $SUDO systemctl restart local-ai; }
trap start_service EXIT
;;
esac
}
# ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-yum-or-dnf
install_container_toolkit_yum() {
info 'Installing NVIDIA container toolkit repository...'
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
$SUDO tee /etc/yum.repos.d/nvidia-container-toolkit.repo
if [ "$PACKAGE_MANAGER" = "dnf" ]; then
DNF_VERSION=$($PACKAGE_MANAGER --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1 | cut -d. -f1)
if [ "$DNF_VERSION" -ge 5 ]; then
# DNF5: Use 'setopt' to enable the repository
$SUDO $PACKAGE_MANAGER config-manager setopt nvidia-container-toolkit-experimental.enabled=1
else
# DNF4: Use '--set-enabled' to enable the repository
$SUDO $PACKAGE_MANAGER config-manager --enable nvidia-container-toolkit-experimental
fi
else
$SUDO $PACKAGE_MANAGER -y install yum-utils
$SUDO $PACKAGE_MANAGER-config-manager --enable nvidia-container-toolkit-experimental
fi
$SUDO $PACKAGE_MANAGER install -y nvidia-container-toolkit
}
# Fedora, Rhel and other distro ships tunable SELinux booleans in the container-selinux policy to control device access.
# In particular, enabling container_use_devices allows containers to use arbitrary host device labels (including GPU devices)
# ref: https://github.com/containers/ramalama/blob/main/docs/ramalama-cuda.7.md#expected-output
enable_selinux_container_booleans() {
# Check SELinux mode
SELINUX_MODE=$(getenforce)
if [ "$SELINUX_MODE" == "Enforcing" ]; then
# Check the status of container_use_devices
CONTAINER_USE_DEVICES=$(getsebool container_use_devices | awk '{print $3}')
if [ "$CONTAINER_USE_DEVICES" == "off" ]; then
#We want to give the user the choice to enable the SE booleans since it is a security config
warn "+-----------------------------------------------------------------------------------------------------------+"
warn "| WARNING: |"
warn "| Your distribution ships tunable SELinux booleans in the container-selinux policy to control device access.|"
warn "| In particular, enabling \"container_use_devices\" allows containers to use arbitrary host device labels |"
warn "| (including GPU devices). |"
warn "| This script can try to enable them enabling the \"container_use_devices\" flag. |"
warn "| |"
warn "| Otherwise you can exit the install script and enable them yourself. |"
warn "+-----------------------------------------------------------------------------------------------------------+"
while true; do
choice_warn "I understand that this script is going to change my SELinux configs, which is a security risk: (yes/exit) ";
read Answer
if [ "$Answer" = "yes" ]; then
warn "Enabling \"container_use_devices\" persistently..."
$SUDO setsebool -P container_use_devices 1
break
elif [ "$Answer" = "exit" ]; then
aborted
else
warn "Invalid choice. Please enter 'yes' or 'exit'."
fi
done
fi
fi
}
# ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-apt
install_container_toolkit_apt() {
info 'Installing NVIDIA container toolkit repository...'
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | $SUDO gpg --dearmor -o /etc/apt/trusted.gpg.d/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
$SUDO tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$SUDO apt-get update && $SUDO apt-get install -y nvidia-container-toolkit
}
# ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-zypper
install_container_toolkit_zypper() {
info 'Installing NVIDIA zypper repository...'
$SUDO zypper ar https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
$SUDO zypper modifyrepo --enable nvidia-container-toolkit-experimental
$SUDO zypper --gpg-auto-import-keys install -y nvidia-container-toolkit
}
install_container_toolkit() {
if [ ! -f "/etc/os-release" ]; then
fatal "Unknown distribution. Skipping CUDA installation."
fi
## Check if it's already installed
if check_gpu nvidia-smi && available nvidia-container-runtime; then
info "NVIDIA Container Toolkit already installed."
return
fi
. /etc/os-release
OS_NAME=$ID
OS_VERSION=$VERSION_ID
info "Installing NVIDIA Container Toolkit..."
case $OS_NAME in
amzn|fedora|rocky|centos|rhel) install_container_toolkit_yum ;;
debian|ubuntu) install_container_toolkit_apt ;;
opensuse*|suse*) install_container_toolkit_zypper ;;
*) echo "Could not install nvidia container toolkit - unknown OS" ;;
esac
# after installing the toolkit we need to add it to the docker runtimes, otherwise even with --gpu all
# the container would still run with runc and would not have access to nvidia-smi
# ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-docker
info "Adding NVIDIA Container Runtime to Docker runtimes..."
$SUDO nvidia-ctk runtime configure --runtime=docker
info "Restarting Docker Daemon"
$SUDO systemctl restart docker
# The NVML error arises because SELinux blocked the container's attempts to open the GPU devices or related libraries.
# Without relaxing SELinux for the container, GPU commands like nvidia-smi report "Insufficient Permissions"
# This has been noted in NVIDIA's documentation:
# ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/1.13.5/install-guide.html#id2
# ref: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/troubleshooting.html#nvml-insufficient-permissions-and-selinux
case $OS_NAME in
fedora|rhel|centos|rocky)
enable_selinux_container_booleans
;;
opensuse-tumbleweed)
enable_selinux_container_booleans
;;
esac
}
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-7-centos-7
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-8-rocky-8
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-9-rocky-9
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#fedora
install_cuda_driver_yum() {
info 'Installing NVIDIA CUDA repository...'
case $PACKAGE_MANAGER in
yum)
$SUDO $PACKAGE_MANAGER -y install yum-utils
$SUDO $PACKAGE_MANAGER-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m)/cuda-$1$2.repo
;;
dnf)
DNF_VERSION=$($PACKAGE_MANAGER --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1 | cut -d. -f1)
if [ "$DNF_VERSION" -ge 5 ]; then
# DNF5: Use 'addrepo' to add the repository
$SUDO $PACKAGE_MANAGER config-manager addrepo --id=nvidia-cuda --set=name="nvidia-cuda" --set=baseurl="https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m)/cuda-$1$2.repo"
else
# DNF4: Use '--add-repo' to add the repository
$SUDO $PACKAGE_MANAGER config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m)/cuda-$1$2.repo
fi
;;
esac
case $1 in
rhel)
info 'Installing EPEL repository...'
# EPEL is required for third-party dependencies such as dkms and libvdpau
$SUDO $PACKAGE_MANAGER -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-$2.noarch.rpm || true
;;
esac
info 'Installing CUDA driver...'
if [ "$1" = 'centos' ] || [ "$1$2" = 'rhel7' ]; then
$SUDO $PACKAGE_MANAGER -y install nvidia-driver-latest-dkms
fi
$SUDO $PACKAGE_MANAGER -y install cuda-drivers
}
install_fedora_nvidia_kernel_drivers(){
#We want to give the user the choice to install the akmod kernel drivers or not, since it could break some setups
warn "+------------------------------------------------------------------------------------------------+"
warn "| WARNING: |"
warn "| Looks like the NVIDIA Kernel modules are not installed. |"
warn "| |"
warn "| This script can try to install them using akmod-nvidia. |"
warn "| - The script need the rpmfusion free and nonfree repos and will install them if not available. |"
warn "| - The akmod installation can sometimes inhibit the reboot command. |"
warn "| |"
warn "| Otherwise you can exit the install script and install them yourself. |"
warn "| NOTE: you will need to reboot after the installation. |"
warn "+------------------------------------------------------------------------------------------------+"
while true; do
choice_warn "Do you wish for the script to try and install them? (akmod/exit) ";
read Answer
if [ "$Answer" = "akmod" ]; then
DNF_VERSION=$($PACKAGE_MANAGER --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1 | cut -d. -f1)
OS_NAME=$ID
OS_VERSION=$VERSION_ID
FREE_URL="https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-${OS_VERSION}.noarch.rpm"
NONFREE_URL="https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-${OS_VERSION}.noarch.rpm"
curl -LO "$FREE_URL"
curl -LO "$NONFREE_URL"
if [ "$DNF_VERSION" -ge 5 ]; then
# DNF5:
$SUDO $PACKAGE_MANAGER install -y "rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm" "rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm"
$SUDO $PACKAGE_MANAGER install -y akmod-nvidia
else
# DNF4:
$SUDO $PACKAGE_MANAGER install -y "rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm" "rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm"
$SUDO $PACKAGE_MANAGER install -y akmod-nvidia
fi
$SUDO rm "rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm"
$SUDO rm "rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm"
install_cuda_driver_yum $OS_NAME '41'
info "Nvidia driver installation complete, please reboot now and run the Install script again to complete the setup."
exit
elif [ "$Answer" = "exit" ]; then
aborted
else
warn "Invalid choice. Please enter 'akmod' or 'exit'."
fi
done
}
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu
# ref: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#debian
install_cuda_driver_apt() {
info 'Installing NVIDIA CUDA repository...'
curl -fsSL -o $TEMP_DIR/cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/$1$2/$(uname -m)/cuda-keyring_1.1-1_all.deb
case $1 in
debian)
info 'Enabling contrib sources...'
$SUDO sed 's/main/contrib/' < /etc/apt/sources.list | $SUDO tee /etc/apt/sources.list.d/contrib.list > /dev/null
if [ -f "/etc/apt/sources.list.d/debian.sources" ]; then
$SUDO sed 's/main/contrib/' < /etc/apt/sources.list.d/debian.sources | $SUDO tee /etc/apt/sources.list.d/contrib.sources > /dev/null
fi
;;
esac
info 'Installing CUDA driver...'
$SUDO dpkg -i $TEMP_DIR/cuda-keyring.deb
$SUDO apt-get update
[ -n "$SUDO" ] && SUDO_E="$SUDO -E" || SUDO_E=
DEBIAN_FRONTEND=noninteractive $SUDO_E apt-get -y install cuda-drivers -q
}
install_cuda() {
if [ ! -f "/etc/os-release" ]; then
fatal "Unknown distribution. Skipping CUDA installation."
fi
. /etc/os-release
OS_NAME=$ID
OS_VERSION=$VERSION_ID
if [ -z "$PACKAGE_MANAGER" ]; then
fatal "Unknown package manager. Skipping CUDA installation."
fi
if ! check_gpu nvidia-smi || [ -z "$(nvidia-smi | grep -o "CUDA Version: [0-9]*\.[0-9]*")" ]; then
case $OS_NAME in
centos|rhel) install_cuda_driver_yum 'rhel' $(echo $OS_VERSION | cut -d '.' -f 1) ;;
rocky) install_cuda_driver_yum 'rhel' $(echo $OS_VERSION | cut -c1) ;;
fedora) [ $OS_VERSION -lt '41' ] && install_cuda_driver_yum $OS_NAME $OS_VERSION || install_cuda_driver_yum $OS_NAME '41';;
amzn) install_cuda_driver_yum 'fedora' '37' ;;
debian) install_cuda_driver_apt $OS_NAME $OS_VERSION ;;
ubuntu) install_cuda_driver_apt $OS_NAME $(echo $OS_VERSION | sed 's/\.//') ;;
*) exit ;;
esac
fi
if ! lsmod | grep -q nvidia || ! lsmod | grep -q nvidia_uvm; then
KERNEL_RELEASE="$(uname -r)"
case $OS_NAME in
rocky) $SUDO $PACKAGE_MANAGER -y install kernel-devel kernel-headers ;;
centos|rhel|amzn) $SUDO $PACKAGE_MANAGER -y install kernel-devel-$KERNEL_RELEASE kernel-headers-$KERNEL_RELEASE ;;
fedora) $SUDO $PACKAGE_MANAGER -y install kernel-devel-$KERNEL_RELEASE ;;
debian|ubuntu) $SUDO apt-get -y install linux-headers-$KERNEL_RELEASE ;;
*) exit ;;
esac
NVIDIA_CUDA_VERSION=$($SUDO dkms info | awk -F: '/added/ { print $1 }')
if [ -n "$NVIDIA_CUDA_VERSION" ]; then
$SUDO dkms install $NVIDIA_CUDA_VERSION
fi
if lsmod | grep -q nouveau; then
info 'Reboot to complete NVIDIA CUDA driver install.'
exit 0
fi
$SUDO modprobe nvidia
$SUDO modprobe nvidia_uvm
fi
# make sure the NVIDIA modules are loaded on boot with nvidia-persistenced
if command -v nvidia-persistenced > /dev/null 2>&1; then
$SUDO touch /etc/modules-load.d/nvidia.conf
MODULES="nvidia nvidia-uvm"
for MODULE in $MODULES; do
if ! grep -qxF "$MODULE" /etc/modules-load.d/nvidia.conf; then
echo "$MODULE" | sudo tee -a /etc/modules-load.d/nvidia.conf > /dev/null
fi
done
fi
info "NVIDIA GPU ready."
install_success
}
install_amd() {
# Look for pre-existing ROCm v6 before downloading the dependencies
for search in "${HIP_PATH:-''}" "${ROCM_PATH:-''}" "/opt/rocm" "/usr/lib64"; do
if [ -n "${search}" ] && [ -e "${search}/libhipblas.so.2" -o -e "${search}/lib/libhipblas.so.2" ]; then
info "Compatible AMD GPU ROCm library detected at ${search}"
install_success
exit 0
fi
done
info "AMD GPU ready."
exit 0
}
install_docker() {
[ "$(uname -s)" = "Linux" ] || fatal 'This script is intended to run on Linux only.'
if ! available docker; then
info "Installing Docker..."
curl -fsSL https://get.docker.com | sh
fi
# Check docker is running
if ! $SUDO systemctl is-active --quiet docker; then
info "Starting Docker..."
$SUDO systemctl start docker
fi
info "Creating LocalAI Docker volume..."
# Create volume if doesn't exist already
if ! $SUDO docker volume inspect local-ai-data > /dev/null 2>&1; then
$SUDO docker volume create local-ai-data
fi
# Check if container is already running
if $SUDO docker ps -a --format '{{.Names}}' | grep -q local-ai; then
info "LocalAI Docker container already exists, replacing it..."
$SUDO docker rm -f local-ai
fi
envs=""
if [ -n "$P2P_TOKEN" ]; then
envs="-e LOCALAI_P2P_TOKEN=$P2P_TOKEN -e LOCALAI_P2P=true"
fi
if [ "$LOCALAI_P2P_DISABLE_DHT" = true ]; then
envs="$envs -e LOCALAI_P2P_DISABLE_DHT=true"
fi
IMAGE_TAG=
if [ "$USE_VULKAN" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-gpu-vulkan
info "Starting LocalAI Docker container..."
$SUDO docker run -v local-ai-data:/models \
--device /dev/dri \
--restart=always \
-e API_KEY=$API_KEY \
-e THREADS=$THREADS \
$envs \
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
elif [ "$HAS_CUDA" ]; then
# Default to CUDA 12
IMAGE_TAG=${LOCALAI_VERSION}-gpu-nvidia-cuda-12
# AIO
if [ "$USE_AIO" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-nvidia-cuda-12
fi
info "Checking Nvidia Kernel Drivers presence..."
if ! available nvidia-smi; then
OS_NAME=$ID
OS_VERSION=$VERSION_ID
case $OS_NAME in
debian|ubuntu) $SUDO apt-get -y install nvidia-cuda-toolkit;;
fedora) install_fedora_nvidia_kernel_drivers;;
esac
fi
info "Starting LocalAI Docker container..."
$SUDO docker run -v local-ai-data:/models \
--gpus all \
--restart=always \
-e API_KEY=$API_KEY \
-e THREADS=$THREADS \
$envs \
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
elif [ "$HAS_AMD" ]; then
IMAGE_TAG=${LOCALAI_VERSION}-gpu-hipblas
# AIO
if [ "$USE_AIO" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-hipblas
fi
info "Starting LocalAI Docker container..."
$SUDO docker run -v local-ai-data:/models \
--device /dev/dri \
--device /dev/kfd \
--group-add=video \
--restart=always \
-e API_KEY=$API_KEY \
-e THREADS=$THREADS \
$envs \
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
elif [ "$HAS_INTEL" ]; then
IMAGE_TAG=${LOCALAI_VERSION}-gpu-intel
# AIO
if [ "$USE_AIO" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-intel
fi
info "Starting LocalAI Docker container..."
$SUDO docker run -v local-ai-data:/models \
--device /dev/dri \
--restart=always \
-e API_KEY=$API_KEY \
-e THREADS=$THREADS \
$envs \
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
else
IMAGE_TAG=${LOCALAI_VERSION}
# AIO
if [ "$USE_AIO" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-aio-cpu
fi
info "Starting LocalAI Docker container..."
$SUDO docker run -v local-ai-data:/models \
--restart=always \
-e MODELS_PATH=/models \
-e API_KEY=$API_KEY \
-e THREADS=$THREADS \
$envs \
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
fi
install_success
exit 0
}
install_binary_darwin() {
[ "$(uname -s)" = "Darwin" ] || fatal 'This script is intended to run on macOS only.'
info "Downloading LocalAI ${LOCALAI_VERSION}..."
curl --fail --show-error --location --progress-bar -o $TEMP_DIR/local-ai "https://github.com/mudler/LocalAI/releases/download/${LOCALAI_VERSION}/local-ai-${LOCALAI_VERSION}-darwin-${ARCH}"
info "Installing to /usr/local/bin/local-ai"
install -o0 -g0 -m755 $TEMP_DIR/local-ai /usr/local/bin/local-ai
install_success
}
install_binary() {
[ "$(uname -s)" = "Linux" ] || fatal 'This script is intended to run on Linux only.'
IS_WSL2=false
KERN=$(uname -r)
case "$KERN" in
*icrosoft*WSL2 | *icrosoft*wsl2) IS_WSL2=true;;
*icrosoft) fatal "Microsoft WSL1 is not currently supported. Please upgrade to WSL2 with 'wsl --set-version <distro> 2'" ;;
*) ;;
esac
NEEDS=$(require curl awk grep sed tee xargs)
if [ -n "$NEEDS" ]; then
info "ERROR: The following tools are required but missing:"
for NEED in $NEEDS; do
echo " - $NEED"
done
exit 1
fi
info "Downloading LocalAI ${LOCALAI_VERSION}..."
curl --fail --location --progress-bar -o $TEMP_DIR/local-ai "https://github.com/mudler/LocalAI/releases/download/${LOCALAI_VERSION}/local-ai-${LOCALAI_VERSION}-linux-${ARCH}"
for BINDIR in /usr/local/bin /usr/bin /bin; do
echo $PATH | grep -q $BINDIR && break || continue
done
info "Installing LocalAI as local-ai to $BINDIR..."
$SUDO install -o0 -g0 -m755 -d $BINDIR
$SUDO install -o0 -g0 -m755 $TEMP_DIR/local-ai $BINDIR/local-ai
verify_system
if [ "$HAS_SYSTEMD" = true ]; then
configure_systemd
fi
# WSL2 only supports GPUs via nvidia passthrough
# so check for nvidia-smi to determine if GPU is available
if [ "$IS_WSL2" = true ]; then
if available nvidia-smi && [ -n "$(nvidia-smi | grep -o "CUDA Version: [0-9]*\.[0-9]*")" ]; then
info "Nvidia GPU detected."
fi
install_success
exit 0
fi
# Install GPU dependencies on Linux
if ! available lspci && ! available lshw; then
warn "Unable to detect NVIDIA/AMD GPU. Install lspci or lshw to automatically detect and install GPU dependencies."
exit 0
fi
if [ "$HAS_AMD" = true ]; then
install_amd
fi
if [ "$HAS_CUDA" = true ]; then
if check_gpu nvidia-smi; then
info "NVIDIA GPU installed."
exit 0
fi
install_cuda
fi
install_success
warn "No NVIDIA/AMD GPU detected. LocalAI will run in CPU-only mode."
exit 0
}
detect_start_command() {
STARTCOMMAND="run"
if [ "$WORKER" = true ]; then
if [ -n "$P2P_TOKEN" ]; then
STARTCOMMAND="worker p2p-llama-cpp-rpc"
else
STARTCOMMAND="worker llama-cpp-rpc"
fi
elif [ "$FEDERATED" = true ]; then
if [ "$FEDERATED_SERVER" = true ]; then
STARTCOMMAND="federated"
else
STARTCOMMAND="$STARTCOMMAND --p2p --federated"
fi
elif [ -n "$P2P_TOKEN" ]; then
STARTCOMMAND="$STARTCOMMAND --p2p"
fi
}
SUDO=
if [ "$(id -u)" -ne 0 ]; then
# Running as root, no need for sudo
if ! available sudo; then
fatal "This script requires superuser permissions. Please re-run as root."
fi
SUDO="sudo"
fi
# Check if uninstall flag is provided
if [ "$1" = "--uninstall" ]; then
uninstall_localai
fi
detect_start_command
OS="$(uname -s)"
ARCH=$(uname -m)
case "$ARCH" in
x86_64) ARCH="amd64" ;;
aarch64|arm64) ARCH="arm64" ;;
*) fatal "Unsupported architecture: $ARCH" ;;
esac
if [ "$OS" = "Darwin" ]; then
install_binary_darwin
exit 0
fi
if check_gpu lspci amdgpu || check_gpu lshw amdgpu; then
HAS_AMD=true
fi
if check_gpu lspci nvidia || check_gpu lshw nvidia; then
HAS_CUDA=true
fi
if check_gpu lspci intel || check_gpu lshw intel; then
HAS_INTEL=true
fi
PACKAGE_MANAGER=
for PACKAGE_MANAGER in dnf yum apt-get; do
if available $PACKAGE_MANAGER; then
break
fi
done
if [ "$DOCKER_INSTALL" = "true" ]; then
info "Installing LocalAI from container images"
if [ "$HAS_CUDA" = true ]; then
install_container_toolkit
fi
install_docker
else
info "Installing LocalAI from binaries"
install_binary
fi

10
go.mod
View File

@@ -6,10 +6,10 @@ toolchain go1.24.5
require ( require (
dario.cat/mergo v1.0.2 dario.cat/mergo v1.0.2
fyne.io/fyne/v2 v2.7.2 fyne.io/fyne/v2 v2.7.3
github.com/Masterminds/sprig/v3 v3.3.0 github.com/Masterminds/sprig/v3 v3.3.0
github.com/alecthomas/kong v1.14.0 github.com/alecthomas/kong v1.14.0
github.com/anthropics/anthropic-sdk-go v1.22.0 github.com/anthropics/anthropic-sdk-go v1.26.0
github.com/charmbracelet/glamour v0.10.0 github.com/charmbracelet/glamour v0.10.0
github.com/containerd/containerd v1.7.30 github.com/containerd/containerd v1.7.30
github.com/dhowden/tag v0.0.0-20240417053706-3d75831295e8 github.com/dhowden/tag v0.0.0-20240417053706-3d75831295e8
@@ -21,7 +21,7 @@ require (
github.com/gofrs/flock v0.13.0 github.com/gofrs/flock v0.13.0
github.com/google/go-containerregistry v0.20.7 github.com/google/go-containerregistry v0.20.7
github.com/google/uuid v1.6.0 github.com/google/uuid v1.6.0
github.com/gpustack/gguf-parser-go v0.23.1 github.com/gpustack/gguf-parser-go v0.24.0
github.com/hpcloud/tail v1.0.0 github.com/hpcloud/tail v1.0.0
github.com/ipfs/go-log v1.0.5 github.com/ipfs/go-log v1.0.5
github.com/jaypipes/ghw v0.23.0 github.com/jaypipes/ghw v0.23.0
@@ -33,7 +33,7 @@ require (
github.com/mholt/archiver/v3 v3.5.1 github.com/mholt/archiver/v3 v3.5.1
github.com/microcosm-cc/bluemonday v1.0.27 github.com/microcosm-cc/bluemonday v1.0.27
github.com/modelcontextprotocol/go-sdk v1.3.0 github.com/modelcontextprotocol/go-sdk v1.3.0
github.com/mudler/cogito v0.9.1-0.20260217143801-bb7f986ed2c7 github.com/mudler/cogito v0.9.1
github.com/mudler/edgevpn v0.31.1 github.com/mudler/edgevpn v0.31.1
github.com/mudler/go-processmanager v0.1.0 github.com/mudler/go-processmanager v0.1.0
github.com/mudler/memory v0.0.0-20251216220809-d1256471a6c2 github.com/mudler/memory v0.0.0-20251216220809-d1256471a6c2
@@ -101,7 +101,7 @@ require (
github.com/go-gl/glfw/v3.3/glfw v0.0.0-20240506104042-037f3cc74f2a // indirect github.com/go-gl/glfw/v3.3/glfw v0.0.0-20240506104042-037f3cc74f2a // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/go-text/render v0.2.0 // indirect github.com/go-text/render v0.2.0 // indirect
github.com/go-text/typesetting v0.2.1 // indirect github.com/go-text/typesetting v0.3.3 // indirect
github.com/godbus/dbus/v5 v5.1.0 // indirect github.com/godbus/dbus/v5 v5.1.0 // indirect
github.com/google/jsonschema-go v0.4.2 // indirect github.com/google/jsonschema-go v0.4.2 // indirect
github.com/hack-pad/go-indexeddb v0.3.2 // indirect github.com/hack-pad/go-indexeddb v0.3.2 // indirect

28
go.sum
View File

@@ -8,8 +8,8 @@ dmitri.shuralyov.com/app/changes v0.0.0-20180602232624-0a106ad413e3/go.mod h1:Yl
dmitri.shuralyov.com/html/belt v0.0.0-20180602232347-f7d459c86be0/go.mod h1:JLBrvjyP0v+ecvNYvCpyZgu5/xkfAUhi6wJj28eUfSU= dmitri.shuralyov.com/html/belt v0.0.0-20180602232347-f7d459c86be0/go.mod h1:JLBrvjyP0v+ecvNYvCpyZgu5/xkfAUhi6wJj28eUfSU=
dmitri.shuralyov.com/service/change v0.0.0-20181023043359-a85b471d5412/go.mod h1:a1inKt/atXimZ4Mv927x+r7UpyzRUf4emIoiiSC2TN4= dmitri.shuralyov.com/service/change v0.0.0-20181023043359-a85b471d5412/go.mod h1:a1inKt/atXimZ4Mv927x+r7UpyzRUf4emIoiiSC2TN4=
dmitri.shuralyov.com/state v0.0.0-20180228185332-28bcc343414c/go.mod h1:0PRwlb0D6DFvNNtx+9ybjezNCa8XF0xaYcETyp6rHWU= dmitri.shuralyov.com/state v0.0.0-20180228185332-28bcc343414c/go.mod h1:0PRwlb0D6DFvNNtx+9ybjezNCa8XF0xaYcETyp6rHWU=
fyne.io/fyne/v2 v2.7.2 h1:XiNpWkn0PzX43ZCjbb0QYGg1RCxVbugwfVgikWZBCMw= fyne.io/fyne/v2 v2.7.3 h1:xBT/iYbdnNHONWO38fZMBrVBiJG8rV/Jypmy4tVfRWE=
fyne.io/fyne/v2 v2.7.2/go.mod h1:PXbqY3mQmJV3J1NRUR2VbVgUUx3vgvhuFJxyjRK/4Ug= fyne.io/fyne/v2 v2.7.3/go.mod h1:gu+dlIcZWSzKZmnrY8Fbnj2Hirabv2ek+AKsfQ2bBlw=
fyne.io/systray v1.12.0 h1:CA1Kk0e2zwFlxtc02L3QFSiIbxJ/P0n582YrZHT7aTM= fyne.io/systray v1.12.0 h1:CA1Kk0e2zwFlxtc02L3QFSiIbxJ/P0n582YrZHT7aTM=
fyne.io/systray v1.12.0/go.mod h1:RVwqP9nYMo7h5zViCBHri2FgjXF7H2cub7MAq4NSoLs= fyne.io/systray v1.12.0/go.mod h1:RVwqP9nYMo7h5zViCBHri2FgjXF7H2cub7MAq4NSoLs=
git.apache.org/thrift.git v0.0.0-20180902110319-2566ecd5d999/go.mod h1:fPE2ZNJGynbRyZ4dJvy6G277gSllfV2HJqblrnkyeyg= git.apache.org/thrift.git v0.0.0-20180902110319-2566ecd5d999/go.mod h1:fPE2ZNJGynbRyZ4dJvy6G277gSllfV2HJqblrnkyeyg=
@@ -44,8 +44,8 @@ github.com/andybalholm/brotli v1.0.1/go.mod h1:loMXtMfwqflxFJPmdbJO0a3KNoPuLBgiu
github.com/andybalholm/brotli v1.2.0 h1:ukwgCxwYrmACq68yiUqwIWnGY0cTPox/M94sVwToPjQ= github.com/andybalholm/brotli v1.2.0 h1:ukwgCxwYrmACq68yiUqwIWnGY0cTPox/M94sVwToPjQ=
github.com/andybalholm/brotli v1.2.0/go.mod h1:rzTDkvFWvIrjDXZHkuS16NPggd91W3kUSvPlQ1pLaKY= github.com/andybalholm/brotli v1.2.0/go.mod h1:rzTDkvFWvIrjDXZHkuS16NPggd91W3kUSvPlQ1pLaKY=
github.com/anmitsu/go-shlex v0.0.0-20161002113705-648efa622239/go.mod h1:2FmKhYUyUczH0OGQWaF5ceTx0UBShxjsH6f8oGKYe2c= github.com/anmitsu/go-shlex v0.0.0-20161002113705-648efa622239/go.mod h1:2FmKhYUyUczH0OGQWaF5ceTx0UBShxjsH6f8oGKYe2c=
github.com/anthropics/anthropic-sdk-go v1.22.0 h1:sgo4Ob5pC5InKCi/5Ukn5t9EjPJ7KTMaKm5beOYt6rM= github.com/anthropics/anthropic-sdk-go v1.26.0 h1:oUTzFaUpAevfuELAP1sjL6CQJ9HHAfT7CoSYSac11PY=
github.com/anthropics/anthropic-sdk-go v1.22.0/go.mod h1:WTz31rIUHUHqai2UslPpw5CwXrQP3geYBioRV4WOLvE= github.com/anthropics/anthropic-sdk-go v1.26.0/go.mod h1:qUKmaW+uuPB64iy1l+4kOSvaLqPXnHTTBKH6RVZ7q5Q=
github.com/aymanbagabas/go-osc52/v2 v2.0.1 h1:HwpRHbFMcZLEVr42D4p7XBqjyuxQH5SMiErDT4WkJ2k= github.com/aymanbagabas/go-osc52/v2 v2.0.1 h1:HwpRHbFMcZLEVr42D4p7XBqjyuxQH5SMiErDT4WkJ2k=
github.com/aymanbagabas/go-osc52/v2 v2.0.1/go.mod h1:uYgXzlJ7ZpABp8OJ+exZzJJhRNQ2ASbcXHWsFqH8hp8= github.com/aymanbagabas/go-osc52/v2 v2.0.1/go.mod h1:uYgXzlJ7ZpABp8OJ+exZzJJhRNQ2ASbcXHWsFqH8hp8=
github.com/aymanbagabas/go-udiff v0.2.0 h1:TK0fH4MteXUDspT88n8CKzvK0X9O2xu9yQjWpi6yML8= github.com/aymanbagabas/go-udiff v0.2.0 h1:TK0fH4MteXUDspT88n8CKzvK0X9O2xu9yQjWpi6yML8=
@@ -128,6 +128,8 @@ github.com/distribution/reference v0.6.0 h1:0IXCQ5g4/QMHHkarYzh5l+u8T3t73zM5Qvfr
github.com/distribution/reference v0.6.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E= github.com/distribution/reference v0.6.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E=
github.com/dlclark/regexp2 v1.11.0 h1:G/nrcoOa7ZXlpoa/91N3X7mM3r8eIlMBBJZvsz/mxKI= github.com/dlclark/regexp2 v1.11.0 h1:G/nrcoOa7ZXlpoa/91N3X7mM3r8eIlMBBJZvsz/mxKI=
github.com/dlclark/regexp2 v1.11.0/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8= github.com/dlclark/regexp2 v1.11.0/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
github.com/dnaeon/go-vcr v1.2.0 h1:zHCHvJYTMh1N7xnV7zf1m1GPBF9Ad0Jk/whtQ1663qI=
github.com/dnaeon/go-vcr v1.2.0/go.mod h1:R4UdLID7HZT3taECzJs4YgbbH6PIGXB6W/sc5OLb6RQ=
github.com/docker/cli v29.0.3+incompatible h1:8J+PZIcF2xLd6h5sHPsp5pvvJA+Sr2wGQxHkRl53a1E= github.com/docker/cli v29.0.3+incompatible h1:8J+PZIcF2xLd6h5sHPsp5pvvJA+Sr2wGQxHkRl53a1E=
github.com/docker/cli v29.0.3+incompatible/go.mod h1:JLrzqnKDaYBop7H2jaqPtU4hHvMKP+vjCwu2uszcLI8= github.com/docker/cli v29.0.3+incompatible/go.mod h1:JLrzqnKDaYBop7H2jaqPtU4hHvMKP+vjCwu2uszcLI8=
github.com/docker/distribution v2.8.3+incompatible h1:AtKxIZ36LoNK51+Z6RpzLpddBirtxJnzDrHLEKxTAYk= github.com/docker/distribution v2.8.3+incompatible h1:AtKxIZ36LoNK51+Z6RpzLpddBirtxJnzDrHLEKxTAYk=
@@ -218,10 +220,10 @@ github.com/go-task/slim-sprig/v3 v3.0.0 h1:sUs3vkvUymDpBKi3qH1YSqBQk9+9D/8M2mN1v
github.com/go-task/slim-sprig/v3 v3.0.0/go.mod h1:W848ghGpv3Qj3dhTPRyJypKRiqCdHZiAzKg9hl15HA8= github.com/go-task/slim-sprig/v3 v3.0.0/go.mod h1:W848ghGpv3Qj3dhTPRyJypKRiqCdHZiAzKg9hl15HA8=
github.com/go-text/render v0.2.0 h1:LBYoTmp5jYiJ4NPqDc2pz17MLmA3wHw1dZSVGcOdeAc= github.com/go-text/render v0.2.0 h1:LBYoTmp5jYiJ4NPqDc2pz17MLmA3wHw1dZSVGcOdeAc=
github.com/go-text/render v0.2.0/go.mod h1:CkiqfukRGKJA5vZZISkjSYrcdtgKQWRa2HIzvwNN5SU= github.com/go-text/render v0.2.0/go.mod h1:CkiqfukRGKJA5vZZISkjSYrcdtgKQWRa2HIzvwNN5SU=
github.com/go-text/typesetting v0.2.1 h1:x0jMOGyO3d1qFAPI0j4GSsh7M0Q3Ypjzr4+CEVg82V8= github.com/go-text/typesetting v0.3.3 h1:ihGNJU9KzdK2QRDy1Bm7FT5RFQoYb+3n3EIhI/4eaQc=
github.com/go-text/typesetting v0.2.1/go.mod h1:mTOxEwasOFpAMBjEQDhdWRckoLLeI/+qrQeBCTGEt6M= github.com/go-text/typesetting v0.3.3/go.mod h1:vIRUT25mLQaSh4C8H/lIsKppQz/Gdb8Pu/tNwpi52ts=
github.com/go-text/typesetting-utils v0.0.0-20241103174707-87a29e9e6066 h1:qCuYC+94v2xrb1PoS4NIDe7DGYtLnU2wWiQe9a1B1c0= github.com/go-text/typesetting-utils v0.0.0-20250618110550-c820a94c77b8 h1:4KCscI9qYWMGTuz6BpJtbUSRzcBrUSSE0ENMJbNSrFs=
github.com/go-text/typesetting-utils v0.0.0-20241103174707-87a29e9e6066/go.mod h1:DDxDdQEnB70R8owOx3LVpEFvpMK9eeH1o2r0yZhFI9o= github.com/go-text/typesetting-utils v0.0.0-20250618110550-c820a94c77b8/go.mod h1:3/62I4La/HBRX9TcTpBj4eipLiwzf+vhI+7whTc9V7o=
github.com/go-yaml/yaml v2.1.0+incompatible/go.mod h1:w2MrLa16VYP0jy6N7M5kHaCkaLENm+P+Tv+MfurjSw0= github.com/go-yaml/yaml v2.1.0+incompatible/go.mod h1:w2MrLa16VYP0jy6N7M5kHaCkaLENm+P+Tv+MfurjSw0=
github.com/goccy/go-yaml v1.18.0 h1:8W7wMFS12Pcas7KU+VVkaiCng+kG8QiFeFwzFb+rwuw= github.com/goccy/go-yaml v1.18.0 h1:8W7wMFS12Pcas7KU+VVkaiCng+kG8QiFeFwzFb+rwuw=
github.com/goccy/go-yaml v1.18.0/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA= github.com/goccy/go-yaml v1.18.0/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA=
@@ -294,8 +296,8 @@ github.com/gorilla/css v1.0.1 h1:ntNaBIghp6JmvWnxbZKANoLyuXTPZ4cAMlo6RyhlbO8=
github.com/gorilla/css v1.0.1/go.mod h1:BvnYkspnSzMmwRK+b8/xgNPLiIuNZr6vbZBTPQ2A3b0= github.com/gorilla/css v1.0.1/go.mod h1:BvnYkspnSzMmwRK+b8/xgNPLiIuNZr6vbZBTPQ2A3b0=
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg= github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE= github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
github.com/gpustack/gguf-parser-go v0.23.1 h1:0U7DOrsi7ryx2L/dlMy+BSQ5bJV4AuMEIgGBs4RK46A= github.com/gpustack/gguf-parser-go v0.24.0 h1:tdJceXYp9e5RhE9RwVYIuUpir72Jz2D68NEtDXkKCKc=
github.com/gpustack/gguf-parser-go v0.23.1/go.mod h1:y4TwTtDqFWTK+xvprOjRUh+dowgU2TKCX37vRKvGiZ0= github.com/gpustack/gguf-parser-go v0.24.0/go.mod h1:y4TwTtDqFWTK+xvprOjRUh+dowgU2TKCX37vRKvGiZ0=
github.com/gregjones/httpcache v0.0.0-20180305231024-9cad4c3443a7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA= github.com/gregjones/httpcache v0.0.0-20180305231024-9cad4c3443a7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA=
github.com/grpc-ecosystem/grpc-gateway v1.5.0/go.mod h1:RSKVYQBd5MCa4OVpNdGskqpgL2+G+NZTnrVHpWWfpdw= github.com/grpc-ecosystem/grpc-gateway v1.5.0/go.mod h1:RSKVYQBd5MCa4OVpNdGskqpgL2+G+NZTnrVHpWWfpdw=
github.com/grpc-ecosystem/grpc-gateway v1.16.0 h1:gmcG1KaJ57LophUzW0Hy8NmPhnMZb4M0+kPpLofRdBo= github.com/grpc-ecosystem/grpc-gateway v1.16.0 h1:gmcG1KaJ57LophUzW0Hy8NmPhnMZb4M0+kPpLofRdBo=
@@ -509,10 +511,8 @@ github.com/morikuni/aec v1.0.0/go.mod h1:BbKIizmSmc5MMPqRYbxO4ZU0S0+P200+tUnFx7P
github.com/mr-tron/base58 v1.1.2/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc= github.com/mr-tron/base58 v1.1.2/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc=
github.com/mr-tron/base58 v1.2.0 h1:T/HDJBh4ZCPbU39/+c3rRvE0uKBQlU27+QI8LJ4t64o= github.com/mr-tron/base58 v1.2.0 h1:T/HDJBh4ZCPbU39/+c3rRvE0uKBQlU27+QI8LJ4t64o=
github.com/mr-tron/base58 v1.2.0/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc= github.com/mr-tron/base58 v1.2.0/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc=
github.com/mudler/cogito v0.8.2-0.20260214201734-da0d4ceb2b44 h1:joGszpItINnZdoL/0p2077Wz2xnxMGRSRgYN5mS7I4c= github.com/mudler/cogito v0.9.1 h1:6y7VPHSS+Q+v4slV42XcjykN5wip4N7C/rXTwWPBVFM=
github.com/mudler/cogito v0.8.2-0.20260214201734-da0d4ceb2b44/go.mod h1:6sfja3lcu2nWRzEc0wwqGNu/eCG3EWgij+8s7xyUeQ4= github.com/mudler/cogito v0.9.1/go.mod h1:6sfja3lcu2nWRzEc0wwqGNu/eCG3EWgij+8s7xyUeQ4=
github.com/mudler/cogito v0.9.1-0.20260217143801-bb7f986ed2c7 h1:z3AcM7LbaQb+C955JdSXksHB9B0uWGQpdgl05gJM+9Y=
github.com/mudler/cogito v0.9.1-0.20260217143801-bb7f986ed2c7/go.mod h1:6sfja3lcu2nWRzEc0wwqGNu/eCG3EWgij+8s7xyUeQ4=
github.com/mudler/edgevpn v0.31.1 h1:7qegiDWd0kAg6ljhNHxqvp8hbo/6BbzSdbb7/2WZfiY= github.com/mudler/edgevpn v0.31.1 h1:7qegiDWd0kAg6ljhNHxqvp8hbo/6BbzSdbb7/2WZfiY=
github.com/mudler/edgevpn v0.31.1/go.mod h1:ftV5B0nKFzm4R8vR80UYnCb2nf7lxCRgAALxUEEgCf8= github.com/mudler/edgevpn v0.31.1/go.mod h1:ftV5B0nKFzm4R8vR80UYnCb2nf7lxCRgAALxUEEgCf8=
github.com/mudler/go-piper v0.0.0-20241023091659-2494246fd9fc h1:RxwneJl1VgvikiX28EkpdAyL4yQVnJMrbquKospjHyA= github.com/mudler/go-piper v0.0.0-20241023091659-2494246fd9fc h1:RxwneJl1VgvikiX28EkpdAyL4yQVnJMrbquKospjHyA=