mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-07 00:06:51 -04:00
* feat(crispasr): backend source files (Go gRPC server, C-ABI shim, build files) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * polish(crispasr): brand error strings + fix stale shim comment Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * build(crispasr): register backend in root Makefile Mirror the whisper Go backend registration for the new crispasr backend: NOTPARALLEL entry, prepare-test-extra/test-extra hooks, BACKEND_CRISPASR definition, docker-build target generation, and the docker-build-backends aggregate target. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(crispasr): add backend build matrix entries Mirror the 11 whisper golang Dockerfile matrix entries (CPU amd64/arm64, CUDA 12/13, L4T CUDA 13, Intel SYCL f32/f16, Vulkan amd64/arm64, L4T arm64, ROCm hipblas) with backend and tag-suffix substituted to crispasr. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): add crispasr backend gallery entries Add the crispasr meta anchor and its full set of image gallery entries (cpu, metal, cuda12/13, rocm, intel-sycl f32/f16, vulkan, L4T arm64, L4T cuda13 arm64, plus -development variants), mirroring the whisper backend gallery block. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(crispasr): bump CRISPASR_VERSION via bump_deps workflow Track CrispStrobe/CrispASR main branch and bump CRISPASR_VERSION in backend/go/crispasr/Makefile. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * build(crispasr): don't wire fixture-gated test into test-extra Mirror the whisper Go backend: its AudioTranscription test is gated on model/audio fixtures and skips in CI, so building crispasr (the heaviest ggml compile in the tree) inside the unit-test lane adds a long compile for zero coverage. The backend image build in backend-matrix.yml remains the authoritative compile check. 
Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(crispasr): add darwin metal build entry (mirror whisper) The metal-crispasr gallery entries and capabilities.metal mapping reference -metal-darwin-arm64-crispasr, which is only produced by an includeDarwin entry. Mirror whisper's darwin metal entry so the tag actually gets built. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(crispasr): place hipblas matrix entry next to whisper twin Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): register crispasr as pref-only ASR backend + test Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(crispasr): port whisper behavioral suite (cancellation + streaming) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(crispasr): fix skip message env var names to CRISPASR_* Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): switch shim to crispasr_session_* multi-architecture API The shim used whisper_full(), which in CrispASR is the whisper-only path: libcrispasr only transcribes Whisper GGUFs through it. Multi-architecture transcription (Parakeet, Voxtral, Qwen3-ASR, Canary, Granite, FunASR, Paraformer, SenseVoice, ...) goes through the crispasr_session_* C-ABI, which auto-detects the architecture from the GGUF and dispatches to the matching backend. Rewrite the C shim around crispasr_session_open / _transcribe_lang / _result_* and add get_backend() so the selected backend is logged. load_model now takes a threads param (session_open binds n_threads at open). The session result is segment+word based with no token IDs and no per-decode callback, so drop n_tokens / get_token_id / get_segment_speaker_turn_next / set_new_segment_callback. set_abort is kept for API parity but is best-effort: the session transcribe is blocking with no abort hook. 
Update the purego bindings and gocrispasr.go to match: tokens are left empty, speaker-turn handling is removed, and AudioTranscriptionStream emits one delta per non-empty segment after the blocking decode returns (no progressive streaming via the session API), preserving the concat(deltas) == final.Text invariant. crispasr_session_set_translate is exported by libcrispasr but not declared in crispasr.h, so it is forward-declared in the shim alongside the open/transcribe/result functions. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * build(crispasr): link full CrispASR backend set for multi-arch support The shim's crispasr_session_* dispatch calls into the per-architecture backend libs (parakeet, voxtral, qwen3_asr, canary, funasr, paraformer, sensevoice, ...), which CrispASR builds as static archives. Linking only crispasr + ggml dead-stripped every backend object from the final module (nm backend-symbol count: 0), leaving a whisper-only .so. Link the same backend set as crispasr-cli so the static archives are pulled in. After this the module carries the backend symbols (nm count 407, .so grows from ~2.1MB to ~6.7MB) and the session API can dispatch to every compiled-in architecture. Also rewrite ${CMAKE_SOURCE_DIR}/examples/talk-llama to ${PROJECT_SOURCE_DIR}/... in the vendored src/CMakeLists.txt: CrispASR locates its vendored llama.cpp via ${CMAKE_SOURCE_DIR}, which is wrong when CrispASR is add_subdirectory'd (CMAKE_SOURCE_DIR points at this backend dir, not the CrispASR root). PROJECT_SOURCE_DIR is correct both standalone and as a subproject; the sed is idempotent. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(crispasr): adapt suite to session API (blocking, no decode callback) Register the new symbol set (drop the removed token/speaker/callback funcs, add get_backend; load_model now takes 2 args). 
The session transcribe is blocking with no abort hook, so a mid-decode cancel can't interrupt it: change the cancellation spec to cancel the context before the call and assert codes.Canceled from the pre-call ctx.Err() check, dropping the <5s mid-decode timing assertion. The streaming spec still holds with per-segment post-decode emission (>=2 deltas, concat(deltas) == final.Text). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): add CrispASR ASR model entries (-crispasr) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(gallery): keep only session-auto-detectable CrispASR ASR models The crispasr backend loads models via crispasr_session_open, which auto-detects the backend from the GGUF general.architecture using crispasr_detect_backend_from_gguf. Architectures not in that detect map cannot be opened, so those gallery entries fail to load. Removed entries whose architecture is not wired into CrispASR v0.6.11's session auto-detect router (they can be re-added when upstream maps them): - Not in the detect map: data2vec, firered-asr, funasr, fun-asr-mlt-nano, glm-asr, hubert, kyutai-stt, mega-asr, mimo-asr, moonshine{,-de,-streaming,-tiny-de}, omniasr{,-llm,-llm-1b}, paraformer, sensevoice. - Pending verification (filename-heuristic routed, not arch-detected): parakeet-ctc-0.6b, parakeet-ctc-1.1b. Their GGUFs are routed to the fastconformer-ctc backend by a filename heuristic in the model registry, which implies general.architecture is not a mapped string. Kept the parakeet rnnt/tdt_ctc variants: convert-parakeet-to-gguf.py writes general.architecture="parakeet" unconditionally and encodes the rnnt/ctc distinction in metadata fields, so they session-auto-detect. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): TTS synthesis via crispasr_session_synthesize (24kHz) Add tts_synthesize/tts_free/tts_set_voice to the C-ABI shim. 
They reuse the already-open g_session (crispasr_session_open auto-detects a TTS model) and dispatch to the upstream synthesis call, which returns malloc'd 24 kHz mono float PCM. Orpheus needs a SNAC codec path that we do not set, so it returns NULL here and surfaces as an error Go-side. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): implement TTS/TTSStream gRPC methods Bind the new shim functions via purego and implement TTS, TTSStream and a writeWAV24k helper. synthesize copies the C-owned PCM out before freeing it; TTS writes a 24 kHz mono 16-bit WAV to req.Dst via go-audio/wav. CrispASR has no progressive synth, so TTSStream synthesizes fully, encodes to WAV, and emits the bytes as a single chunk; it owns the results-channel close (the gRPC server wrapper ranges until close), mirroring vibevoice-cpp's TTSStream. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): log when a TTS voice override is not honored Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): add CrispASR vibevoice-tts model entry Only vibevoice-tts works through the current shim: qwen3-tts, chatterbox, and orpheus require companion codec/s3gen/SNAC paths (set_codec_path / set_s3gen_path) that the shim doesn't wire yet, and kokoro/indextts/voxcpm2 aren't in the session auto-detect map. Those are follow-ups. 
Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(crispasr): gated TTS synthesis spec Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(crispasr): satisfy golangci-lint (errcheck defers + unsafeptr nolint) The crispasr Go file is entirely new, so new-from-merge-base lints every line (unlike the grandfathered whisper backend it was forked from): - handle os.RemoveAll / fh.Close return values in AudioTranscription - annotate the two intentional C-pointer unsafe.Slice sites with //nolint:govet Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): backend: and codec: model options (explicit arch + companion files) Add two model-config options to the CrispASR backend via opts.Options: - backend:<name> selects an explicit CrispASR backend (bypassing auto-detect) by routing load_model through crispasr_session_open_explicit, unlocking architectures the detector won't pick on its own (qwen3, cohere, granite, voxtral, moonshine, mimo-asr, orpheus, kokoro, chatterbox, etc.). - codec:<path> loads a companion file (qwen3-tts codec, orpheus SNAC, chatterbox s3gen, or mimo-asr tokenizer) via the universal crispasr_session_set_codec_path setter after the session opens. A relative path resolves against the model directory. rc==0 means success or not-applicable; only a negative rc is fatal. The C shim load_model gains a backend_name argument and a new set_codec_path entry point; the Go bridge parses the prefix:value options and registers the new symbol. The vad_only path is unchanged. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): expand CrispASR models via backend:/codec: options (explicit arch + companions) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(gallery): use virtual.yaml base for crispasr models The crispasr entries are just backend + model + a couple options, fully expressed inline via overrides:/files: in gallery/index.yaml. 
Point each url: at the shared gallery/virtual.yaml (the established 'virtual' model trick) and drop the 36 redundant per-model gallery/*-crispasr.yaml files. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(gallery): drop voice-requiring TTS entries (keep vibevoice-tts) Real e2e showed qwen3-tts/orpheus/chatterbox don't synthesize through the current shim: the codec: companion loads fine, but these engines additionally need a voice pack / voice prompt / reference clip (qwen3-tts base errors 'no voice'; chatterbox is zero-shot cloning; orpheus uses named voices) that the backend doesn't wire. (qwen3-tts also can't auto-detect: its GGUF arch is 'qwen3tts', unmapped by the detector — would need backend:qwen3-tts.) Removed to avoid shipping non-working gallery entries; vibevoice-tts (built-in voice, e2e-verified) remains the working TTS. Voice-pack wiring is a follow-up. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(crispasr): speaker: and voice: TTS options (baked speakers + voice packs/prompts) speaker:<name> -> crispasr_session_set_speaker_name (baked speakers: qwen3-tts CustomVoice, orpheus). voice:<path>(+voice_text:<ref>) -> crispasr_session_set_voice (voice-pack GGUF, or WAV zero-shot clone with ref text). Applied at Load as the default voice; req.Voice still overrides the speaker per request. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): re-add e2e-verified TTS engines (chatterbox, qwen3-tts-customvoice, orpheus) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
412 lines
15 KiB
Go
package localai
|
|
|
|
import (
|
|
"encoding/json"
|
|
"fmt"
|
|
"sort"
|
|
|
|
"github.com/google/uuid"
|
|
"github.com/labstack/echo/v4"
|
|
"github.com/mudler/LocalAI/core/config"
|
|
"github.com/mudler/LocalAI/core/gallery"
|
|
"github.com/mudler/LocalAI/core/gallery/importers"
|
|
"github.com/mudler/LocalAI/core/http/middleware"
|
|
"github.com/mudler/LocalAI/core/schema"
|
|
"github.com/mudler/LocalAI/core/services/galleryop"
|
|
"github.com/mudler/LocalAI/pkg/system"
|
|
"github.com/mudler/xlog"
|
|
)
|
|
|
|
// knownPrefOnlyBackends lists backends that have no dedicated importer
|
|
// and no importer-hosted drop-in entry, but users may still pick them
|
|
// via the preference-only path. Edit this slice to add new pref-only
|
|
// backends that should appear in the import form dropdown.
|
|
var knownPrefOnlyBackends = []schema.KnownBackend{
|
|
// Text LLM
|
|
// ds4: antirez/ds4 - single-model DeepSeek V4 Flash engine; auto-detected via DS4Importer
|
|
{Name: "ds4", Modality: "text", AutoDetect: false, Description: "antirez/ds4 DeepSeek V4 Flash engine (auto-detected; pref-only fallback)"},
|
|
{Name: "sglang", Modality: "text", AutoDetect: false, Description: "SGLang runtime (preference-only)"},
|
|
{Name: "tinygrad", Modality: "text", AutoDetect: false, Description: "tinygrad runtime (preference-only)"},
|
|
{Name: "trl", Modality: "text", AutoDetect: false, Description: "Transformers Reinforcement Learning (preference-only)"},
|
|
{Name: "mlx-vlm", Modality: "text", AutoDetect: false, Description: "MLX vision-language models (preference-only)"},
|
|
// ASR
|
|
{Name: "whisperx", Modality: "asr", AutoDetect: false, Description: "WhisperX transcription (preference-only)"},
|
|
{Name: "crispasr", Modality: "asr", AutoDetect: false, Description: "CrispASR multi-architecture transcription (preference-only)"},
|
|
// TTS
|
|
{Name: "kokoros", Modality: "tts", AutoDetect: false, Description: "Kokoros TTS (preference-only)"},
|
|
{Name: "qwen-tts", Modality: "tts", AutoDetect: false, Description: "Qwen TTS (preference-only)"},
|
|
{Name: "qwen3-tts-cpp", Modality: "tts", AutoDetect: false, Description: "Qwen3 TTS C++ (preference-only)"},
|
|
{Name: "faster-qwen3-tts", Modality: "tts", AutoDetect: false, Description: "Faster Qwen3 TTS (preference-only)"},
|
|
// Detection
|
|
{Name: "sam3-cpp", Modality: "detection", AutoDetect: false, Description: "SAM3 C++ object detection (preference-only)"},
|
|
// Audio transform (audio-in / audio-out, optional reference signal)
|
|
{Name: "localvqe", Modality: "audio-transform", AutoDetect: false, Description: "LocalVQE C++ joint AEC + noise suppression + dereverberation (preference-only)"},
|
|
}
|
|
|
|
// UpgradeInfoProvider is an interface for querying cached backend upgrade information.
|
|
type UpgradeInfoProvider interface {
|
|
GetAvailableUpgrades() map[string]gallery.UpgradeInfo
|
|
TriggerCheck()
|
|
}
|
|
|
|
type BackendEndpointService struct {
|
|
galleries []config.Gallery
|
|
backendPath string
|
|
backendSystemPath string
|
|
backendApplier *galleryop.GalleryService
|
|
upgradeChecker UpgradeInfoProvider
|
|
}
|
|
|
|
type GalleryBackend struct {
|
|
ID string `json:"id"`
|
|
}
|
|
|
|
func CreateBackendEndpointService(galleries []config.Gallery, systemState *system.SystemState, backendApplier *galleryop.GalleryService, upgradeChecker UpgradeInfoProvider) BackendEndpointService {
|
|
return BackendEndpointService{
|
|
galleries: galleries,
|
|
backendPath: systemState.Backend.BackendsPath,
|
|
backendSystemPath: systemState.Backend.BackendsSystemPath,
|
|
backendApplier: backendApplier,
|
|
upgradeChecker: upgradeChecker,
|
|
}
|
|
}
|
|
|
|
// GetOpStatusEndpoint returns the job status
|
|
// @Summary Returns the job status
|
|
// @Tags backends
|
|
// @Success 200 {object} galleryop.OpStatus "Response"
|
|
// @Router /backends/jobs/{uuid} [get]
|
|
func (mgs *BackendEndpointService) GetOpStatusEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
status := mgs.backendApplier.GetStatus(c.Param("uuid"))
|
|
if status == nil {
|
|
return fmt.Errorf("could not find any status for ID")
|
|
}
|
|
return c.JSON(200, status)
|
|
}
|
|
}
|
|
|
|
// GetAllStatusEndpoint returns all the jobs status progress
|
|
// @Summary Returns all the jobs status progress
|
|
// @Tags backends
|
|
// @Success 200 {object} map[string]galleryop.OpStatus "Response"
|
|
// @Router /backends/jobs [get]
|
|
func (mgs *BackendEndpointService) GetAllStatusEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
return c.JSON(200, mgs.backendApplier.GetAllStatus())
|
|
}
|
|
}
|
|
|
|
// ApplyBackendEndpoint installs a new backend to a LocalAI instance
|
|
// @Summary Install backends to LocalAI.
|
|
// @Tags backends
|
|
// @Param request body GalleryBackend true "query params"
|
|
// @Success 200 {object} schema.BackendResponse "Response"
|
|
// @Router /backends/apply [post]
|
|
func (mgs *BackendEndpointService) ApplyBackendEndpoint(systemState *system.SystemState) echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
input := new(GalleryBackend)
|
|
// Get input data from the request body
|
|
if err := c.Bind(input); err != nil {
|
|
return err
|
|
}
|
|
|
|
// In distributed mode, refuse to fan out a hardware-specific build to
|
|
// every node — a CPU build landing on a GPU cluster is almost always
|
|
// wrong, and the silent footgun is exactly what this guard exists for.
|
|
// Auto-resolving (meta) backends are fine because each node picks its
|
|
// own variant. Tooling can recover by hitting
|
|
// POST /api/nodes/{id}/backends/install per target node.
|
|
if mgs.backendApplier.BackendManager().IsDistributed() && input.ID != "" {
|
|
if guard := concreteFanOutGuard(c, mgs.galleries, systemState, input.ID); guard != nil {
|
|
return guard
|
|
}
|
|
}
|
|
|
|
uuid, err := uuid.NewUUID()
|
|
if err != nil {
|
|
return err
|
|
}
|
|
mgs.backendApplier.BackendGalleryChannel <- galleryop.ManagementOp[gallery.GalleryBackend, any]{
|
|
ID: uuid.String(),
|
|
GalleryElementName: input.ID,
|
|
Galleries: mgs.galleries,
|
|
}
|
|
|
|
return c.JSON(200, schema.BackendResponse{ID: uuid.String(), StatusURL: fmt.Sprintf("%sbackends/jobs/%s", middleware.BaseURL(c), uuid.String())})
|
|
}
|
|
}
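The comment above says tooling can recover from the distributed-mode guard by installing per node. A minimal sketch of how a client might interpret the guard's 409 body follows, using only the `error`, `code`, and `meta_alternative` fields that `concreteFanOutGuard` emits. The `guardResponse` type, the `nextStep` helper, and the returned strings are hypothetical names for illustration, not part of LocalAI.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// guardResponse mirrors the JSON body returned with HTTP 409 by the guard.
// The struct itself is an assumption made for this sketch.
type guardResponse struct {
	Error           string `json:"error"`
	Code            string `json:"code"`
	MetaAlternative string `json:"meta_alternative"`
}

// nextStep is a hypothetical helper: given the status code and body of a
// POST /backends/apply response, it suggests what tooling could do next.
func nextStep(status int, body []byte) (string, error) {
	if status != 409 {
		return "proceed", nil
	}
	var g guardResponse
	if err := json.Unmarshal(body, &g); err != nil {
		return "", err
	}
	if g.Code != "concrete_backend_requires_target" {
		return "", fmt.Errorf("unexpected conflict code %q", g.Code)
	}
	if g.MetaAlternative != "" {
		// An auto-resolving parent exists; each node picks its own variant.
		return "retry with " + g.MetaAlternative, nil
	}
	// No meta parent: install on specific nodes instead.
	return "install per node", nil
}

func main() {
	body := []byte(`{"error":"...","code":"concrete_backend_requires_target","meta_alternative":"crispasr"}`)
	step, err := nextStep(409, body)
	if err != nil {
		panic(err)
	}
	fmt.Println(step)
}
```

Branching on `code` rather than on the human-readable `error` text follows the contract stated in the guard's doc comment.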
|
|
|
|
// concreteFanOutGuard returns a 409 response if the requested backend is a
|
|
// hardware-specific build (not auto-resolving / meta) and we are in
|
|
// distributed mode. It looks up the backend in the configured galleries; if
|
|
// the lookup itself fails (gallery unreachable, name not found), the guard
|
|
// stays out of the way and lets the install enqueue normally — a missing
|
|
// name will surface from the worker as a clearer error than the guard could
|
|
// produce here. The response body deliberately speaks human, with `code` and
|
|
// `meta_alternative` as the programmatic contract for tooling.
|
|
func concreteFanOutGuard(c echo.Context, galleries []config.Gallery, systemState *system.SystemState, backendID string) error {
|
|
// Use the unfiltered listing because in distributed mode the frontend's
|
|
// hardware is irrelevant — the install targets workers, not us — and the
|
|
// filtered list would hide variants that don't match the frontend host
|
|
// (e.g. a CUDA build on a CPU-only frontend), preventing the guard from
|
|
// firing for exactly the cases it's meant to protect against.
|
|
available, err := gallery.AvailableBackendsUnfiltered(galleries, systemState)
|
|
if err != nil {
|
|
return nil
|
|
}
|
|
requested := available.FindByName(backendID)
|
|
if requested == nil || requested.IsMeta() {
|
|
return nil
|
|
}
|
|
|
|
// Try to find an auto-resolving (meta) backend that has this concrete
|
|
// variant in its CapabilitiesMap, so we can suggest it as a one-shot
|
|
// alternative. Optional — empty string is fine if no parent exists.
|
|
metaAlternative := ""
|
|
for _, b := range available {
|
|
if !b.IsMeta() {
|
|
continue
|
|
}
|
|
for _, concrete := range b.CapabilitiesMap {
|
|
if concrete == backendID {
|
|
metaAlternative = b.Name
|
|
break
|
|
}
|
|
}
|
|
if metaAlternative != "" {
|
|
break
|
|
}
|
|
}
|
|
|
|
msg := fmt.Sprintf(
|
|
"Backend %q is a hardware-specific build and won't run correctly on every node in this cluster. In distributed mode, install it on specific nodes:\n\n POST /api/nodes/{node_id}/backends/install\n {\"backend\": %q}",
|
|
backendID, backendID,
|
|
)
|
|
if metaAlternative != "" {
|
|
msg += fmt.Sprintf(
|
|
"\n\nTo install across all nodes, use the auto-resolving backend %q — each node picks its own variant based on its hardware.",
|
|
metaAlternative,
|
|
)
|
|
}
|
|
|
|
return c.JSON(409, map[string]any{
|
|
"error": msg,
|
|
"code": "concrete_backend_requires_target",
|
|
"meta_alternative": metaAlternative,
|
|
})
|
|
}
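The meta-alternative lookup in the guard can be sketched in isolation as below. The `Backend` struct and its `IsMeta` rule (a non-empty capabilities map) are assumptions standing in for the real gallery type, and the backend names in `main` are made up for the example.

```go
package main

import "fmt"

// Backend is a hypothetical stand-in for the gallery backend type.
type Backend struct {
	Name string
	// CapabilitiesMap maps a capability (e.g. "metal") to a concrete backend name.
	CapabilitiesMap map[string]string
}

// IsMeta treats any backend with a capabilities map as auto-resolving.
// This rule is an assumption of the sketch.
func (b Backend) IsMeta() bool { return len(b.CapabilitiesMap) > 0 }

// findMetaAlternative returns the name of the first meta backend whose
// capabilities map points at backendID, or "" when no parent exists.
func findMetaAlternative(available []Backend, backendID string) string {
	for _, b := range available {
		if !b.IsMeta() {
			continue
		}
		for _, concrete := range b.CapabilitiesMap {
			if concrete == backendID {
				return b.Name
			}
		}
	}
	return ""
}

func main() {
	available := []Backend{
		{Name: "cpu-example"},
		{Name: "example", CapabilitiesMap: map[string]string{
			"default": "cpu-example",
			"metal":   "metal-example",
		}},
	}
	fmt.Printf("%q\n", findMetaAlternative(available, "metal-example"))
	fmt.Printf("%q\n", findMetaAlternative(available, "unknown"))
}
```

Returning from the helper on the first match replaces the two nested `break` statements of the inline loop without changing the result.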
|
|
|
|
// DeleteBackendEndpoint deletes a backend from a LocalAI instance
|
|
// @Summary delete backends from LocalAI.
|
|
// @Tags backends
|
|
// @Param name path string true "Backend name"
|
|
// @Success 200 {object} schema.BackendResponse "Response"
|
|
// @Router /backends/delete/{name} [post]
|
|
func (mgs *BackendEndpointService) DeleteBackendEndpoint() echo.HandlerFunc {
	return func(c echo.Context) error {
		backendName := c.Param("name")

		// Generate the job ID before enqueueing so the operation carries the
		// same ID that the returned StatusURL points at.
		uuid, err := uuid.NewUUID()
		if err != nil {
			return err
		}

		mgs.backendApplier.BackendGalleryChannel <- galleryop.ManagementOp[gallery.GalleryBackend, any]{
			ID:                 uuid.String(),
			Delete:             true,
			GalleryElementName: backendName,
			Galleries:          mgs.galleries,
		}

		return c.JSON(200, schema.BackendResponse{ID: uuid.String(), StatusURL: fmt.Sprintf("%sbackends/jobs/%s", middleware.BaseURL(c), uuid.String())})
	}
}
|
|
|
|
// ListBackendsEndpoint lists the backends installed in LocalAI
|
|
// @Summary List all Backends
|
|
// @Tags backends
|
|
// @Success 200 {object} []gallery.GalleryBackend "Response"
|
|
// @Router /backends [get]
|
|
func (mgs *BackendEndpointService) ListBackendsEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
backends, err := mgs.backendApplier.ListBackends()
|
|
if err != nil {
|
|
return err
|
|
}
|
|
return c.JSON(200, backends.GetAll())
|
|
}
|
|
}
|
|
|
|
// ListBackendGalleriesEndpoint lists the backend galleries configured in LocalAI
|
|
// @Summary List all Galleries
|
|
// @Tags backends
|
|
// @Success 200 {object} []config.Gallery "Response"
|
|
// @Router /backends/galleries [get]
|
|
// NOTE: This is different from (and much simpler than) the endpoints above: it lists only the backend galleries that have been configured, not their contents.
|
|
func (mgs *BackendEndpointService) ListBackendGalleriesEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
xlog.Debug("Listing backend galleries", "galleries", mgs.galleries)
|
|
dat, err := json.Marshal(mgs.galleries)
|
|
if err != nil {
|
|
return err
|
|
}
|
|
return c.Blob(200, "application/json", dat)
|
|
}
|
|
}
|
|
|
|
// GetUpgradesEndpoint returns the cached backend upgrade information
|
|
// @Summary Get available backend upgrades
|
|
// @Tags backends
|
|
// @Success 200 {object} map[string]gallery.UpgradeInfo "Response"
|
|
// @Router /backends/upgrades [get]
|
|
func (mgs *BackendEndpointService) GetUpgradesEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
if mgs.upgradeChecker == nil {
|
|
return c.JSON(200, map[string]gallery.UpgradeInfo{})
|
|
}
|
|
return c.JSON(200, mgs.upgradeChecker.GetAvailableUpgrades())
|
|
}
|
|
}
|
|
|
|
// CheckUpgradesEndpoint forces an immediate upgrade check
|
|
// @Summary Force backend upgrade check
|
|
// @Tags backends
|
|
// @Success 200 {object} map[string]gallery.UpgradeInfo "Response"
|
|
// @Router /backends/upgrades/check [post]
|
|
func (mgs *BackendEndpointService) CheckUpgradesEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
if mgs.upgradeChecker == nil {
|
|
return c.JSON(200, map[string]gallery.UpgradeInfo{})
|
|
}
|
|
mgs.upgradeChecker.TriggerCheck()
|
|
// Return current cached results (the triggered check runs async)
|
|
return c.JSON(200, mgs.upgradeChecker.GetAvailableUpgrades())
|
|
}
|
|
}
|
|
|
|
// UpgradeBackendEndpoint triggers an upgrade for a specific backend
|
|
// @Summary Upgrade a backend
|
|
// @Tags backends
|
|
// @Param name path string true "Backend name"
|
|
// @Success 200 {object} schema.BackendResponse "Response"
|
|
// @Router /backends/upgrade/{name} [post]
|
|
func (mgs *BackendEndpointService) UpgradeBackendEndpoint() echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
backendName := c.Param("name")
|
|
|
|
uuid, err := uuid.NewUUID()
|
|
if err != nil {
|
|
return err
|
|
}
|
|
|
|
mgs.backendApplier.BackendGalleryChannel <- galleryop.ManagementOp[gallery.GalleryBackend, any]{
|
|
ID: uuid.String(),
|
|
GalleryElementName: backendName,
|
|
Galleries: mgs.galleries,
|
|
Upgrade: true,
|
|
}
|
|
|
|
return c.JSON(200, schema.BackendResponse{ID: uuid.String(), StatusURL: fmt.Sprintf("%sbackends/jobs/%s", middleware.BaseURL(c), uuid.String())})
|
|
}
|
|
}
|
|
|
|
// ListAvailableBackendsEndpoint lists the backends available in the galleries configured in LocalAI
|
|
// @Summary List all available Backends
|
|
// @Tags backends
|
|
// @Success 200 {object} []gallery.GalleryBackend "Response"
|
|
// @Router /backends/available [get]
|
|
func (mgs *BackendEndpointService) ListAvailableBackendsEndpoint(systemState *system.SystemState) echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
backends, err := gallery.AvailableBackends(mgs.galleries, systemState)
|
|
if err != nil {
|
|
return err
|
|
}
|
|
return c.JSON(200, backends)
|
|
}
|
|
}
|
|
|
|
// ListKnownBackendsEndpoint returns every backend the import system is
|
|
// aware of, regardless of install state or host compatibility. This is
|
|
// the source of truth for the import form dropdown — users may pick a
|
|
// backend that is not yet installed so LocalAI can auto-install it.
|
|
// @Summary List all known Backends (importer registry + curated pref-only + installed-on-disk)
|
|
// @Tags backends
|
|
// @Success 200 {object} []schema.KnownBackend "Response"
|
|
// @Router /backends/known [get]
|
|
func (mgs *BackendEndpointService) ListKnownBackendsEndpoint(systemState *system.SystemState) echo.HandlerFunc {
|
|
return func(c echo.Context) error {
|
|
// byName dedupes entries while preserving "importer wins over
|
|
// pref-only" priority. Insertion order: importers → drop-ins →
|
|
// pref-only → installed-on-disk.
|
|
byName := make(map[string]schema.KnownBackend)
|
|
|
|
for _, imp := range importers.Registry() {
|
|
byName[imp.Name()] = schema.KnownBackend{
|
|
Name: imp.Name(),
|
|
Modality: imp.Modality(),
|
|
AutoDetect: imp.AutoDetects(),
|
|
}
|
|
|
|
if host, ok := imp.(importers.AdditionalBackendsProvider); ok {
|
|
for _, extra := range host.AdditionalBackends() {
|
|
if _, exists := byName[extra.Name]; exists {
|
|
continue
|
|
}
|
|
byName[extra.Name] = schema.KnownBackend{
|
|
Name: extra.Name,
|
|
Modality: extra.Modality,
|
|
AutoDetect: false,
|
|
Description: extra.Description,
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
for _, pref := range knownPrefOnlyBackends {
|
|
if _, exists := byName[pref.Name]; exists {
|
|
continue
|
|
}
|
|
byName[pref.Name] = pref
|
|
}
|
|
|
|
// Surface backends installed on this host and flag them as such.
|
|
// Importer/pref-only entries that are also on disk get Installed=true
|
|
// while keeping their metadata. System-only backends join the map
|
|
// with empty Modality (we can't classify them) and AutoDetect=false
|
|
// because they require an explicit preference.
|
|
if systemState != nil {
|
|
installed, err := gallery.ListSystemBackends(systemState)
|
|
if err != nil {
|
|
xlog.Debug("ListKnownBackendsEndpoint: failed to list installed backends", "error", err)
|
|
} else {
|
|
for name := range installed {
|
|
if entry, exists := byName[name]; exists {
|
|
entry.Installed = true
|
|
byName[name] = entry
|
|
continue
|
|
}
|
|
byName[name] = schema.KnownBackend{
|
|
Name: name,
|
|
Modality: "",
|
|
AutoDetect: false,
|
|
Installed: true,
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
out := make([]schema.KnownBackend, 0, len(byName))
|
|
for _, b := range byName {
|
|
out = append(out, b)
|
|
}
|
|
sort.SliceStable(out, func(i, j int) bool {
|
|
if out[i].Modality != out[j].Modality {
|
|
return out[i].Modality < out[j].Modality
|
|
}
|
|
return out[i].Name < out[j].Name
|
|
})
|
|
|
|
return c.JSON(200, out)
|
|
}
|
|
}
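The merge performed by `ListKnownBackendsEndpoint` (dedupe by name with earlier sources winning, flag installed entries, then sort by modality and name) can be sketched without the HTTP layer. `KnownBackend` here is a trimmed local copy of the fields used above, `mergeKnown` is a hypothetical helper, and the backend names other than `crispasr` are illustrative. One simplification to note: the handler writes importer entries unconditionally, while this sketch applies first-wins to every source.

```go
package main

import (
	"fmt"
	"sort"
)

// KnownBackend is a trimmed stand-in for schema.KnownBackend (assumed shape).
type KnownBackend struct {
	Name      string
	Modality  string
	Installed bool
}

// mergeKnown merges sources in priority order (earlier wins), marks names
// found on disk as installed, and returns a deterministic ordering.
func mergeKnown(installed []string, sources ...[]KnownBackend) []KnownBackend {
	byName := make(map[string]KnownBackend)
	for _, src := range sources {
		for _, b := range src {
			if _, exists := byName[b.Name]; exists {
				continue // a higher-priority source already claimed this name
			}
			byName[b.Name] = b
		}
	}
	for _, name := range installed {
		entry, exists := byName[name]
		if !exists {
			// On-disk only: modality is unknown, so it stays empty.
			entry = KnownBackend{Name: name}
		}
		entry.Installed = true
		byName[name] = entry
	}

	out := make([]KnownBackend, 0, len(byName))
	for _, b := range byName {
		out = append(out, b)
	}
	// Map iteration order is random, so sorting is what makes output stable.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Modality != out[j].Modality {
			return out[i].Modality < out[j].Modality
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	fromImporters := []KnownBackend{{Name: "example-llm", Modality: "text"}}
	prefOnly := []KnownBackend{
		{Name: "example-llm", Modality: "asr"}, // ignored: importer entry wins
		{Name: "crispasr", Modality: "asr"},
	}
	for _, b := range mergeKnown([]string{"crispasr", "custom"}, fromImporters, prefOnly) {
		fmt.Printf("name=%q modality=%q installed=%v\n", b.Name, b.Modality, b.Installed)
	}
}
```

Because names are unique map keys, the (modality, name) comparison is a total order, so a non-stable sort gives the same result as `sort.SliceStable` here.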
|