LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-22 15:49:12 -04:00

Author	SHA1	Message	Date
Ettore Di Giacinto	9b973b79f6	feat: add VoxCPM tts backend (#8109 ) * feat: add VoxCPM tts backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Disable voxcpm on arm64 cpu Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-28 14:44:04 +01:00
Ettore Di Giacinto	ec1598868b	feat(vibevoice): add ASR support (#8222 ) * feat(vibevoice): add ASR support Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): download voice files Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Try to run on bigger runner Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * debug Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * CI can't hold vibevoice Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-27 20:19:22 +01:00
Ettore Di Giacinto	26a374b717	chore: drop bark which is unmaintained (#8207 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-25 09:26:40 +01:00
Ettore Di Giacinto	b2a8a63899	feat(vllm-omni): add new backend (#8188 ) * feat(vllm-omni: add new backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * default to py3.12 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-24 22:23:30 +01:00
Ettore Di Giacinto	05904c77f5	chore(exllama): drop backend now almost deprecated (#8186 ) exllama2 development has stalled and only old architectures are supported. exllamav3 is still in development, meanwhile cleaning up exllama2 from the gallery. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-24 08:57:37 +01:00
Ettore Di Giacinto	58bb6a29ed	Revert "chore(deps): bump torch from 2.4.1 to 2.7.1+xpu in /backend/python/bark in the pip group across 1 directory" (#8180 ) Revert "chore(deps): bump torch from 2.4.1 to 2.7.1+xpu in /backend/python/ba…" This reverts commit `5881c82413`.	2026-01-23 17:25:04 +01:00
dependabot[bot]	5881c82413	chore(deps): bump torch from 2.4.1 to 2.7.1+xpu in /backend/python/bark in the pip group across 1 directory (#8175 ) chore(deps): bump torch Bumps the pip group with 1 update in the /backend/python/bark directory: torch. Updates `torch` from 2.4.1 to 2.7.1+xpu --- updated-dependencies: - dependency-name: torch dependency-version: 2.7.1+xpu dependency-type: direct:production dependency-group: pip ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-01-23 15:32:15 +00:00
Ettore Di Giacinto	923ebbb344	feat(qwen-tts): add Qwen-tts backend (#8163 ) * feat(qwen-tts): add Qwen-tts backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Update intel deps Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop flash-attn for cuda13 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-23 15:18:41 +01:00
Ettore Di Giacinto	0fa0ac4797	fix(videogen): drop incomplete endpoint, add GGUF support for LTX-2 (#8160 ) * Debug Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop openai video endpoint (is not complete) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add download button Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-22 14:09:20 +01:00
Ettore Di Giacinto	22c0eb5421	chore(diffusers): add 'av' to requirements.txt (#8155 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-01-21 22:35:00 +01:00
Ettore Di Giacinto	d16722ee13	Revert "chore(deps): bump torch from 2.3.1+cxx11.abi to 2.8.0 in /backend/python/rerankers in the pip group across 1 directory" (#8072 ) Revert "chore(deps): bump torch from 2.3.1+cxx11.abi to 2.8.0 in /backend/pyt…" This reverts commit `1f10ab39a9`.	2026-01-16 20:50:33 +01:00
dependabot[bot]	1f10ab39a9	chore(deps): bump torch from 2.3.1+cxx11.abi to 2.8.0 in /backend/python/rerankers in the pip group across 1 directory (#8066 ) chore(deps): bump torch Bumps the pip group with 1 update in the /backend/python/rerankers directory: [torch](https://github.com/pytorch/pytorch). Updates `torch` from 2.3.1+cxx11.abi to 2.8.0 - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/commits/v2.8.0) --- updated-dependencies: - dependency-name: torch dependency-version: 2.8.0 dependency-type: direct:production dependency-group: pip ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-01-16 19:38:12 +00:00
Ettore Di Giacinto	b19afc9e64	feat(diffusers): add support to LTX-2 (#8019 ) * feat(diffusers): add support to LTX-2 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add to the gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-14 09:07:30 +01:00
Ettore Di Giacinto	a6ff354c86	feat(tts): add pocket-tts backend (#8018 ) * feat(pocket-tts): add new backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add to the gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Update docs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-13 23:35:19 +01:00
dependabot[bot]	94eecc43a3	chore(deps): bump protobuf from 6.33.2 to 6.33.4 in /backend/python/transformers (#7993 ) chore(deps): bump protobuf in /backend/python/transformers Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 6.33.2 to 6.33.4. - [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Commits](https://github.com/protocolbuffers/protobuf/commits) --- updated-dependencies: - dependency-name: protobuf dependency-version: 6.33.4 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-01-12 23:46:32 +00:00
Ettore Di Giacinto	2de30440fe	fix(l4t-12): use pip to install python deps (#7967 ) * fix: install only torch/torchvision from jetson index Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: use pip for l4t-12 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Revert "fix: install only torch/torchvision from jetson index" This reverts commit `2d2b020078` * chatterbox needs wheel Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-11 00:21:32 +01:00
Ettore Di Giacinto	a4d224dd1b	Revert "chore(uv): add --index-strategy=unsafe-first-match to l4t" (#7936 ) Revert "chore(uv): add --index-strategy=unsafe-first-match to l4t (#7934)" This reverts commit `f5dee90962`.	2026-01-08 23:31:51 +01:00
Ettore Di Giacinto	f5dee90962	chore(uv): add --index-strategy=unsafe-first-match to l4t (#7934 ) This is because the main index might not contain all the dependencies for torch Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-08 22:48:03 +01:00
Ettore Di Giacinto	383312b50e	chore(l4t-12): do not use python 3.12 (wheels are only for 3.10) (#7928 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-08 19:00:07 +01:00
Ettore Di Giacinto	b964b3d53e	feat(backends): add moonshine backend for faster transcription (#7833 ) * feat(backends): add moonshine backend for faster transcription Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add backend to CI, update AGENTS.md from this exercise Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-07 21:44:35 +01:00
Copilot	fd53978a7b	feat: package GPU libraries inside backend containers for unified base image (#7891 ) * Initial plan * Add GPU library packaging for isolated backend environments - Create scripts/build/package-gpu-libs.sh for packaging CUDA, ROCm, SYCL, and Vulkan libraries - Update llama-cpp, whisper, stablediffusion-ggml package.sh to include GPU libraries - Update Dockerfile.python to package GPU libraries into Python backends - Update libbackend.sh to set LD_LIBRARY_PATH for GPU library loading Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> * Address code review feedback: fix variable consistency and quoting Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> * Fix code review issues: improve glob handling and remove redundant variable Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> * Simplify main Dockerfile and workflow to use unified base image - Remove GPU-specific driver installation from Dockerfile (CUDA, ROCm, Vulkan, Intel) - Simplify image.yml workflow to build single unified base image for linux/amd64 and linux/arm64 - GPU libraries are now packaged in individual backend containers Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-01-07 15:48:51 +01:00
Richard Palethorpe	e6ba26c3e7	chore: Update to Ubuntu24.04 (cont #7423 ) (#7769 ) * ci(workflows): bump GitHub Actions images to Ubuntu 24.04 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): remove CUDA 11.x support from GitHub Actions (incompatible with ubuntu:24.04) Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): bump GitHub Actions CUDA support to 12.9 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(docker): bump base image to ubuntu:24.04 and adjust Vulkan SDK/packages Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * fix(backend): correct context paths for Python backends in workflows, Makefile and Dockerfile Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(make): disable parallel backend builds to avoid race conditions Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(make): export CUDA_MAJOR_VERSION and CUDA_MINOR_VERSION for override Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(backend): update backend Dockerfiles to Ubuntu 24.04 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(backend): add ROCm env vars and default AMDGPU_TARGETS for hipBLAS builds Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(chatterbox): bump ROCm PyTorch to 2.9.1+rocm6.4 and update index URL; align hipblas requirements Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore: add local-ai-launcher to .gitignore Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): fix backends GitHub Actions workflows after rebase Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(docker): use build-time UBUNTU_VERSION variable Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(docker): remove libquadmath0 from requirements-stage base 
image Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(make): add backends/vllm to .NOTPARALLEL to prevent parallel builds Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * fix(docker): correct CUDA installation steps in backend Dockerfiles Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(backend): update ROCm to 6.4 and align Python hipblas requirements Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for CUDA on arm64 builds Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(docker): update base image and backend Dockerfiles for Ubuntu 24.04 compatibility on arm64 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(backend): increase timeout for uv installs behind slow networks on backend/Dockerfile.python Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for vibevoice backend Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): fix failing GitHub Actions runners Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * fix: Allow FROM_SOURCE to be unset, use upstream Intel images etc. Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(build): rm all traces of CUDA 11 Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(build): Add Ubuntu codename as an argument Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>	2026-01-06 15:26:42 +01:00
blightbow	67baf66555	feat(mlx): add thread-safe LRU prompt cache and min_p/top_k sampling (#7556 ) * feat(mlx): add thread-safe LRU prompt cache Port mlx-lm's LRUPromptCache to fix race condition where concurrent requests corrupt shared KV cache state. The previous implementation used a single prompt_cache instance shared across all requests. Changes: - Add backend/python/common/mlx_cache.py with ThreadSafeLRUPromptCache - Modify backend.py to use per-request cache isolation via fetch/insert - Add prefix matching for cache reuse across similar prompts - Add LRU eviction (default 10 entries, configurable) - Add concurrency and cache unit tests The cache uses a trie-based structure for efficient prefix matching, allowing prompts that share common prefixes to reuse cached KV states. Thread safety is provided via threading.Lock. New configuration options: - max_cache_entries: Maximum LRU cache entries (default: 10) - max_kv_size: Maximum KV cache size per entry (default: None) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Blightbow <blightbow@users.noreply.github.com> * feat(mlx): add min_p and top_k sampler support Add MinP field to proto (field 52) following the precedent set by other non-OpenAI sampling parameters like TopK, TailFreeSamplingZ, TypicalP, and Mirostat. Changes: - backend.proto: Add float MinP field for min-p sampling - backend.py: Extract and pass min_p and top_k to mlx_lm sampler (top_k was in proto but not being passed) - test.py: Fix test_sampling_params to use valid proto fields and switch to MLX-compatible model (mlx-community/Llama-3.2-1B-Instruct) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Blightbow <blightbow@users.noreply.github.com> * refactor(mlx): move mlx_cache.py from common to mlx backend The ThreadSafeLRUPromptCache is only used by the mlx backend. 
After evaluating mlx-vlm, it was determined that the cache cannot be shared because mlx-vlm's generate/stream_generate functions don't support the prompt_cache parameter that mlx_lm provides. - Move mlx_cache.py from backend/python/common/ to backend/python/mlx/ - Remove sys.path manipulation from backend.py and test.py - Fix test assertion to expect "MLX model loaded successfully" 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Blightbow <blightbow@users.noreply.github.com> * test(mlx): add comprehensive cache tests and document upstream behavior Added comprehensive unit tests (test_mlx_cache.py) covering all cache operation modes: - Exact match - Shorter prefix match - Longer prefix match with trimming - No match scenarios - LRU eviction and access order - Reference counting and deep copy behavior - Multi-model namespacing - Thread safety with data integrity verification Documents upstream mlx_lm/server.py behavior: single-token prefixes are deliberately not matched (uses > 0, not >= 0) to allow longer cached sequences to be preferred for trimming. This is acceptable because real prompts with chat templates are always many tokens. Removed weak unit tests from test.py that only verified "no exception thrown" rather than correctness. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Blightbow <blightbow@users.noreply.github.com> * chore(mlx): remove unused MinP proto field The MinP field was added to PredictOptions but is not populated by the Go frontend/API. The MLX backend uses getattr with a default value, so it works without the proto field. 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Blightbow <blightbow@users.noreply.github.com> --------- Signed-off-by: Blightbow <blightbow@users.noreply.github.com> Co-authored-by: Blightbow <blightbow@users.noreply.github.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-16 11:27:46 +01:00
dependabot[bot]	dbd25885c3	chore(deps): bump sentence-transformers from 5.1.0 to 5.2.0 in /backend/python/transformers (#7594 ) chore(deps): bump sentence-transformers in /backend/python/transformers Bumps [sentence-transformers](https://github.com/huggingface/sentence-transformers) from 5.1.0 to 5.2.0. - [Release notes](https://github.com/huggingface/sentence-transformers/releases) - [Commits](https://github.com/huggingface/sentence-transformers/compare/v5.1.0...v5.2.0) --- updated-dependencies: - dependency-name: sentence-transformers dependency-version: 5.2.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-16 09:12:57 +01:00
Ettore Di Giacinto	7790a24682	Revert "chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend/python/diffusers in the pip group across 1 directory" (#7558 ) Revert "chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend…" This reverts commit `1b4aa6f1be`.	2025-12-13 17:04:46 +01:00
dependabot[bot]	1b4aa6f1be	chore(deps): bump torch from 2.5.1+cxx11.abi to 2.7.1+cpu in /backend/python/diffusers in the pip group across 1 directory (#7549 ) chore(deps): bump torch Bumps the pip group with 1 update in the /backend/python/diffusers directory: torch. Updates `torch` from 2.5.1+cxx11.abi to 2.7.1+cpu --- updated-dependencies: - dependency-name: torch dependency-version: 2.7.1+cpu dependency-type: direct:production dependency-group: pip ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-13 13:12:18 +00:00
Ettore Di Giacinto	504d954aea	Add chardet to requirements-l4t13.txt Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-12-13 12:59:03 +01:00
Ettore Di Giacinto	6d2a535813	chore(l4t13): use pytorch index (#7546 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-12-13 10:04:57 +01:00
Ettore Di Giacinto	32dcb58e89	feat(vibevoice): add new backend (#7494 ) * feat(vibevoice): add backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: add workflow and backend index Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(gallery): add vibevoice Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use self-hosted for intel builds Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Pin python version for l4t Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-12-10 21:14:21 +01:00
dependabot[bot]	bbce461f57	chore(deps): bump protobuf from 6.33.1 to 6.33.2 in /backend/python/transformers (#7481 ) chore(deps): bump protobuf in /backend/python/transformers Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 6.33.1 to 6.33.2. - [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Commits](https://github.com/protocolbuffers/protobuf/commits) --- updated-dependencies: - dependency-name: protobuf dependency-version: 6.33.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-08 22:13:18 +01:00
Copilot	1abbedd732	feat(diffusers): implement dynamic pipeline loader to remove per-pipeline conditionals (#7365 ) * Initial plan Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add dynamic loader for diffusers pipelines and refactor backend.py Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix pipeline discovery error handling and test mock issue Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Address code review feedback: direct imports, better error handling, improved tests Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Address remaining code review feedback: specific exceptions, registry access, test imports Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add defensive fallback for DiffusionPipeline registry access Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Actually use dynamic pipeline loading for all pipelines in backend Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use dynamic loader consistently for all pipelines including AutoPipelineForText2Image Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Move dynamic loader tests into test.py for CI compatibility Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Extend dynamic loader to discover any diffusers class type, not just DiffusionPipeline Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add AutoPipeline classes to pipeline registry for default model 
loading Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(python): set pyvenv python home Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * do pyenv update during start Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Minor changes Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-12-04 19:02:06 +01:00
Ettore Di Giacinto	cfd95745ed	feat: add cuda13 images (#7404 ) * chore(ci): add cuda13 jobs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add to pipelines and to capabilities. Start to work on the gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * capabilities: try to detect by looking at /usr/local Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * neutts Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * backends.yaml Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * add cuda13 l4t requirements.txt Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * add cuda13 requirements.txt Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Pin vllm Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Not all backends are compatible Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * add vllm to requirements Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * vllm is not pre-compiled for cuda 13 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-12-02 14:24:35 +01:00
Ettore Di Giacinto	4b5977f535	chore: drop pinning of python 3.12 (#7389 ) Update install.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-28 11:02:56 +01:00
Ettore Di Giacinto	0d877b1e71	Revert "chore(l4t): Update extra index URL for requirements-l4t.txt" (#7388 ) Revert "chore(l4t): Update extra index URL for requirements-l4t.txt (#7383)" This reverts commit `0d781e6b7e`.	2025-11-28 11:02:11 +01:00
Ettore Di Giacinto	e27f1370eb	chore(diffusers): Add PY_STANDALONE_TAG for l4t Python version (#7387 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-28 09:34:05 +01:00
Ettore Di Giacinto	e01d821314	chore: Add Python 3.12 support for l4t build profile (#7384 ) Set Python version to 3.12 for l4t build profile. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-27 23:00:09 +01:00
Ettore Di Giacinto	0d781e6b7e	chore(l4t): Update extra index URL for requirements-l4t.txt (#7383 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-27 22:02:06 +01:00
Ettore Di Giacinto	7ccc383a8b	chore(l4t/diffusers): bump nvidia l4t index for pytorch 2.9 (#7379 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-27 17:42:01 +01:00
Ettore Di Giacinto	2f8a2b1297	chore(deps): update diffusers dependency to use GitHub repo for l4t (#7369 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-27 16:02:48 +01:00
dependabot[bot]	7e01aa8faa	chore(deps): bump protobuf from 6.32.0 to 6.33.1 in /backend/python/transformers (#7340 ) chore(deps): bump protobuf in /backend/python/transformers Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 6.32.0 to 6.33.1. - [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl) - [Commits](https://github.com/protocolbuffers/protobuf/commits) --- updated-dependencies: - dependency-name: protobuf dependency-version: 6.33.1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-11-24 20:12:17 +00:00
Ettore Di Giacinto	3a232446e0	Revert "chore(chatterbox): bump l4t index to support more recent pytorch" (#7333 ) Revert "chore(chatterbox): bump l4t index to support more recent pytorch (#7332)" This reverts commit `55607a5aac`.	2025-11-22 10:10:27 +01:00
Ettore Di Giacinto	55607a5aac	chore(chatterbox): bump l4t index to support more recent pytorch (#7332 ) This should add support for devices like the DGX Spark Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-21 22:24:46 +01:00
Ettore Di Giacinto	ec492a4c56	fix(typo): environment variable name for max jobs Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-21 18:37:22 +01:00
Ettore Di Giacinto	2defe98df8	fix(vllm): Update flash-attn to specific wheel URL Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-21 18:06:46 +01:00
Ettore Di Giacinto	6261c87b1b	Add NVCC_THREADS and MAX_JOB environment variables Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-21 16:14:13 +01:00
Ettore Di Giacinto	daf39e1efd	chore(vllm/ci): set maximum number of jobs Also added comments to clarify CPU usage during build. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-11-20 15:53:32 +01:00
Mikhail Khludnev	01cd58a739	fix(reranker): support omitting top_n (#7199 ) * fix(reranker): support omitting top_n Signed-off-by: Mikhail Khludnev <mkhl@apache.org> * fix(reranker): support omitting top_n Signed-off-by: Mikhail Khludnev <mkhl@apache.org> * pass 0 explicitly Signed-off-by: Mikhail Khludnev <mkhludnev@users.noreply.github.com> --------- Signed-off-by: Mikhail Khludnev <mkhl@apache.org> Signed-off-by: Mikhail Khludnev <mkhludnev@users.noreply.github.com>	2025-11-09 18:40:32 +01:00
Ettore Di Giacinto	2f2f9beee7	fix(chatterbox): pin numpy (#7198 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-11-08 16:52:22 +01:00
Mikhail Khludnev	122e4c7094	fix(reranker): reproduce ignoring top_n (#7025 ) * fix(reranker): reproduce ignoring top_n Signed-off-by: Mikhail Khludnev <mkhl@apache.org> * fix(reranker): ignoring top_n Signed-off-by: Mikhail Khludnev <mkhl@apache.org> --------- Signed-off-by: Mikhail Khludnev <mkhl@apache.org>	2025-11-06 10:03:05 +00:00
Lukas Schaefer	d95d4992fe	feat: return complete audio for kokoro (#6842 ) Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>	2025-10-28 08:49:18 +01:00

436 Commits
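
Commit `67baf66555` above describes the mlx prompt cache in prose: per-request isolation via fetch/insert, prefix matching, LRU eviction (default 10 entries), and a `threading.Lock`. The sketch below illustrates those ideas only; it is not the code from that commit. The class name `PrefixLRUCache` and its methods are hypothetical, it uses a linear scan rather than the trie the commit mentions, and it handles only the "shorter cached prefix" case, not trimming of longer cached sequences.

```python
import copy
import threading
from collections import OrderedDict


class PrefixLRUCache:
    """Hypothetical, simplified stand-in; not LocalAI's ThreadSafeLRUPromptCache."""

    def __init__(self, max_entries=10):
        self._max_entries = max_entries
        self._entries = OrderedDict()  # token tuple -> cached state, oldest first
        self._lock = threading.Lock()

    def insert(self, tokens, state):
        """Store a private copy of `state` under the token sequence."""
        key = tuple(tokens)
        with self._lock:
            self._entries[key] = copy.deepcopy(state)
            self._entries.move_to_end(key)  # mark as most recently used
            while len(self._entries) > self._max_entries:
                self._entries.popitem(last=False)  # evict least recently used

    def fetch(self, tokens):
        """Return (state_copy, matched_len) for the longest cached key that is
        a non-empty prefix of `tokens`, or (None, 0) if nothing matches."""
        query = tuple(tokens)
        with self._lock:
            best = None
            for key in self._entries:  # linear scan; a trie would avoid this
                if key and len(key) <= len(query) and query[:len(key)] == key:
                    if best is None or len(key) > len(best):
                        best = key
            if best is None:
                return None, 0
            self._entries.move_to_end(best)
            # Deep copy so concurrent requests never share mutable state.
            return copy.deepcopy(self._entries[best]), len(best)
```

Returning deep copies is the design choice that matters here: each caller gets its own state, so one request mutating its copy cannot corrupt what another request fetches, which is the kind of race the commit message says the shared single-instance cache suffered from.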