* chore(transformers): bump to >5.0
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: refactor to use generic model loading
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: wire min_p
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: inferencing defaults
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(refactor): re-use iterative parser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: generate automatically inference defaults from unsloth
Instead of trying to re-invent the wheel and maintain here the inference
defaults, prefer to consume unsloth ones, and contribute there as
necessary.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: apply defaults also to models installed via gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: be consistent and apply fallback to all endpoint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: add fine-tuning endpoint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(experimental): add fine-tuning endpoint and TRL support
This changeset defines new GRPC signatues for Fine tuning backends, and
add TRL backend as initial fine-tuning engine. This implementation also
supports exporting to GGUF and automatically importing it to LocalAI
after fine-tuning.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* commit TRL backend, stop by killing process
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* move fine-tune to generic features
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add evals, reorder menu
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(acestep-cpp): resolve relative model paths in options
The acestep-cpp backend was failing to load models because the model
paths in options (text_encoder_model, dit_model, vae_model) were being
passed to the C++ code without resolving their relative paths.
When a user configures acestep-cpp-turbo-4b, the model paths are specified
as relative paths like 'acestep-cpp/acestep-v15-turbo-Q8_0.gguf'. The
backend was passing these paths directly to the C++ code without joining
them with the model directory.
This fix:
1. Gets the base directory from the ModelFile path
2. Resolves all relative paths in options to be absolute paths
3. Adds debug logging to show resolved paths for troubleshooting
Fixes#8991
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* test: fix acestep tests to not join modeldir in options
According to code review feedback, the Options array in TestLoadModel
and TestSoundGeneration should contain just the model filenames without
filepath.Join with modelDir. The model paths are handled internally by
the backend.
* fix: change bpm parameter type to float32 to match C++ API signature
* test: fix TestLoadModel and TestSoundGeneration to use baseDir for model paths
- Modified TestLoadModel to compute baseDir from main model path and use it for relative model paths
- Modified TestSoundGeneration similarly to use baseDir for model paths
- Changed bpm parameter type from int32 to float32 to match C++ API
* Apply suggestions from code review
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat(realtime): WebRTC support
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(tracing): Show full LLM opts and deltas
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* Remove HuggingFace backend support, restore other backends
- Removed backend/go/huggingface directory and all related files
- Removed pkg/langchain/huggingface.go
- Removed LCHuggingFaceBackend from pkg/model/initializers.go
- Removed huggingface backend entries from backend/index.yaml
- Updated backend/README.md to remove HuggingFace backend reference
- Restored kitten-tts, local-store, silero-vad, piper backends that were incorrectly removed
This change removes only HuggingFace backend support from LocalAI
as per the P0 priority request in issue #8963, while preserving
other backends (kitten-tts, local-store, silero-vad, piper).
Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
* Remove huggingface backend from test.yml build command
The tests-linux CI job was failing because it was trying to build the
huggingface backend which no longer exists after the backend removal.
This removes huggingface from the build command in test.yml.
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* fix: add missing bufio.Flush in processImageFile
The processImageFile function writes decoded image data (from base64
or URL download) through a bufio.NewWriter but never calls Flush()
before closing the underlying file. Since bufio's default buffer is
4096 bytes, small images produce 0-byte files and large images are
truncated — causing PIL to fail with "cannot identify image file".
This breaks all image input paths: file, files, and ref_images
parameters in /v1/images/generations, making img2img, inpainting,
and reference image features non-functional.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* fix: merge options into kwargs in diffusers GenerateImage
The GenerateImage method builds a local `options` dict containing the
source image (PIL), negative_prompt, and num_inference_steps, but
never merges it into `kwargs` before calling self.pipe(**kwargs).
This causes img2img to fail with "Input is in incorrect format"
because the pipeline never receives the image parameter.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: add unit test for processImageFile base64 decoding
Verifies that a base64-encoded PNG survives the write path
(encode → decode → bufio.Write → Flush → file on disk) with
byte-for-byte fidelity. The test image is small enough to fit
entirely in bufio's 4096-byte buffer, which is the exact scenario
where the missing Flush() produced a 0-byte file.
Also tests that invalid base64 input is handled gracefully.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: verify GenerateImage merges options into pipeline kwargs
Mocks the diffusers pipeline and calls GenerateImage with a source
image and negative prompt. Asserts that the pipeline receives the
image, negative_prompt, and num_inference_steps via kwargs — the
exact parameters that were silently dropped before the fix.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* fix: move kwargs.update(options) earlier in GenerateImage
Move the options merge right after self.options merge (L742) so that
image, negative_prompt, and num_inference_steps are available to all
downstream code paths including img2vid and txt2vid.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: convert processImageFile tests to ginkgo
Replace standard testing with ginkgo/gomega to be consistent with
the rest of the test suites in the project.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
---------
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat(mlx-distributed): add new MLX-distributed backend
Add new MLX distributed backend with support for both TCP and RDMA for
model sharding.
This implementation ties in the discovery implementation already in
place, and re-uses the same P2P mechanism for the TCP MLX-distributed
inferencing.
The Auto-parallel implementation is inspired by Exo's
ones (who have been added to acknowledgement for the great work!)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose a CLI to facilitate backend starting
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: make manual rank0 configurable via model configs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing features from mlx backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat(functions): add peg-based parsing
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: support returning toolcalls directly from backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: do run PEG only if backend didn't send deltas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: ⬆️ update stable-diffusion.cpp to `c8fb3d245858d495be1f140efdcfaa0d49de41e5`
Update stablediffusion-ggml to include fix for SD1 Pix2Pix issue
(leejet/stable-diffusion.cpp#1329).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: localai-bot <localai-bot@noreply.github.com>
* fix: address CI failures in stablediffusion update
Signed-off-by: localai-bot <localai-bot@noreply.github.com>
* fix: resolve remaining CI failures in stablediffusion update
- Move flow_shift to global scope so gen_image() can access the value
set during load_model() (was causing compilation error)
- Fix sd_type_str array: TQ1_0 should be at index 34, TQ2_0 at index 35
to match upstream SD_TYPE_TQ1_0=34, SD_TYPE_TQ2_0=35 enum values
Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Don't pass instruct because it is added to kwargs
Fixes the error `qwen_tts.inference.qwen3_tts_model.Qwen3TTSModel.generate_voice_design() got multiple values for keyword argument 'instruct'`
Signed-off-by: Weathercold <weathercold.scr@proton.me>