Compare commits


28 Commits

Author SHA1 Message Date
LocalAI [bot]
9ecfdc5938 chore: ⬆️ Update ggml-org/llama.cpp to 31c511a968348281e11d590446bb815048a1e912 (#6970)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-31 21:04:53 +00:00
Ettore Di Giacinto
c332ef5cce chore: fix linting issues
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 19:08:34 +01:00
Ettore Di Giacinto
6e7a8c6041 chore(model gallery): add qwen3-vl-2b-instruct (#6967)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 19:04:10 +01:00
Ettore Di Giacinto
43e707ec4f chore(model gallery): add qwen3-vl-2b-thinking (#6966)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 19:03:23 +01:00
Ettore Di Giacinto
fed3663a74 chore(model gallery): add qwen3-vl-4b-thinking (#6965)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 19:02:22 +01:00
Ettore Di Giacinto
5b72798db3 chore(model gallery): add qwen3-vl-32b-instruct (#6964)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 19:01:11 +01:00
Ettore Di Giacinto
d24d6d4e93 chore(model gallery): add qwen3-vl-4b-instruct (#6963)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 18:57:50 +01:00
Ettore Di Giacinto
50ee1fbe06 chore(model gallery): add qwen3-vl-30b-a3b-thinking (#6962)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 18:53:13 +01:00
Ettore Di Giacinto
19f3425ce0 chore(model gallery): add huihui-qwen3-vl-30b-a3b-instruct-abliterated (#6961)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 18:46:25 +01:00
Ettore Di Giacinto
a6ef245534 chore(model gallery): add qwen3-vl-30b-a3b-instruct (#6960)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-31 18:37:12 +01:00
LocalAI [bot]
88cb379c2d chore(model gallery): 🤖 add 1 new models via gallery agent (#6940)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-31 16:57:18 +01:00
LocalAI [bot]
0ddb2e8dcf chore: ⬆️ Update ggml-org/llama.cpp to 4146d6a1a6228711a487a1e3e9ddd120f8d027d7 (#6945)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-31 14:51:03 +00:00
Ettore Di Giacinto
91b9301bec Rename workflow from 'Bump dependencies' to 'Bump Documentation'
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-10-31 14:40:50 +01:00
Ettore Di Giacinto
fad5868f7b Rename job to 'bump-backends' in workflow
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-10-31 14:40:34 +01:00
LocalAI [bot]
1e5b9135df chore: ⬆️ Update ggml-org/llama.cpp to 16724b5b6836a2d4b8936a5824d2ff27c52b4517 (#6925)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 21:07:33 +00:00
LocalAI [bot]
36d19e23e0 chore(model gallery): 🤖 add 1 new models via gallery agent (#6921)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 18:58:08 +01:00
LocalAI [bot]
cba9d1aac0 chore(model gallery): 🤖 add 1 new models via gallery agent (#6919)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 17:26:18 +01:00
LocalAI [bot]
dd21a0d2f9 chore: ⬆️ Update ggml-org/llama.cpp to 3464bdac37027c5e9661621fc75ffcef3c19c6ef (#6896)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 14:17:58 +01:00
LocalAI [bot]
302a43b3ae chore(model gallery): 🤖 add 1 new models via gallery agent (#6911)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 09:54:24 +01:00
LocalAI [bot]
2955061b42 chore(model gallery): 🤖 add 1 new models via gallery agent (#6910)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 09:39:31 +01:00
LocalAI [bot]
84644ab693 chore(model gallery): 🤖 add 1 new models via gallery agent (#6908)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-30 09:20:23 +01:00
Ettore Di Giacinto
b8f40dde1e feat: do also text match (#6891)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-10-29 17:18:56 +01:00
LocalAI [bot]
a6c9789a54 chore(model gallery): 🤖 add 1 new models via gallery agent (#6884)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-29 10:56:57 +01:00
LocalAI [bot]
a48d9ce27c chore(model gallery): 🤖 add 1 new models via gallery agent (#6879)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-29 08:19:51 +01:00
LocalAI [bot]
fb825a2708 chore: ⬆️ Update ggml-org/llama.cpp to 851553ea6b24cb39fd5fd188b437d777cb411de8 (#6869)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-29 08:16:55 +01:00
LocalAI [bot]
5558dce449 chore: ⬆️ Update ggml-org/whisper.cpp to c62adfbd1ecdaea9e295c72d672992514a2d887c (#6868)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-28 21:12:05 +00:00
LocalAI [bot]
cf74a11e65 chore(model gallery): 🤖 add 1 new models via gallery agent (#6864)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-28 17:20:57 +01:00
LocalAI [bot]
86b5deec81 chore(model gallery): 🤖 add 1 new models via gallery agent (#6863)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-10-28 16:23:57 +01:00
7 changed files with 586 additions and 10 deletions

View File

@@ -1,10 +1,10 @@
-name: Bump dependencies
+name: Bump Backend dependencies
 on:
   schedule:
     - cron: 0 20 * * *
   workflow_dispatch:
 jobs:
-  bump:
+  bump-backends:
     strategy:
       fail-fast: false
       matrix:

View File

@@ -1,10 +1,10 @@
-name: Bump dependencies
+name: Bump Documentation
 on:
   schedule:
     - cron: 0 20 * * *
   workflow_dispatch:
 jobs:
-  bump:
+  bump-docs:
     strategy:
       fail-fast: false
       matrix:

View File

@@ -1,5 +1,5 @@
-LLAMA_VERSION?=5a4ff43e7dd049e35942bc3d12361dab2f155544
+LLAMA_VERSION?=31c511a968348281e11d590446bb815048a1e912
 LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
 CMAKE_ARGS?=

View File

@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
 # whisper.cpp version
 WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
-WHISPER_CPP_VERSION?=f16c12f3f55f5bd3d6ac8cf2f31ab90a42c884d5
+WHISPER_CPP_VERSION?=c62adfbd1ecdaea9e295c72d672992514a2d887c
 SO_TARGET?=libgowhisper.so
 CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
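
Both version pins use make's `?=` conditional assignment, so the committed SHAs are only defaults: a command-line assignment overrides them for a one-off build against a different upstream commit. An illustrative invocation (the exact targets depend on the backend's Makefile):

```sh
# Build against a different upstream pin without editing the Makefile
# (illustrative; <commit-sha> is a placeholder, run from the backend directory).
make WHISPER_CPP_VERSION=<commit-sha>
```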

View File

@@ -61,12 +61,15 @@ func (gm GalleryElements[T]) Search(term string) GalleryElements[T] {
 	term = strings.ToLower(term)
 	for _, m := range gm {
 		if fuzzy.Match(term, strings.ToLower(m.GetName())) ||
 			fuzzy.Match(term, strings.ToLower(m.GetDescription())) ||
 			fuzzy.Match(term, strings.ToLower(m.GetGallery().Name)) ||
+			strings.Contains(strings.ToLower(m.GetName()), term) ||
+			strings.Contains(strings.ToLower(m.GetDescription()), term) ||
+			strings.Contains(strings.ToLower(m.GetGallery().Name), term) ||
 			strings.Contains(strings.ToLower(strings.Join(m.GetTags(), ",")), term) {
 			filteredModels = append(filteredModels, m)
 		}
 	}
 	return filteredModels
 }
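
The widened predicate accepts either a fuzzy hit or a plain substring hit on the name, description, and gallery name. A minimal sketch of the resulting behavior, assuming `fuzzy` is github.com/lithammer/fuzzysearch/fuzzy (the import is outside this hunk, so that is an assumption):

```go
package main

import (
	"fmt"
	"strings"

	"github.com/lithammer/fuzzysearch/fuzzy"
)

// matches mirrors the widened predicate above: accept either a fuzzy
// (ordered-subsequence) hit or a plain substring hit, case-insensitively.
func matches(term, field string) bool {
	term = strings.ToLower(term)
	field = strings.ToLower(field)
	return fuzzy.Match(term, field) || strings.Contains(field, term)
}

func main() {
	fmt.Println(matches("q3vl", "Qwen3-VL-2B-Instruct"))    // true: fuzzy subsequence
	fmt.Println(matches("vl-2b", "Qwen3-VL-2B-Instruct"))   // true: exact substring
	fmt.Println(matches("granite", "Qwen3-VL-2B-Instruct")) // false: neither
}
```

Since an exact substring is itself an ordered subsequence, the added `strings.Contains` clauses mostly make plain-text containment an explicit guarantee, independent of the fuzzy matcher's exact semantics.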

View File

@@ -1,4 +1,186 @@
---
- &qwen3vl
url: "github:mudler/LocalAI/gallery/qwen3.yaml@master"
icon: https://cdn-avatars.huggingface.co/v1/production/uploads/620760a26e3b7210c2ff1943/-s1gyJfvbE1RgO5iBeNOi.png
license: apache-2.0
tags:
- llm
- gguf
- gpu
- image-to-text
- multimodal
- cpu
- qwen
- qwen3
- thinking
- reasoning
name: "qwen3-vl-30b-a3b-instruct"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Instruct-GGUF
description: |
Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.
This generation delivers comprehensive upgrades across the board: superior text understanding & generation, deeper visual perception & reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities.
Available in Dense and MoE architectures that scale from edge to cloud, with Instruct and reasoning-enhanced Thinking editions for flexible, on-demand deployment.
#### Key Enhancements:
* **Visual Agent**: Operates PC/mobile GUIs—recognizes elements, understands functions, invokes tools, completes tasks.
* **Visual Coding Boost**: Generates Draw.io/HTML/CSS/JS from images/videos.
* **Advanced Spatial Perception**: Judges object positions, viewpoints, and occlusions; provides stronger 2D grounding and enables 3D grounding for spatial reasoning and embodied AI.
* **Long Context & Video Understanding**: Native 256K context, expandable to 1M; handles books and hours-long video with full recall and second-level indexing.
* **Enhanced Multimodal Reasoning**: Excels in STEM/Math—causal analysis and logical, evidence-based answers.
* **Upgraded Visual Recognition**: Broader, higher-quality pretraining enables it to “recognize everything”—celebrities, anime, products, landmarks, flora/fauna, etc.
* **Expanded OCR**: Supports 32 languages (up from 19); robust in low light, blur, and tilt; better with rare/ancient characters and jargon; improved long-document structure parsing.
* **Text Understanding on par with pure LLMs**: Seamless text-vision fusion for lossless, unified comprehension.
#### Model Architecture Updates:
1. **Interleaved-MRoPE**: Full-frequency allocation over time, width, and height via robust positional embeddings, enhancing long-horizon video reasoning.
2. **DeepStack**: Fuses multi-level ViT features to capture fine-grained details and sharpen image-text alignment.
3. **Text-Timestamp Alignment:** Moves beyond T-RoPE to precise, timestamp-grounded event localization for stronger video temporal modeling.
This is the weight repository for Qwen3-VL-30B-A3B-Instruct.
overrides:
mmproj: mmproj/mmproj-F16.gguf
parameters:
model: Qwen3-VL-30B-A3B-Instruct-Q4_K_M.gguf
files:
- filename: Qwen3-VL-30B-A3B-Instruct-Q4_K_M.gguf
sha256: 75d8f4904016d90b71509c8576ebd047a0606cc5aa788eada29d4bedf9b761a6
uri: huggingface://unsloth/Qwen3-VL-30B-A3B-Instruct-GGUF/Qwen3-VL-30B-A3B-Instruct-Q4_K_M.gguf
- filename: mmproj/mmproj-F16.gguf
sha256: 7e7cec67a3a887bddbf38099738d08570e85f08dd126578fa00a7acf4dacef01
uri: huggingface://unsloth/Qwen3-VL-30B-A3B-Instruct-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "qwen3-vl-30b-a3b-thinking"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Thinking-GGUF
description: |
Qwen3-VL-30B-A3B-Thinking is the reasoning-enhanced Thinking edition of the 30B-A3B (MoE) model in the Qwen3-VL series.
overrides:
mmproj: mmproj/mmproj-F16.gguf
parameters:
model: Qwen3-VL-30B-A3B-Thinking-Q4_K_M.gguf
files:
- filename: Qwen3-VL-30B-A3B-Thinking-Q4_K_M.gguf
sha256: d3e12c6b15f59cc1c6db685d33eb510184d006ebbff0e038e7685e57ce628b3b
uri: huggingface://unsloth/Qwen3-VL-30B-A3B-Thinking-GGUF/Qwen3-VL-30B-A3B-Thinking-Q4_K_M.gguf
- filename: mmproj/mmproj-F16.gguf
sha256: 7e7cec67a3a887bddbf38099738d08570e85f08dd126578fa00a7acf4dacef01
uri: huggingface://unsloth/Qwen3-VL-30B-A3B-Thinking-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "qwen3-vl-4b-instruct"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-4B-Instruct-GGUF
description: |
Qwen3-VL-4B-Instruct is the 4B parameter model of the Qwen3-VL series.
overrides:
mmproj: mmproj/mmproj-Qwen3-VL-4B-Instruct-F16.gguf
parameters:
model: Qwen3-VL-4B-Instruct-Q4_K_M.gguf
files:
- filename: Qwen3-VL-4B-Instruct-Q4_K_M.gguf
sha256: d4dcd426bfba75752a312b266b80fec8136fbaca13c62d93b7ac41fa67f0492b
uri: huggingface://unsloth/Qwen3-VL-4B-Instruct-GGUF/Qwen3-VL-4B-Instruct-Q4_K_M.gguf
- filename: mmproj/mmproj-Qwen3-VL-4B-Instruct-F16.gguf
sha256: 1b9f4e92f0fbda14d7d7b58baed86039b8a980fe503d9d6a9393f25c0028f1fc
uri: huggingface://unsloth/Qwen3-VL-4B-Instruct-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "qwen3-vl-32b-instruct"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-32B-Instruct-GGUF
description: |
Qwen3-VL-32B-Instruct is the 32B parameter model of the Qwen3-VL series.
overrides:
mmproj: mmproj/mmproj-Qwen3-VL-32B-Instruct-F16.gguf
parameters:
model: Qwen3-VL-32B-Instruct-Q4_K_M.gguf
files:
- filename: Qwen3-VL-32B-Instruct-Q4_K_M.gguf
sha256: 17885d28e964b22b2faa981a7eaeeeb78da0972ee5f826ad5965f7583a610d9f
uri: huggingface://unsloth/Qwen3-VL-32B-Instruct-GGUF/Qwen3-VL-32B-Instruct-Q4_K_M.gguf
- filename: mmproj/mmproj-Qwen3-VL-32B-Instruct-F16.gguf
sha256: 14b1d68befa75a5e646dd990c5bb429c912b7aa9b49b9ab18231ca5f750421c9
uri: huggingface://unsloth/Qwen3-VL-32B-Instruct-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "qwen3-vl-4b-thinking"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-4B-Thinking-GGUF
description: |
Qwen3-VL-4B-Thinking is the reasoning-enhanced Thinking edition of the 4B model in the Qwen3-VL series.
overrides:
mmproj: mmproj/mmproj-Qwen3-VL-4B-Thinking-F16.gguf
parameters:
model: Qwen3-VL-4B-Thinking-Q4_K_M.gguf
files:
- filename: Qwen3-VL-4B-Thinking-Q4_K_M.gguf
sha256: bd73237f16265a1014979b7ed34ff9265e7e200ae6745bb1da383a1bbe0f9211
uri: huggingface://unsloth/Qwen3-VL-4B-Thinking-GGUF/Qwen3-VL-4B-Thinking-Q4_K_M.gguf
- filename: mmproj/mmproj-Qwen3-VL-4B-Thinking-F16.gguf
sha256: 72354fcd3fc75935b84e745ca492d6e78dd003bb5a020d71b296e7650926ac87
uri: huggingface://unsloth/Qwen3-VL-4B-Thinking-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "qwen3-vl-2b-thinking"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-2B-Thinking-GGUF
description: |
Qwen3-VL-2B-Thinking is the reasoning-enhanced Thinking edition of the 2B model in the Qwen3-VL series.
overrides:
mmproj: mmproj/mmproj-Qwen3-VL-2B-Thinking-F16.gguf
parameters:
model: Qwen3-VL-2B-Thinking-Q4_K_M.gguf
files:
- filename: Qwen3-VL-2B-Thinking-Q4_K_M.gguf
sha256: 5f282086042d96b78b138839610f5148493b354524090fadc5c97c981b70a26e
uri: huggingface://unsloth/Qwen3-VL-2B-Thinking-GGUF/Qwen3-VL-2B-Thinking-Q4_K_M.gguf
- filename: mmproj/mmproj-Qwen3-VL-2B-Thinking-F16.gguf
sha256: 4eabc90a52fe890d6ca1dad92548782eab6edc91f012a365fff95cf027ba529d
uri: huggingface://unsloth/Qwen3-VL-2B-Thinking-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "qwen3-vl-2b-instruct"
urls:
- https://huggingface.co/unsloth/Qwen3-VL-2B-Instruct-GGUF
description: |
Qwen3-VL-2B-Instruct is the 2B parameter model of the Qwen3-VL series.
overrides:
mmproj: mmproj/mmproj-Qwen3-VL-2B-Instruct-F16.gguf
parameters:
model: Qwen3-VL-2B-Instruct-Q4_K_M.gguf
files:
- filename: Qwen3-VL-2B-Instruct-Q4_K_M.gguf
sha256: 858fcf2a39dc73b26dd86592cb0a5f949b59d1edb365d1dea98e46b02e955e56
uri: huggingface://unsloth/Qwen3-VL-2B-Instruct-GGUF/Qwen3-VL-2B-Instruct-Q4_K_M.gguf
- filename: mmproj/mmproj-Qwen3-VL-2B-Instruct-F16.gguf
sha256: cd5a851d3928697fa1bd76d459d2cc409b6cf40c9d9682b2f5c8e7c6a9f9630f
uri: huggingface://unsloth/Qwen3-VL-2B-Instruct-GGUF/mmproj-F16.gguf
- !!merge <<: *qwen3vl
name: "huihui-qwen3-vl-30b-a3b-instruct-abliterated"
urls:
- https://huggingface.co/noctrex/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-GGUF
description: |
These are quantizations of the model Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated.
overrides:
mmproj: mmproj/mmproj-Huihui-Qwen3-VL-30B-A3B-F16.gguf
parameters:
model: Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-Q4_K_M.gguf
files:
- filename: Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-Q4_K_M.gguf
sha256: 1e94a65167a39d2ff4427393746d4dbc838f3d163c639d932e9ce983f575eabf
uri: huggingface://noctrex/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-GGUF/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-Q4_K_M.gguf
- filename: mmproj/mmproj-Huihui-Qwen3-VL-30B-A3B-F16.gguf
sha256: 4bfd655851a5609b29201154e0bd4fe5f9274073766b8ab35b3a8acba0dd77a7
uri: huggingface://noctrex/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated-GGUF/mmproj-F16.gguf
- &jamba
icon: https://cdn-avatars.huggingface.co/v1/production/uploads/65e60c0ed5313c06372446ff/QwehUHgP2HtVAMW5MzJ2j.png
name: "ai21labs_ai21-jamba-reasoning-3b"
@@ -22795,3 +22977,389 @@
- filename: wraith-8b.i1-Q4_K_M.gguf
sha256: 180469f9de3e1b5a77b7cf316899dbe4782bd5e6d4f161fb18ea95aa612e6926
uri: huggingface://mradermacher/wraith-8b-i1-GGUF/wraith-8b.i1-Q4_K_M.gguf
- !!merge <<: *qwen25
name: "pokee_research_7b"
urls:
- https://huggingface.co/Mungert/pokee_research_7b-GGUF
description: |
**Model Name:** Qwen2.5-7B-Instruct
**Base Model:** Qwen/Qwen2.5-7B
**Model Type:** Instruction-tuned large language model (7.61B parameters)
**License:** Apache 2.0
**Description:**
Qwen2.5-7B-Instruct is a powerful, instruction-following language model designed for advanced reasoning, coding, and multi-turn dialogue. Built on the Qwen2.5 architecture, it delivers state-of-the-art performance in understanding complex prompts, generating long-form text (up to 8K tokens), and handling structured outputs like JSON. It supports multilingual communication (29+ languages), including English, Chinese, and European languages, and excels in long-context tasks with support for up to 131,072 tokens.
Ideal for research, creative writing, coding assistance, and agent-based workflows, this model is optimized for real-world applications requiring robustness, accuracy, and scalability.
**Key Features:**
- 7.61 billion parameters
- Context length: 131K tokens (supports long-context via YaRN)
- Strong performance in math, coding, and factual reasoning
- Fine-tuned for instruction following and chat interactions
- Deployable with Hugging Face Transformers, vLLM, and llama.cpp
**Use Case:**
Perfect for developers, researchers, and enterprises building intelligent assistants, autonomous agents, or content generation systems.
**Citation:**
```bibtex
@misc{qwen2.5,
title = {Qwen2.5: A Party of Foundation Models},
url = {https://qwenlm.github.io/blog/qwen2.5/},
author = {Qwen Team},
month = {September},
year = {2024}
}
```
overrides:
parameters:
model: pokee_research_7b-q4_k_m.gguf
files:
- filename: pokee_research_7b-q4_k_m.gguf
sha256: 670706711d82fcdbae951fda084f77c9c479edf3eb5d8458d1cfddd46cf4b767
uri: huggingface://Mungert/pokee_research_7b-GGUF/pokee_research_7b-q4_k_m.gguf
- !!merge <<: *qwen3
name: "deepkat-32b-i1"
urls:
- https://huggingface.co/mradermacher/DeepKAT-32B-i1-GGUF
description: |
**DeepKAT-32B** is a high-performance, open-source coding agent built by merging two leading RL-tuned models—**DeepSWE-Preview** and **KAT-Dev**—on the **Qwen3-32B** base architecture using Arcee MergeKit's TIES method. This 32B-parameter model excels in complex software engineering tasks, including code generation, bug fixing, refactoring, and autonomous agent workflows with tool use.
Key strengths:
- Achieves ~62% SWE-Bench Verified score (on par with top open-source models).
- Strong performance in multi-file reasoning, multi-turn planning, and sparse reward environments.
- Optimized for agentic behavior with step-by-step reasoning and tool chaining.
Ideal for developers, AI researchers, and teams building intelligent code assistants or autonomous software agents.
> 🔗 **Base Model**: Qwen/Qwen3-32B
> 🛠️ **Built With**: MergeKit (TIES), RL-finetuned components
> 📊 **Benchmarks**: SWE-Bench Verified: ~62%, HumanEval Pass@1: ~85%
*Note: The model is a merge of two RL-tuned models and not a direct training from scratch.*
overrides:
parameters:
model: mradermacher/DeepKAT-32B-i1-GGUF
- !!merge <<: *granite4
name: "ibm-granite.granite-4.0-1b"
urls:
- https://huggingface.co/DevQuasar/ibm-granite.granite-4.0-1b-GGUF
description: |
### **Granite-4.0-1B**
*By IBM | Apache 2.0 License*
**Overview:**
Granite-4.0-1B is a lightweight, instruction-tuned language model designed for efficient on-device and research use. Built on a decoder-only dense transformer architecture, it delivers strong performance in instruction following, code generation, tool calling, and multilingual tasks—making it ideal for applications requiring low latency and minimal resource usage.
**Key Features:**
- **Size:** 1.6 billion parameters (1B Dense), optimized for efficiency.
- **Capabilities:**
- Text generation, summarization, question answering
- Code completion and function calling (e.g., API integration)
- Multilingual support (English, Spanish, French, German, Japanese, Chinese, Arabic, Korean, Portuguese, Italian, Dutch, Czech)
- Robust safety and alignment via instruction tuning and reinforcement learning
- **Architecture:** Uses GQA (Grouped Query Attention), SwiGLU activation, RMSNorm, shared input/output embeddings, and RoPE position embeddings.
- **Context Length:** Up to 128K tokens — suitable for long-form content and complex reasoning.
- **Training:** Finetuned from *Granite-4.0-1B-Base* using open-source datasets, synthetic data, and human-curated instruction pairs.
**Performance Highlights (1B Dense):**
- **MMLU (5-shot):** 59.39
- **HumanEval (pass@1):** 74
- **IFEval (Alignment):** 80.82
- **GSM8K (8-shot):** 76.35
- **SALAD-Bench (Safety):** 93.44
**Use Cases:**
- On-device AI applications
- Research and prototyping
- Fine-tuning for domain-specific tasks
- Low-resource environments with high performance expectations
**Resources:**
- [Hugging Face Model](https://huggingface.co/ibm-granite/granite-4.0-1b)
- [Granite Docs](https://www.ibm.com/granite/docs/)
- [GitHub Repository](https://github.com/ibm-granite/granite-4.0-nano-language-models)
> *“Make knowledge free for everyone.” – IBM Granite Team*
overrides:
parameters:
model: ibm-granite.granite-4.0-1b.Q4_K_M.gguf
files:
- filename: ibm-granite.granite-4.0-1b.Q4_K_M.gguf
sha256: 0e0ef42486b7f1f95dfe33af2e696df1149253e500c48f3fb8db0125afa2922c
uri: huggingface://DevQuasar/ibm-granite.granite-4.0-1b-GGUF/ibm-granite.granite-4.0-1b.Q4_K_M.gguf
- !!merge <<: *qwen3
name: "apollo-astralis-4b-i1"
urls:
- https://huggingface.co/mradermacher/apollo-astralis-4b-i1-GGUF
description: |
**Apollo-Astralis V1 4B**
*A warm, enthusiastic, and empathetic reasoning model built on Qwen3-4B-Thinking*
**Overview**
Apollo-Astralis V1 4B is a 4-billion-parameter conversational AI designed for collaborative, emotionally intelligent problem-solving. Developed by VANTA Research, it combines rigorous logical reasoning with a vibrant, supportive communication style—making it ideal for creative brainstorming, educational support, and personal development.
**Key Features**
- 🤔 **Explicit Reasoning**: Uses `</tool_call>` tags to break down thought processes step by step
- 💬 **Warm & Enthusiastic Tone**: Celebrates achievements with energy and empathy
- 🤝 **Collaborative Style**: Engages users with "we" language and clarifying questions
- 🔍 **High Accuracy**: Achieves 100% in enthusiasm detection and 90% in empathy recognition
- 🎯 **Fine-Tuned for Real-World Use**: Trained with LoRA on a dataset emphasizing emotional intelligence and consistency
**Base Model**
Built on **Qwen3-4B-Thinking** and enhanced with lightweight LoRA fine-tuning (33M trainable parameters).
Available in both full and quantized (GGUF) formats via Hugging Face and Ollama.
**Use Cases**
- Personal coaching & motivation
- Creative ideation & project planning
- Educational tutoring with emotional support
- Mental wellness conversations (complementary, not a replacement)
**License**
Apache 2.0 — open for research, commercial, and personal use.
**Try It**
👉 [Hugging Face Page](https://huggingface.co/VANTA-Research/apollo-astralis-v1-4b)
👉 [Ollama](https://ollama.com/vanta-research/apollo-astralis-v1-4b)
*Developed by VANTA Research — where reasoning meets warmth.*
overrides:
parameters:
model: apollo-astralis-4b.i1-Q4_K_M.gguf
files:
- filename: apollo-astralis-4b.i1-Q4_K_M.gguf
sha256: 94e1d371420b03710fc7de030c1c06e75a356d9388210a134ee2adb4792a2626
uri: huggingface://mradermacher/apollo-astralis-4b-i1-GGUF/apollo-astralis-4b.i1-Q4_K_M.gguf
- !!merge <<: *qwen3
name: "qwen3-vlto-32b-instruct-i1"
urls:
- https://huggingface.co/mradermacher/Qwen3-VLTO-32B-Instruct-i1-GGUF
description: |
**Model Name:** Qwen3-VL-32B-Instruct (Text-Only Variant: Qwen3-VLTO-32B-Instruct)
**Base Model:** Qwen/Qwen3-VL-32B-Instruct
**Repository:** [mradermacher/Qwen3-VLTO-32B-Instruct-i1-GGUF](https://huggingface.co/mradermacher/Qwen3-VLTO-32B-Instruct-i1-GGUF)
**Type:** Large Language Model (LLM), text-only (vision-language model stripped of its vision components)
**Architecture:** Qwen3-VL, adapted for pure text generation
**Size:** 32 billion parameters
**License:** Apache 2.0
**Framework:** Hugging Face Transformers
---
### 🔍 **Description**
This is a **text-only variant** of the powerful **Qwen3-VL-32B-Instruct** multimodal model, stripped of its vision components to function as a high-performance pure language model. The model retains the full text understanding and generation capabilities of its parent — including strong reasoning, long-context handling (up to 32K+ tokens), and advanced multimodal training-derived coherence — while being optimized for text-only tasks.
It was created by loading the weights from the full Qwen3-VL-32B-Instruct model into a text-only Qwen3 architecture, preserving all linguistic and reasoning strengths without the need for image input.
Perfect for applications requiring deep reasoning, long-form content generation, code synthesis, and dialogue — with all the benefits of the Qwen3 series, now in a lightweight, text-focused form.
---
### 📌 Key Features
- ✅ **High-Performance Text Generation:** Built on top of the state-of-the-art Qwen3-VL architecture
- ✅ **Extended Context Length:** Supports up to 32,768 tokens (ideal for long documents and complex tasks)
- ✅ **Strong Reasoning & Planning:** Excels at logic, math, coding, and multi-step reasoning
- ✅ **Optimized for GGUF Format:** Available in multiple quantized versions (IQ3_M, Q2_K, etc.) for efficient inference on consumer hardware
- ✅ **Free to Use & Modify:** Apache 2.0 license
---
### 📦 Use Case Suggestions
- Long-form writing, summarization, and editing
- Code generation and debugging
- AI agents and task automation
- High-quality chat and dialogue systems
- Research and experimentation with large-scale LLMs on local devices
---
### 📚 References
- Original Model: [Qwen/Qwen3-VL-32B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct)
- Technical Report: [Qwen3 Technical Report (arXiv)](https://arxiv.org/abs/2505.09388)
- Quantization by: [mradermacher](https://huggingface.co/mradermacher)
> ✅ **Note**: The model shown here is **not the original vision-language model** — it's a **text-only conversion** of the Qwen3-VL-32B-Instruct model, ideal for pure language tasks.
overrides:
parameters:
model: Qwen3-VLTO-32B-Instruct.i1-Q4_K_S.gguf
files:
- filename: Qwen3-VLTO-32B-Instruct.i1-Q4_K_S.gguf
sha256: 789d55249614cd1acee1a23278133cd56ca898472259fa2261f77d65ed7f8367
uri: huggingface://mradermacher/Qwen3-VLTO-32B-Instruct-i1-GGUF/Qwen3-VLTO-32B-Instruct.i1-Q4_K_S.gguf
- !!merge <<: *qwen3
name: "qwen3-vlto-32b-thinking"
urls:
- https://huggingface.co/mradermacher/Qwen3-VLTO-32B-Thinking-GGUF
description: |
**Model Name:** Qwen3-VLTO-32B-Thinking
**Model Type:** Large Language Model (Text-Only)
**Base Model:** Qwen/Qwen3-VL-32B-Thinking (vanilla Qwen3-VL-32B with vision components removed)
**Architecture:** Transformer-based, 32-billion parameter model optimized for reasoning and complex text generation.
### Description:
Qwen3-VLTO-32B-Thinking is a pure text-only variant of the Qwen3-VL-32B-Thinking model, stripped of its vision capabilities while preserving the full reasoning and language understanding power. It is derived by transferring the weights from the vision-language model into a text-only transformer architecture, maintaining the same high-quality behavior for tasks such as logical reasoning, code generation, and dialogue.
This model is ideal for applications requiring deep linguistic reasoning and long-context understanding without image input. It supports advanced multimodal reasoning capabilities *in text form*—perfect for research, chatbots, and content generation.
### Key Features:
- ✅ 32B parameters, high reasoning capability
- ✅ No vision components — fully text-only
- ✅ Trained for complex thinking and step-by-step reasoning
- ✅ Compatible with Hugging Face Transformers and GGUF inference tools
- ✅ Available in multiple quantization levels (Q2_K to Q8_0) for efficient deployment
### Use Case:
Ideal for advanced text generation, logical inference, coding, and conversational AI where vision is not needed.
> 🔗 **Base Model**: [Qwen/Qwen3-VL-32B-Thinking](https://huggingface.co/Qwen/Qwen3-VL-32B-Thinking)
> 📦 **Quantized Versions**: Available via [mradermacher/Qwen3-VLTO-32B-Thinking-GGUF](https://huggingface.co/mradermacher/Qwen3-VLTO-32B-Thinking-GGUF)
---
*Note: The original model was created by Alibaba's Qwen team. This variant was adapted by qingy2024 and quantized by mradermacher.*
overrides:
parameters:
model: Qwen3-VLTO-32B-Thinking.Q4_K_M.gguf
files:
- filename: Qwen3-VLTO-32B-Thinking.Q4_K_M.gguf
sha256: d88b75df7c40455dfa21ded23c8b25463a8d58418bb6296304052b7e70e96954
uri: huggingface://mradermacher/Qwen3-VLTO-32B-Thinking-GGUF/Qwen3-VLTO-32B-Thinking.Q4_K_M.gguf
- !!merge <<: *gemma3
name: "gemma-3-the-grand-horror-27b"
urls:
- https://huggingface.co/DavidAU/Gemma-3-The-Grand-Horror-27B-GGUF
description: |
The **Gemma-3-The-Grand-Horror-27B-GGUF** model is a **fine-tuned version** of Google's **Gemma 3 27B** language model, specifically optimized for **extreme horror-themed text generation**. It was trained using the **Unsloth framework** on a custom in-house dataset of horror content, resulting in a model that produces vivid, graphic, and psychologically intense narratives—featuring gore, madness, and disturbing imagery—often even when prompts don't explicitly request horror.
Key characteristics:
- **Base Model**: Gemma 3 27B (original by Google, not the quantized version)
- **Fine-tuned For**: High-intensity horror storytelling, long-form narrative generation, and immersive scene creation
- **Use Case**: Creative writing, horror RP, dark fiction, and experimental storytelling
- **Not Suitable For**: General use, children, sensitive audiences, or content requiring neutral/positive tone
- **Quantization**: Available in GGUF format (e.g., q3k, q4, etc.), making it accessible for local inference on consumer hardware
> ✅ **Note**: The model card you see is for a **quantized, fine-tuned derivative**, not the original. The true base model is **Gemma 3 27B**, available at: https://huggingface.co/google/gemma-3-27b
This model is not for all audiences — it generates content with a consistently dark, unsettling tone. Use responsibly.
overrides:
parameters:
model: Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf
files:
- filename: Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf
sha256: 46f0b06b785d19804a1a796bec89a8eeba8a4e2ef21e2ab8dbb8fa2ff0d675b1
uri: huggingface://DavidAU/Gemma-3-The-Grand-Horror-27B-GGUF/Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf
- !!merge <<: *qwen3
name: "qwen3-nemotron-32b-rlbff-i1"
urls:
- https://huggingface.co/mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF
description: |
**Model Name:** Qwen3-Nemotron-32B-RLBFF
**Base Model:** Qwen/Qwen3-32B
**Developer:** NVIDIA
**License:** NVIDIA Open Model License
**Description:**
Qwen3-Nemotron-32B-RLBFF is a high-performance, fine-tuned large language model built on the Qwen3-32B foundation. It is specifically optimized to generate high-quality, helpful responses in a default thinking mode through advanced reinforcement learning with binary flexible feedback (RLBFF). Trained on the HelpSteer3 dataset, this model excels in reasoning, planning, coding, and information-seeking tasks while maintaining strong safety and alignment with human preferences.
**Key Performance (as of Sep 2025):**
- **MT-Bench:** 9.50 (near GPT-4-Turbo level)
- **Arena Hard V2:** 55.6%
- **WildBench:** 70.33%
**Architecture & Efficiency:**
- 32 billion parameters, based on the Qwen3 Transformer architecture
- Designed for deployment on NVIDIA GPUs (Ampere, Hopper, Turing)
- Achieves performance comparable to DeepSeek R1 and O3-mini at less than 5% of the inference cost
**Use Case:**
Ideal for applications requiring reliable, thoughtful, and safe responses—such as advanced chatbots, research assistants, and enterprise AI systems.
**Access & Usage:**
Available on Hugging Face with support for Hugging Face Transformers and vLLM.
**Cite:** [Wang et al., 2025 — RLBFF: Binary Flexible Feedback](https://arxiv.org/abs/2509.21319)
👉 *Note: The GGUF version (mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF) is a user-quantized variant. The original model is available at nvidia/Qwen3-Nemotron-32B-RLBFF.*
overrides:
parameters:
model: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
files:
- filename: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
sha256: 000e8c65299fc232d1a832f1cae831ceaa16425eccfb7d01702d73e8bd3eafee
uri: huggingface://mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF/Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
- !!merge <<: *gptoss
name: "financial-gpt-oss-20b-q8-i1"
urls:
- https://huggingface.co/mradermacher/financial-gpt-oss-20b-q8-i1-GGUF
description: |
### **Financial GPT-OSS 20B (Base Model)**
**Model Type:** Causal Language Model (Fine-tuned for Financial Analysis)
**Architecture:** Mixture of Experts (MoE), 20B parameters, 32 experts (4 active per token)
**Base Model:** `unsloth/gpt-oss-20b-unsloth-bnb-4bit`
**Fine-tuned With:** LoRA (Low-Rank Adaptation) on financial conversation data
**Training Data:** 22,250 financial dialogue pairs covering stocks (AAPL, NVDA, TSLA, etc.), technical analysis, risk assessment, and trading signals
**Context Length:** 131,072 tokens
**Quantization:** Q8_0 GGUF (for efficient inference)
**License:** Apache 2.0
**Key Features:**
- Specialized in financial market analysis: technical indicators (RSI, MACD), risk assessments, trading signals, and price forecasts
- Handles complex financial queries with structured, actionable insights
- Designed for real-time use with low-latency inference (GGUF format)
- Supports S&P 500 stocks and major asset classes across tech, healthcare, energy, and finance sectors
**Use Case:** Ideal for traders, analysts, and developers building financial AI tools. Use with caution—**not financial advice**.
**Citation:**
```bibtex
@misc{financial-gpt-oss-20b-q8,
title={Financial GPT-OSS 20B Q8: Fine-tuned Financial Analysis Model},
author={beenyb},
year={2025},
publisher={Hugging Face Hub},
url={https://huggingface.co/beenyb/financial-gpt-oss-20b-q8}
}
```
overrides:
parameters:
model: financial-gpt-oss-20b-q8.i1-Q4_K_M.gguf
files:
- filename: financial-gpt-oss-20b-q8.i1-Q4_K_M.gguf
sha256: 14586673de2a769f88bd51f88464b9b1f73d3ad986fa878b2e0c1473f1c1fc59
uri: huggingface://mradermacher/financial-gpt-oss-20b-q8-i1-GGUF/financial-gpt-oss-20b-q8.i1-Q4_K_M.gguf
- !!merge <<: *llama3
name: "qwen3-grand-horror-light-1.7b"
urls:
- https://huggingface.co/mradermacher/Qwen3-Grand-Horror-Light-1.7B-GGUF
description: |
**Model Name:** Qwen3-Grand-Horror-Light-1.7B
**Base Model:** qingy2024/Qwen3-VLTO-1.7B-Instruct
**Model Type:** Fine-tuned Language Model (Text Generation)
**Size:** 1.7B parameters
**License:** Apache 2.0
**Language:** English
**Use Case:** Horror storytelling, creative writing, roleplay, scene generation
**Fine-Tuned On:** Custom horror dataset (`DavidAU/horror-nightmare1`)
**Training Method:** Fine-tuned via Unsloth
**Key Features:**
- Specialized in generating atmospheric, intense horror content with elements of madness, gore, and suspense
- Optimized for roleplay and narrative generation with low to medium horror intensity
- Supports high-quality output across multiple quantization levels (Q2_K to Q8_0, f16)
- Designed for use with tools like KoboldCpp, oobabooga/text-generation-webui, and Silly Tavern
- Recommended settings: Temperature 0.4-1.2, Repetition penalty 1.1, Smoothing factor 1.5 for smoother output
**Note:** This model is a fine-tuned variant of the Qwen3 series, not a quantized version. The original base model is available at [qingy2024/Qwen3-VLTO-1.7B-Instruct](https://huggingface.co/qingy2024/Qwen3-VLTO-1.7B-Instruct) and was further adapted for horror-themed creative writing.
**Ideal For:** Creators, writers, and roleplayers seeking a compact, expressive model for immersive horror storytelling.
overrides:
parameters:
model: Qwen3-Grand-Horror-Light-1.7B.Q4_K_M.gguf
files:
- filename: Qwen3-Grand-Horror-Light-1.7B.Q4_K_M.gguf
sha256: cbbb0c5f6874130a8ae253377fdc7ad25fa2c1e9bb45f1aaad88db853ef985dc
uri: huggingface://mradermacher/Qwen3-Grand-Horror-Light-1.7B-GGUF/Qwen3-Grand-Horror-Light-1.7B.Q4_K_M.gguf
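
A note on the gallery format above: `- &qwen3vl` defines a YAML anchor on the first entry, and each later `- !!merge <<: *qwen3vl` entry inherits its fields (url, icon, license, tags) through the merge key, restating only what differs. A minimal sketch of the pattern, with hypothetical entry names:

```yaml
# Minimal sketch of the anchor/merge pattern (hypothetical values).
- &family-base             # first entry doubles as the shared template
  license: apache-2.0
  tags:
    - llm
    - gguf
  name: "family-base-model"
- !!merge <<: *family-base # inherits license and tags from the anchor
  name: "family-variant"   # overrides only the fields it restates
```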

View File

@@ -6,15 +6,20 @@ config_file: |
     backend: "llama-cpp"
     template:
       chat_message: |
-        <|im_start|>{{ .RoleName }}
-        {{ if .FunctionCall -}}
-        {{ else if eq .RoleName "tool" -}}
+        <|im_start|>{{if eq .RoleName "tool" }}user{{else}}{{ .RoleName }}{{end}}
+        {{ if eq .RoleName "tool" -}}
+        <tool_response>
         {{ end -}}
         {{ if .Content -}}
         {{.Content }}
         {{ end -}}
+        {{ if eq .RoleName "tool" -}}
+        </tool_response>
+        {{ end -}}
         {{ if .FunctionCall -}}
+        <tool_call>
         {{toJson .FunctionCall}}
+        </tool_call>
         {{ end -}}<|im_end|>
       function: |
         <|im_start|>system
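
To make the template change concrete, the sketch below renders the new `chat_message` template for a tool-result turn. It assumes LocalAI evaluates these with Go's text/template and supplies a `toJson` helper (stubbed here); the `msg` struct merely mirrors the `.RoleName`/`.Content`/`.FunctionCall` references and is not the real LocalAI type:

```go
package main

import (
	"encoding/json"
	"os"
	"text/template"
)

// msg mirrors the fields the template dereferences; the real LocalAI
// rendering context is assumed, not verified.
type msg struct {
	RoleName     string
	Content      string
	FunctionCall map[string]any
}

const chatMessage = `<|im_start|>{{if eq .RoleName "tool" }}user{{else}}{{ .RoleName }}{{end}}
{{ if eq .RoleName "tool" -}}
<tool_response>
{{ end -}}
{{ if .Content -}}
{{.Content }}
{{ end -}}
{{ if eq .RoleName "tool" -}}
</tool_response>
{{ end -}}
{{ if .FunctionCall -}}
<tool_call>
{{toJson .FunctionCall}}
</tool_call>
{{ end -}}<|im_end|>
`

func main() {
	// toJson comes from LocalAI's template environment; a stand-in here.
	tmpl := template.Must(template.New("chat").Funcs(template.FuncMap{
		"toJson": func(v any) string { b, _ := json.Marshal(v); return string(b) },
	}).Parse(chatMessage))

	// A tool-result turn renders as:
	//
	//   <|im_start|>user
	//   <tool_response>
	//   {"temperature":21}
	//   </tool_response>
	//   <|im_end|>
	if err := tmpl.Execute(os.Stdout, msg{RoleName: "tool", Content: `{"temperature":21}`}); err != nil {
		panic(err)
	}
}
```

The visible effect of the diff: tool results are now presented to the model as a user turn wrapped in `<tool_response>` tags, and assistant function calls are wrapped in `<tool_call>` tags, the tool-message convention Qwen-style ChatML templates expect.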