diff --git a/gallery/index.yaml b/gallery/index.yaml
index 88871d300..96ca4e773 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -1,4 +1,54 @@
 ---
+- name: "qwopus3.6-27b-v2-mtp"
+  url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
+  urls:
+    - https://huggingface.co/Jackrong/Qwopus3.6-27B-v2-MTP-GGUF
+  description: |
+    🪐 Qwopus3.6-27B-v2-MTP
+    MTP Release
+
+    Multi-Token Prediction reasoning model fine-tuned from Qwen3.6-27B
+
+    🧬 Trace Inversion & Negentropy
+    🧠 27B Parameters
+    ⚡ Speculative Decoding
+    🛠️ Coding / DevOps / Math
+
+    💡 What is Qwopus3.6-27B-v2-MTP?
+    🪐 Qwopus3.6-27B-v2-MTP is a speed-oriented reasoning release built on top of Qwen3.6-27B. It keeps the Qwopus line's focus on reconstructed reasoning traces, coding discipline, DevOps procedures, and mathematical derivations, while adding Multi-Token Prediction for faster generation. The goal is simple: preserve the depth and structure of a 27B reasoning model while making real interactive use noticeably faster.
+
+    ⚡ MTP Decoding: Auxiliary future-token prediction improves throughput on long reasoning, code, math, and strict-format prompts.
+    🧩 Structured Reasoning: Inherits the Qwopus training recipe built around reconstructed step-by-step reasoning trajectories.
+    🧪 GB10 Tested: Validated on a 30-question local benchmark across Logic, Coding, DevOps, Math, and Edge tasks.
+    🚀 Practical Speed: Designed for workflows where strong answers matter, but waiting several extra minutes per task does not.
+
+    ...
+  license: "apache-2.0"
+  tags:
+    - llm
+    - gguf
+    - reasoning
+  overrides:
+    backend: llama-cpp
+    function:
+      automatic_tool_parsing_fallback: true
+      grammar:
+        disable: true
+    known_usecases:
+      - chat
+    options:
+      - use_jinja:true
+      - spec_type:draft-mtp
+      - spec_n_max:6
+      - spec_p_min:0.75
+    parameters:
+      model: llama-cpp/models/Qwopus3.6-27B-v2-MTP-GGUF/Qwopus3.6-27B-v2-MTP-Q4_K_M.gguf
+    template:
+      use_tokenizer_template: true
+  files:
+    - filename: llama-cpp/models/Qwopus3.6-27B-v2-MTP-GGUF/Qwopus3.6-27B-v2-MTP-Q4_K_M.gguf
+      sha256: 818d68223be4d8518dac0b3b5604dde633cbbcbae1f491d842a3e26711c6606d
+      uri: https://huggingface.co/Jackrong/Qwopus3.6-27B-v2-MTP-GGUF/resolve/main/Qwopus3.6-27B-v2-MTP-Q4_K_M.gguf
 - name: "qwen3.6-40b-claude-4.6-opus-deckard-heretic-uncensored-thinking-neo-code-di-imatrix-max"
   url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
   urls: