Mirror of https://github.com/mudler/LocalAI.git (synced 2026-06-14 11:49:33 -04:00)
feat(qwen3-tts-cpp): migrate to ServeurpersoCom/qwentts.cpp (streaming, speakers, voice design) (#10316)
* feat(qwen3-tts-cpp): repoint upstream to ServeurpersoCom/qwentts.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(qwen3-tts-cpp): flatten qt_* ABI into qt3_* purego shim

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(qwen3-tts-cpp): build shim against upstream qwen-core static lib

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(qwen3-tts-cpp): add option/language/voice/sampling parsing

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(qwen3-tts-cpp): add 24kHz WAV encode/decode/stream-header helpers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(qwen3-tts-cpp): purego backend with streaming, speakers, voice design

Map TTSRequest onto qwentts.cpp: instructions->instruct, voice->named
speaker or clone-reference path, params map->ref_text + sampling. Add
TTSStream over the qt chunk callback.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* test(qwen3-tts-cpp): unit specs + build-gated TTS/TTSStream e2e

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* fix(qwen3-tts-cpp): close defensive PCM-free gap on zero-sample result

Register CppPCMFree before the n<=0 guard so a non-null buffer with zero
samples cannot leak (the C contract returns NULL on failure, so this is
defensive). Raised in code review.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(qwen3-tts-cpp): advertise TTSStream capability

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* chore(qwen3-tts-cpp): update backend index metadata for qwentts.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(gallery): qwentts.cpp models - base/customvoice/voicedesign, Q8_0 & Q4_K_M

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* docs(qwen3-tts-cpp): release note for qwentts.cpp migration

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* test(qwen3-tts-cpp): cover audio_path voice-cloning fallback

Add resolveRequest unit specs (config audio_path used as the clone
reference when Voice is empty; per-request audio Voice overrides it; a
named-speaker Voice does not trigger cloning) plus a real-inference e2e
that clones from audio_path (confirmed ref_spk_emb=yes in the pipeline).

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* chore(qwen3-tts-cpp): drop the release-note doc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
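The voice-resolution precedence described in the commit message (config `audio_path` as fallback, per-request audio `voice` overriding it, named speakers never triggering cloning) can be sketched as follows. This is only an illustration in Python, not the backend's Go code: the names `ResolvedVoice`, `resolve_voice` and `looks_like_audio_path` are hypothetical, and the rule that an audio voice is recognised by a `.wav` suffix is an assumption, since the commit message does not say how the backend tells a file path from a speaker name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResolvedVoice:
    """Outcome of voice resolution (hypothetical type, for illustration only)."""
    speaker: Optional[str] = None          # named speaker, e.g. "serena"
    clone_reference: Optional[str] = None  # path to a reference audio file


def looks_like_audio_path(voice: str) -> bool:
    # Assumption: a voice value ending in ".wav" is a clone-reference file.
    return voice.lower().endswith(".wav")


def resolve_voice(voice: str, config_audio_path: str = "") -> ResolvedVoice:
    """Apply the precedence listed in the commit message."""
    if voice and looks_like_audio_path(voice):
        # A per-request audio voice overrides the configured audio_path.
        return ResolvedVoice(clone_reference=voice)
    if voice:
        # A named speaker does not trigger cloning, even if audio_path is set.
        return ResolvedVoice(speaker=voice)
    if config_audio_path:
        # Empty voice: fall back to the configured audio_path as clone reference.
        return ResolvedVoice(clone_reference=config_audio_path)
    return ResolvedVoice()
```

For example, `resolve_voice("", "ref.wav")` selects `ref.wav` as the clone reference, while `resolve_voice("serena", "ref.wav")` selects the named speaker and ignores the configured file.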
@@ -3304,38 +3304,267 @@
     - filename: vibevoice-cpp-asr/tokenizer.gguf
       sha256: 37dc3b722d5677e37e29a57df55aa05c485116eeb5459e57ff8dde616b4986f6
       uri: huggingface://mudler/vibevoice.cpp-models/tokenizer.gguf
-- name: qwen3-tts-cpp
+- &qwenttscpp_gallery
+  name: qwen3-tts-cpp
   url: github:mudler/LocalAI/gallery/virtual.yaml@master
   urls:
-    - https://huggingface.co/endo5501/qwen3-tts.cpp
-    - https://github.com/predict-woo/qwen3-tts.cpp
+    - https://huggingface.co/Serveurperso/Qwen3-TTS-GGUF
+    - https://github.com/ServeurpersoCom/qwentts.cpp
   description: |
-    Qwen3-TTS 0.6B (C++ / GGML) — native C++ text-to-speech from text input.
-    Generates 24kHz mono audio. Supports 10 languages (en, zh, ja, ko, de, fr, es, it, pt, ru).
-    Uses F16 GGUF models (~2 GB total).
-  license: apache-2.0
+    Qwen3-TTS 0.6B Base (C++ / GGML, qwentts.cpp). Native C++ text-to-speech with
+    streaming output and zero-shot voice cloning (set `voice` to a 24kHz reference
+    .wav). 24kHz mono, 11 languages with Mandarin dialects. Q8_0 (~0.95 GB talker).
+  license: mit
   icon: https://huggingface.co/avatars/c299494fd1e72375832499c75b3425d6.svg
   tags:
     - tts
     - text-to-speech
+    - voice-cloning
+    - streaming
     - qwen3-tts
     - qwen3-tts-cpp
     - gguf
-  last_checked: "2026-04-30"
+  last_checked: "2026-06-13"
   overrides:
     backend: qwen3-tts-cpp
     known_usecases:
       - tts
     name: qwen3-tts-cpp
     parameters:
-      model: qwen3-tts-cpp
+      model: qwen3-tts-cpp/qwen-talker-0.6b-base-Q8_0.gguf
   files:
-    - filename: qwen3-tts-cpp/qwen3-tts-0.6b-f16.gguf
-      sha256: 0b89770118463af8f2467d824a8de57d96df6a09f927a9769a3f7b7fffa7087d
-      uri: huggingface://endo5501/qwen3-tts.cpp/qwen3-tts-0.6b-f16.gguf
-    - filename: qwen3-tts-cpp/qwen3-tts-tokenizer-f16.gguf
-      sha256: d1ad9660bd99343f4851d5a4b17e31f65648feb3559f6ea062ae6575e5cd9d90
-      uri: huggingface://endo5501/qwen3-tts.cpp/qwen3-tts-tokenizer-f16.gguf
+    - filename: qwen3-tts-cpp/qwen-talker-0.6b-base-Q8_0.gguf
+      sha256: d54dbaf10591421fa764ed630d764efa717ae40cd959bd48c66d4eb1af226426
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-0.6b-base-Q8_0.gguf
+    - filename: qwen3-tts-cpp/qwen-tokenizer-12hz-Q8_0.gguf
+      sha256: 1883beeed99348fc35e23dd225e9082f93f6f8c109330a33d935baa8acdbfd94
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q8_0.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-0.6b-base-q4
+  description: |
+    Qwen3-TTS 0.6B Base (C++ / GGML, qwentts.cpp), Q4_K_M (~0.6 GB talker).
+    Streaming + voice cloning, 24kHz mono, 11 languages.
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-0.6b-base-q4
+    parameters:
+      model: qwen3-tts-cpp-0.6b-base-q4/qwen-talker-0.6b-base-Q4_K_M.gguf
+  files:
+    - filename: qwen3-tts-cpp-0.6b-base-q4/qwen-talker-0.6b-base-Q4_K_M.gguf
+      sha256: 4b468ec7b1f62b90ef4ca316c0aa57deadfd54b2cf9651703ea753cedaf04226
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-0.6b-base-Q4_K_M.gguf
+    - filename: qwen3-tts-cpp-0.6b-base-q4/qwen-tokenizer-12hz-Q4_K_M.gguf
+      sha256: cf3788b4d50aaa665fb6e57c170396aae03a3555fea52d2b5d0cda902d658039
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q4_K_M.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-1.7b-base
+  description: |
+    Qwen3-TTS 1.7B Base (C++ / GGML, qwentts.cpp), Q8_0 (~2.0 GB talker).
+    Higher-quality streaming + voice cloning, 24kHz mono, 11 languages.
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-1.7b-base
+    parameters:
+      model: qwen3-tts-cpp-1.7b-base/qwen-talker-1.7b-base-Q8_0.gguf
+  files:
+    - filename: qwen3-tts-cpp-1.7b-base/qwen-talker-1.7b-base-Q8_0.gguf
+      sha256: 4b9a33a236908dd9435a42f7a396e38038329d053b704342a6413c08544c4fda
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-1.7b-base-Q8_0.gguf
+    - filename: qwen3-tts-cpp-1.7b-base/qwen-tokenizer-12hz-Q8_0.gguf
+      sha256: 1883beeed99348fc35e23dd225e9082f93f6f8c109330a33d935baa8acdbfd94
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q8_0.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-1.7b-base-q4
+  description: |
+    Qwen3-TTS 1.7B Base (C++ / GGML, qwentts.cpp), Q4_K_M (~1.2 GB talker).
+    Streaming + voice cloning, 24kHz mono, 11 languages.
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-1.7b-base-q4
+    parameters:
+      model: qwen3-tts-cpp-1.7b-base-q4/qwen-talker-1.7b-base-Q4_K_M.gguf
+  files:
+    - filename: qwen3-tts-cpp-1.7b-base-q4/qwen-talker-1.7b-base-Q4_K_M.gguf
+      sha256: ea393ebaf2167ea23ce9fc18b093822851358a950d7075cd47ab4f6ce23e887d
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-1.7b-base-Q4_K_M.gguf
+    - filename: qwen3-tts-cpp-1.7b-base-q4/qwen-tokenizer-12hz-Q4_K_M.gguf
+      sha256: cf3788b4d50aaa665fb6e57c170396aae03a3555fea52d2b5d0cda902d658039
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q4_K_M.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-customvoice
+  description: |
+    Qwen3-TTS 0.6B CustomVoice (C++ / GGML, qwentts.cpp), Q8_0. Named speakers
+    selected via the `voice` field: serena, vivian, uncle_fu, ryan, aiden,
+    ono_anna, sohee, eric (sichuan dialect), dylan (beijing dialect). Streaming,
+    24kHz mono, 11 languages.
+  tags:
+    - tts
+    - text-to-speech
+    - named-speakers
+    - streaming
+    - qwen3-tts
+    - qwen3-tts-cpp
+    - gguf
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-customvoice
+    parameters:
+      model: qwen3-tts-cpp-customvoice/qwen-talker-0.6b-customvoice-Q8_0.gguf
+  files:
+    - filename: qwen3-tts-cpp-customvoice/qwen-talker-0.6b-customvoice-Q8_0.gguf
+      sha256: 4eb38675c736ed6ac72012846ac8d6ef80e5af8bc05726870f0b3a6569588519
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-0.6b-customvoice-Q8_0.gguf
+    - filename: qwen3-tts-cpp-customvoice/qwen-tokenizer-12hz-Q8_0.gguf
+      sha256: 1883beeed99348fc35e23dd225e9082f93f6f8c109330a33d935baa8acdbfd94
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q8_0.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-customvoice-q4
+  description: |
+    Qwen3-TTS 0.6B CustomVoice (C++ / GGML, qwentts.cpp), Q4_K_M. Named speakers
+    via the `voice` field (serena, vivian, ryan, aiden, eric, dylan, ...).
+    Streaming, 24kHz mono, 11 languages.
+  tags:
+    - tts
+    - text-to-speech
+    - named-speakers
+    - streaming
+    - qwen3-tts
+    - qwen3-tts-cpp
+    - gguf
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-customvoice-q4
+    parameters:
+      model: qwen3-tts-cpp-customvoice-q4/qwen-talker-0.6b-customvoice-Q4_K_M.gguf
+  files:
+    - filename: qwen3-tts-cpp-customvoice-q4/qwen-talker-0.6b-customvoice-Q4_K_M.gguf
+      sha256: b3a7e6613d80f8a703c06267fc1e94d48ce91932ab82ab6e31c50f4ca4868e1e
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-0.6b-customvoice-Q4_K_M.gguf
+    - filename: qwen3-tts-cpp-customvoice-q4/qwen-tokenizer-12hz-Q4_K_M.gguf
+      sha256: cf3788b4d50aaa665fb6e57c170396aae03a3555fea52d2b5d0cda902d658039
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q4_K_M.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-1.7b-customvoice
+  description: |
+    Qwen3-TTS 1.7B CustomVoice (C++ / GGML, qwentts.cpp), Q8_0. Named speakers via
+    the `voice` field (serena, vivian, ryan, aiden, eric, dylan, ...). Streaming,
+    24kHz mono, 11 languages.
+  tags:
+    - tts
+    - text-to-speech
+    - named-speakers
+    - streaming
+    - qwen3-tts
+    - qwen3-tts-cpp
+    - gguf
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-1.7b-customvoice
+    parameters:
+      model: qwen3-tts-cpp-1.7b-customvoice/qwen-talker-1.7b-customvoice-Q8_0.gguf
+  files:
+    - filename: qwen3-tts-cpp-1.7b-customvoice/qwen-talker-1.7b-customvoice-Q8_0.gguf
+      sha256: cab2cff67a0a557310febe558dc83076b28ed790e491867eb2751759f4cd89fa
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-1.7b-customvoice-Q8_0.gguf
+    - filename: qwen3-tts-cpp-1.7b-customvoice/qwen-tokenizer-12hz-Q8_0.gguf
+      sha256: 1883beeed99348fc35e23dd225e9082f93f6f8c109330a33d935baa8acdbfd94
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q8_0.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-1.7b-customvoice-q4
+  description: |
+    Qwen3-TTS 1.7B CustomVoice (C++ / GGML, qwentts.cpp), Q4_K_M. Named speakers
+    via the `voice` field. Streaming, 24kHz mono, 11 languages.
+  tags:
+    - tts
+    - text-to-speech
+    - named-speakers
+    - streaming
+    - qwen3-tts
+    - qwen3-tts-cpp
+    - gguf
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-1.7b-customvoice-q4
+    parameters:
+      model: qwen3-tts-cpp-1.7b-customvoice-q4/qwen-talker-1.7b-customvoice-Q4_K_M.gguf
+  files:
+    - filename: qwen3-tts-cpp-1.7b-customvoice-q4/qwen-talker-1.7b-customvoice-Q4_K_M.gguf
+      sha256: cc328834a631bc08bf9f43e62fa23f8a1383d9b429864ce6690cfb172077fc4a
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-1.7b-customvoice-Q4_K_M.gguf
+    - filename: qwen3-tts-cpp-1.7b-customvoice-q4/qwen-tokenizer-12hz-Q4_K_M.gguf
+      sha256: cf3788b4d50aaa665fb6e57c170396aae03a3555fea52d2b5d0cda902d658039
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q4_K_M.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-1.7b-voicedesign
+  description: |
+    Qwen3-TTS 1.7B VoiceDesign (C++ / GGML, qwentts.cpp), Q8_0. Synthesises a
+    speaker from a free-text attribute instruction - REQUIRES the OpenAI
+    `instructions` field (e.g. "male, young adult, moderate pitch"); requests
+    without it are rejected. Streaming, 24kHz mono, 11 languages.
+  tags:
+    - tts
+    - text-to-speech
+    - voice-design
+    - streaming
+    - qwen3-tts
+    - qwen3-tts-cpp
+    - gguf
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-1.7b-voicedesign
+    parameters:
+      model: qwen3-tts-cpp-1.7b-voicedesign/qwen-talker-1.7b-voicedesign-Q8_0.gguf
+  files:
+    - filename: qwen3-tts-cpp-1.7b-voicedesign/qwen-talker-1.7b-voicedesign-Q8_0.gguf
+      sha256: 575610ab1ddcca4dca6bd9a64bcd859d93bbad8764f9cab24e1dbc0c51f62276
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-1.7b-voicedesign-Q8_0.gguf
+    - filename: qwen3-tts-cpp-1.7b-voicedesign/qwen-tokenizer-12hz-Q8_0.gguf
+      sha256: 1883beeed99348fc35e23dd225e9082f93f6f8c109330a33d935baa8acdbfd94
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q8_0.gguf
+- !!merge <<: *qwenttscpp_gallery
+  name: qwen3-tts-cpp-1.7b-voicedesign-q4
+  description: |
+    Qwen3-TTS 1.7B VoiceDesign (C++ / GGML, qwentts.cpp), Q4_K_M. Synthesises a
+    speaker from a free-text attribute instruction - REQUIRES the `instructions`
+    field. Streaming, 24kHz mono, 11 languages.
+  tags:
+    - tts
+    - text-to-speech
+    - voice-design
+    - streaming
+    - qwen3-tts
+    - qwen3-tts-cpp
+    - gguf
+  overrides:
+    backend: qwen3-tts-cpp
+    known_usecases:
+      - tts
+    name: qwen3-tts-cpp-1.7b-voicedesign-q4
+    parameters:
+      model: qwen3-tts-cpp-1.7b-voicedesign-q4/qwen-talker-1.7b-voicedesign-Q4_K_M.gguf
+  files:
+    - filename: qwen3-tts-cpp-1.7b-voicedesign-q4/qwen-talker-1.7b-voicedesign-Q4_K_M.gguf
+      sha256: 7605ed0cc5e72059f27468c27f70c070e05d1cc0c7b1c76bfb9cba717a59eee3
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-talker-1.7b-voicedesign-Q4_K_M.gguf
+    - filename: qwen3-tts-cpp-1.7b-voicedesign-q4/qwen-tokenizer-12hz-Q4_K_M.gguf
+      sha256: cf3788b4d50aaa665fb6e57c170396aae03a3555fea52d2b5d0cda902d658039
+      uri: huggingface://Serveurperso/Qwen3-TTS-GGUF/qwen-tokenizer-12hz-Q4_K_M.gguf
 - name: omnivoice-cpp
   url: github:mudler/LocalAI/gallery/virtual.yaml@master
   urls:
@@ -3402,39 +3631,6 @@
     - filename: omnivoice-cpp-hq/omnivoice-tokenizer-BF16.gguf
       sha256: c2179e4cf528b19fea22a5be94c34c083877bb5fc28ac0245d2b4299a262dcec
       uri: huggingface://Serveurperso/OmniVoice-GGUF/omnivoice-tokenizer-BF16.gguf
-- name: qwen3-tts-cpp-customvoice
-  url: github:mudler/LocalAI/gallery/virtual.yaml@master
-  urls:
-    - https://huggingface.co/endo5501/qwen3-tts.cpp
-    - https://github.com/predict-woo/qwen3-tts.cpp
-  description: |
-    Qwen3-TTS 0.6B Custom Voice (C++ / GGML) — text-to-speech with voice cloning support.
-    Generates 24kHz mono audio with optional reference audio for voice cloning via ECAPA-TDNN speaker embeddings.
-    Supports 10 languages (en, zh, ja, ko, de, fr, es, it, pt, ru).
-  license: apache-2.0
-  icon: https://huggingface.co/avatars/c299494fd1e72375832499c75b3425d6.svg
-  tags:
-    - tts
-    - text-to-speech
-    - voice-cloning
-    - qwen3-tts
-    - qwen3-tts-cpp
-    - gguf
-  last_checked: "2026-04-30"
-  overrides:
-    backend: qwen3-tts-cpp
-    known_usecases:
-      - tts
-    name: qwen3-tts-cpp-customvoice
-    parameters:
-      model: qwen3-tts-cpp-customvoice
-  files:
-    - filename: qwen3-tts-cpp-customvoice/qwen3-tts-0.6b-customvoice-f16.gguf
-      sha256: 40b985b71be0970d41eb042488766db556cf17290aa1cff631cabfa0bd3b0431
-      uri: huggingface://endo5501/qwen3-tts.cpp/qwen3-tts-0.6b-customvoice-f16.gguf
-    - filename: qwen3-tts-cpp-customvoice/qwen3-tts-tokenizer-f16.gguf
-      sha256: d1ad9660bd99343f4851d5a4b17e31f65648feb3559f6ea062ae6575e5cd9d90
-      uri: huggingface://endo5501/qwen3-tts.cpp/qwen3-tts-tokenizer-f16.gguf
 - name: qwen3-coder-next-mxfp4_moe
   url: github:mudler/LocalAI/gallery/virtual.yaml@master
   urls:
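The gallery descriptions above distinguish three ways of choosing a voice: a reference .wav for the Base models, a named speaker for CustomVoice, and a mandatory `instructions` string for VoiceDesign. The sketch below builds the corresponding request bodies, assuming the OpenAI-style speech request fields `model`, `input`, `voice` and `instructions`. The helper `build_speech_request`, the sample text and the reference path are hypothetical, and the client-side check only mirrors the "requests without it are rejected" note; it is not the backend's own validation.

```python
import json
from typing import Optional


def build_speech_request(model: str, text: str,
                         voice: Optional[str] = None,
                         instructions: Optional[str] = None) -> dict:
    """Build an OpenAI-style speech request body (hypothetical helper)."""
    # Assumption: VoiceDesign models are recognised by their gallery name.
    if "voicedesign" in model and not instructions:
        raise ValueError("VoiceDesign models require the `instructions` field")
    body = {"model": model, "input": text}
    if voice:
        body["voice"] = voice
    if instructions:
        body["instructions"] = instructions
    return body


# Base model: `voice` points at a 24kHz reference .wav (path is a placeholder).
base = build_speech_request("qwen3-tts-cpp", "Hello there.",
                            voice="/path/to/reference.wav")

# CustomVoice model: `voice` names one of the listed speakers.
custom = build_speech_request("qwen3-tts-cpp-customvoice", "Hello there.",
                              voice="serena")

# VoiceDesign model: the speaker is described in free text.
design = build_speech_request("qwen3-tts-cpp-1.7b-voicedesign", "Hello there.",
                              instructions="male, young adult, moderate pitch")

if __name__ == "__main__":
    for body in (base, custom, design):
        print(json.dumps(body))
```

Each body would then be sent as JSON to the server's speech endpoint; the transport is left out so the sketch stays independent of any particular deployment.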