mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-14 03:37:47 -04:00
* feat(omnivoice-cpp): add C wrapper + CMake/Makefile build over OmniVoice ov_* ABI
* feat(omnivoice-cpp): add option/language parsing + WAV framing helpers with tests
* feat(omnivoice-cpp): wire purego binding with TTS + streaming TTSStream
* build(omnivoice-cpp): wire backend into root Makefile
* ci(omnivoice-cpp): add build matrix entries + dep-bump registration
* feat(omnivoice-cpp): register backend meta + image entries
* feat(omnivoice-cpp): expose as preference-only importable backend
* feat(gallery): add omnivoice-cpp TTS models (Q8_0 default + BF16 HQ)
* docs(omnivoice-cpp): document the OmniVoice TTS backend
* test(omnivoice-cpp): add env-gated e2e for TTS + streaming
* feat(omnivoice-cpp): honor tts.audio_path/tts.voice config as default cloning reference

  The model config tts.audio_path (ModelOptions.AudioPath) and tts.voice now provide a default voice-cloning reference used when a request omits Voice, so a cloned voice can be pinned in the model YAML instead of passed per request. A per-request voice still overrides. Paths resolve relative to the model dir.
* fix(omnivoice-cpp): add missing omnivoice-cpp-development backend meta

  Mirrors the whisper/vibevoice convention: a -development meta aggregating the master-tagged image variants (the production meta and per-variant prod+dev image entries already existed; only the development meta aggregator was missing).

---------
Assisted-by: claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
39 lines
1.7 KiB
C++
#pragma once

#include <cstdint>

extern "C" {

// Streaming PCM chunk callback. samples is mono float PCM at 24 kHz, valid
// only for the duration of the call. Return non-zero to continue, 0 to abort.
typedef int (*omni_pcm_chunk_cb)(const float *samples, int n_samples,
                                 void *user_data);

// Load the LM (model_path) + codec (codec_path) GGUFs. use_fa / clamp_fp16
// map to ov_init_params. Returns 0 on success, non-zero on failure.
int omni_load(const char *model_path, const char *codec_path, int use_fa,
              int clamp_fp16);

// Synthesize to a malloc'd float PCM buffer (caller frees via omni_pcm_free).
// ref_samples != null && ref_n > 0 => voice cloning (ref_text optional).
// instruct != null && non-empty => voice design. seed < 0 keeps the default
// MaskGIT seed. denoise toggles the <|denoise|> marker (only with a reference).
// Writes the sample count to *out_n. Returns NULL on failure (out_n set to 0).
float *omni_tts(const char *text, const char *lang, const char *instruct,
                const float *ref_samples, int ref_n, const char *ref_text,
                long long seed, int denoise, int *out_n);

// Streaming synthesis: cb is invoked per PCM chunk as audio is produced.
// Same reference/design/seed semantics as omni_tts. Returns 0 on success.
int omni_tts_stream(const char *text, const char *lang, const char *instruct,
                    const float *ref_samples, int ref_n, const char *ref_text,
                    long long seed, int denoise, omni_pcm_chunk_cb cb,
                    void *user_data);

// Free a buffer returned by omni_tts.
void omni_pcm_free(float *p);

// Release the OmniVoice context.
void omni_unload(void);
}
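The callback contract documented in the header (samples valid only during the call, non-zero to continue, 0 to abort) can be illustrated with a small sketch. It is not taken from the LocalAI sources: `PcmCollector`, `collect_chunk`, and `fake_stream` are hypothetical names, and `fake_stream` is only a stand-in producer so the example runs without the OmniVoice library. The `omni_pcm_chunk_cb` typedef is repeated from the header so the block compiles by itself.

```cpp
#include <cstddef>
#include <vector>

// Callback type repeated from the header so this sketch compiles alone.
typedef int (*omni_pcm_chunk_cb)(const float *samples, int n_samples,
                                 void *user_data);

// Hypothetical collector state, handed to the callback through user_data.
struct PcmCollector {
  std::vector<float> pcm;   // copied samples (mono float PCM)
  std::size_t max_samples;  // abort the stream once this many are held
};

// Copies each chunk, because `samples` is only valid during the call.
// Returns non-zero to continue and 0 to abort, as the header describes.
static int collect_chunk(const float *samples, int n_samples,
                         void *user_data) {
  auto *c = static_cast<PcmCollector *>(user_data);
  if (samples == nullptr || n_samples <= 0) return 1;  // nothing to copy
  c->pcm.insert(c->pcm.end(), samples, samples + n_samples);
  return c->pcm.size() < c->max_samples ? 1 : 0;
}

// Hypothetical stand-in for omni_tts_stream (NOT the real implementation):
// emits up to `chunks` chunks of `chunk_len` silent samples and stops early
// when the callback returns 0. Returns the number of chunks delivered.
static int fake_stream(int chunks, int chunk_len, omni_pcm_chunk_cb cb,
                       void *user_data) {
  std::vector<float> scratch(static_cast<std::size_t>(chunk_len), 0.0f);
  int delivered = 0;
  for (int i = 0; i < chunks; ++i) {
    ++delivered;
    if (cb(scratch.data(), chunk_len, user_data) == 0) break;
  }
  return delivered;
}
```

With the real backend, `collect_chunk` and a pointer to a `PcmCollector` would presumably be passed as the `cb` and `user_data` arguments of `omni_tts_stream`. Copying inside the callback is the important part, since the header states the chunk pointer is not valid after the call returns.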