Mirror of https://github.com/mudler/LocalAI.git, synced 2026-06-22 15:49:12 -04:00
* feat(ced): sketch sound-classification backend (CED audio tagger)

Wires ced.cpp (CED, 527-class AudioSet sound-event tagger; baby cry, footsteps, glass, alarms, dog bark) into LocalAI as a Go/purego backend.

SKETCH (backend skeleton real; core REST wiring + CI/gallery is a checklist in DESIGN.md):

- backend/backend.proto: new SoundDetection rpc + SoundClass messages (run `make protogen-go` to regenerate pkg/grpc/proto).
- backend/go/ced: main.go (purego dlopen libced.so + ced_capi.h), goced.go (Ced gRPC backend: Load + SoundDetection), Makefile (clone-at-pin CED_VERSION, ggml static-PIC shared build), run.sh, package.sh, .gitignore.
- DESIGN.md: REST /v1/audio/classification wiring (handler/route/capability registration checklist), gallery/index + CI registration, and a scoping note for the realtime/websocket live-recognition path (sliding-window classify over the existing ws transport + voicegate; the ced C-API per-PCM entry point is already window-friendly).

Backend code does not compile until protogen-go regenerates the pb types and a libced.so is built (Makefile clones+builds it).

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): REST /v1/audio/classification endpoint + capability registration

Wires the ced sound-event classification backend (AudioSet audio tagger) end to end through the REST surface, mirroring the transcription path.

- Handler: core/http/endpoints/openai/sound_classification.go parses the multipart audio upload, temp-files it, resolves the model config and calls the SoundDetection RPC; returns {model, detections[]} JSON.
- Backend wrapper: core/backend/sound_classification.go (ModelSoundDetection) loads the model and normalizes the proto response into schema types.
- Schema: core/schema/sound_classification.go (SoundClassificationResult).
- gRPC layer: SoundDetection wired through the LocalAI wrapper (interface, Backend client, Client, embed, server, base default) so the loader-typed client exposes the RPC; proto regenerated via make protogen-go.
- Route: POST /v1/audio/classification (+ /audio/classification alias) with the audio/multipart default-model middleware in routes/openai.go.
- Capability surfaces: swagger @Tags/@Router on the handler; FLAG_SOUND_CLASSIFICATION usecase flag + UsecaseSoundClassification + UsecaseInfoMap + GuessUsecases + ModalityGroups + GetAllModelConfigUsecases; meta usecase option; /api/instructions audio area updated; auth RouteFeatureRegistry + FeatureAudioClassification (APIFeatures, default ON) + FeatureMetas; UI usecaseFilters, capabilities.js CAP_SOUND_CLASSIFICATION, Models.jsx filter + i18n; docs page features/audio-classification.md + whats-new + crosslink.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): realtime sound-event detection over the websocket API

When a realtime pipeline configures a sound-classification model, each VAD-committed utterance (the same window the transcription path produces) is also run through the CED sound-event classifier and the scored AudioSet tags are emitted as a new server event. No new backend rpc is needed: the SoundDetection gRPC method already exists on this branch.

- config: add Pipeline.SoundDetection (yaml/json sound_detection,omitempty) beside Transcription/VAD.
- realtime: add Model.SoundDetection(ctx, audio, topK, threshold) to the ModelInterface; implement it on wrappedModel and transcriptOnlyModel by calling backend.ModelSoundDetection with the session's sound-classification model config (mirrors how Transcribe dispatches). Load the optional config in newModel / newTranscriptionOnlyModel; nil config keeps it additive.
- types: add ConversationItemSoundDetectionEvent (item_id, content_index, detections[]{label,score,index}) with type conversation.item.sound_detection, its ServerEventType constant and MarshalJSON, mirroring the transcription completed event.
- realtime: add emitSoundDetection (unary path: classify the committed window, build the event, t.SendEvent) and wire it at the utterance-commit hook right after emitTranscription; gated on session.SoundDetectionEnabled (resolved from Pipeline.SoundDetection at session setup, defaults top_k=5, threshold=0). Its error is logged via xlog but never aborts the turn.
- test: Ginkgo specs for emitSoundDetection (tags emitted, empty detections, classifier error) plus a SoundDetection method on the fakeModel double.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(ced): implement SoundDetection in nodes backend test doubles

The SoundDetection method added to the grpc backend interface left two test doubles (fakeBackendClient, fakeGRPCBackend) incomplete, so core/services/nodes failed to compile under `go vet`/`go test` (go build missed it: the doubles live in _test.go). Add the method to both, mirroring their existing Detect mock. Repairs CI for the nodes package.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): decouple realtime sound detection from VAD (sound-only sessions)

Sound-event detection must activate on sounds, not speech, so it no longer runs through the voice VAD/transcription path. A sound-detection-only pipeline (sound_detection set, no transcription/LLM) now:

- is accepted by prepareRealtimeConfig (sound_detection counts as a pipeline stage),
- builds a lightweight model via newSoundDetectionOnlyModel (no VAD/STT/LLM/TTS loaded), and
- defaults the session to turn_detection none (no VAD) with no transcription stage, so the client drives windowing via input_audio_buffer.commit (option A: client-side sliding window). The per-PCM C-API already supports arbitrary windows.

commitUtterance gains a sound-only branch: it emits the conversation.item.sound_detection event (scored AudioSet tags) and stops - no transcription, no LLM response. generateResponse is now guarded on a transcription stage being present, so a sound-only turn never invokes the LLM. Existing transcription/VAD sessions are unchanged (additive).

Added a commitUtterance sound-only Ginkgo spec asserting it emits the sound event and neither transcribes nor generates a response.

go vet + golangci-lint (new-from-merge-base) clean; openai suite green.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): register sound-classification backend in gallery + CI

Mechanical backend-image registration for the ced sound-event classifier, mirroring the parakeet-cpp Go/purego backend everywhere it is wired up.

- .github/backend-matrix.yml: add the ced build matrix, field-for-field copies of the parakeet-cpp entries (cpu amd64/arm64, cublas cuda 12/13 amd64, l4t cuda-13 arm64, l4t-jetpack cuda-12 arm64, sycl f32/f16, vulkan amd64/arm64, rocm hipblas, and the metal darwin entry), changing only backend and tag-suffix. dockerfile stays ./backend/Dockerfile.golang.
- backend/index.yaml: add the &ced meta anchor (capabilities map per platform) plus ced-development and the per-arch image entries, each uri/mirror tag-suffix matching the matrix exactly. The model gallery (GGUF) entry is intentionally deferred pending the HuggingFace publish (TODO note inline).
- scripts/changed-backends.js: add an explicit item.backend === "ced" branch in inferBackendPath mapping to backend/go/ced/, same mechanism and ordering as the parakeet-cpp branch (before the generic golang fallthrough).
- .github/workflows/bump_deps.yaml: register mudler/ced.cpp -> CED_VERSION in backend/go/ced/Makefile so the daily bot bumps the pin.
- swagger/{docs.go,swagger.json,swagger.yaml}: regenerated via make swagger so the existing /v1/audio/classification annotations land in the generated spec.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): server-side windowing for realtime sound detection (option B)

Adds an optional server-driven sliding-window classifier so a sound-only realtime client only has to stream audio (no input_audio_buffer.commit):

- Pipeline.sound_detection_window_ms / sound_detection_hop_ms config knobs. When both > 0 on a sound-only session, the server classifies the last window of streamed audio every hop and emits a conversation.item.sound_detection event; the input buffer is trimmed to one window so a long stream stays bounded. When unset, the session stays client-driven (option A). Runs independent of VAD (sound events are not speech).
- handleSoundWindow (ticker) + classifySoundWindow (one tick, extracted so it is unit-testable) + writeWindowWAV, which declares the true InputSampleRate (NewWAVHeaderWithRate) so the classifier resamples correctly. Goroutine is started after toggleVAD and torn down with the session (close + wg.Wait).
- Register pipeline.sound_detection (+window_ms/hop_ms) in the config meta registry; the earlier realtime commit added pipeline.sound_detection without a registry entry, failing TestAllFieldsHaveRegistryEntries. This fixes that and covers the two new knobs.

Tests: classifySoundWindow emits an event + trims the buffer to one window, no-ops on too-little audio; writeWindowWAV declares the given sample rate.

go build/vet + golangci-lint (new-from-merge-base) clean; config + openai suites green.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): add ced-base GGUF model gallery entries (f16 + q8_0)

The ced-base weights are now published at mudler/ced-base-gguf (Apache-2.0, converted from mispeech/ced-base). Adds gallery/ced.yaml (backend: ced + known_usecases: sound_classification) and two gallery/index.yaml entries (ced-base-f16 default, ced-base-q8 smallest) with sha256-pinned files, and removes the now-resolved TODO from backend/index.yaml.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ced): add tiny/mini/small GGUF model gallery entries

Publishes the rest of the CED family (same architecture, metadata-driven port verified end-to-end on ced-tiny) to mudler/ced-{tiny,mini,small}-gguf and adds their f16 + q8_0 gallery entries:

ced-tiny (5.5M, edge/Pi-class) f16 11MB / q8_0 6MB
ced-mini (9.6M) f16 19MB / q8_0 11MB
ced-small (22M) f16 42MB / q8_0 23MB

All sha256-pinned. ced-base remains the accuracy default.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ced): point gallery entries at the consolidated mudler/ced-gguf repo

All CED quantizations (tiny/mini/small/base, f16/q8_0) now live in a single HuggingFace repo, mudler/ced-gguf, instead of per-model repos. Repoint the 8 gallery model entries' urls + file uris accordingly. sha256 and filenames are unchanged.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ced): bump CED_VERSION to the short-clip fix

Pin the ced backend to ced.cpp 99c6ed3, which fixes a crash on any clip shorter than target_length (~10.11s): time_pos_embed was added at its full 63-frame grid instead of being sliced to the clip's actual time grid, tripping ggml_can_repeat in ggml_add. Surfaced by the live realtime e2e (sub-10s windows) and gated with a short-clip parity test upstream.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs(ced): list ced.cpp as a LocalAI-team engine + backend-guide directive

- README.md: add ced.cpp to the "native C/C++/GGML engines developed and maintained by the LocalAI project" table.
- docs/content/features/backends.md: add a Sound Classification backend category (sound-event classification / audio tagging) listing ced.cpp.
- .agents/adding-backends.md: add a "Documenting the backend" section and two verification-checklist items requiring new backends to be documented in the backends.md category list, and in-house native engines to be added to the README maintained-engines table. This directive was missing.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ced): repin CED_VERSION to the v0.1.0 release commit

ced.cpp history was squashed into a single release commit (tagged v0.1.0), so the previous pin (99c6ed3) no longer exists upstream. Pin to c04ac14, the v0.1.0 release commit, so the backend builds against a commit that exists.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(ced): silence gosec G304/G103 + govet unsafeptr on audited paths

- sound_classification.go: os.Create(dst) where dst = temp dir + path.Base of the upload (no traversal). #nosec G304, matching the depth-anything-cpp handler.
- goced.go: reading a NUL-terminated C string from a libced-owned buffer. #nosec G103 (gosec) + //nolint:govet (golangci-lint's unsafeptr check), since the uintptr is a C-owned malloc'd buffer, not Go-GC memory.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
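The server-side windowing (option B) described in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LocalAI implementation: `windowSamples` and `trimToWindow` are hypothetical helper names, samples are assumed to be 16-bit mono PCM at 16 kHz, and the ticker that would fire once per hop is replaced by a plain loop.

```go
package main

import "fmt"

// windowSamples converts a window length in milliseconds into a sample count
// at the given sample rate. Hypothetical helper, not a LocalAI function.
func windowSamples(sampleRate, windowMs int) int {
	return sampleRate * windowMs / 1000
}

// trimToWindow returns the most recent n samples of buf and true. When fewer
// than n samples are buffered it returns buf unchanged and false, so the
// caller can skip classification for this tick. Hypothetical helper.
func trimToWindow(buf []int16, n int) ([]int16, bool) {
	if n <= 0 || len(buf) < n {
		return buf, false
	}
	// Copy the tail so older samples can be garbage collected and a long
	// stream stays bounded to one window.
	window := make([]int16, n)
	copy(window, buf[len(buf)-n:])
	return window, true
}

func main() {
	const sampleRate = 16000 // assumed input sample rate
	n := windowSamples(sampleRate, 1000)

	var buf []int16
	chunk := make([]int16, windowSamples(sampleRate, 400))
	for tick := 1; tick <= 4; tick++ {
		buf = append(buf, chunk...) // simulate 400 ms of streamed audio
		window, ok := trimToWindow(buf, n)
		if !ok {
			fmt.Printf("tick %d: %d samples buffered, waiting\n", tick, len(buf))
			continue
		}
		buf = window // keep only the last window
		fmt.Printf("tick %d: classify %d samples\n", tick, len(window))
	}
}
```

In a real session the classification call would replace the `Printf`, and the no-op branch corresponds to the "too-little audio" case mentioned in the tests.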
733 lines
21 KiB
Go
package grpc

import (
	"context"
	"io"
	"sync"

	pb "github.com/mudler/LocalAI/pkg/grpc/proto"
	"google.golang.org/grpc"
	"google.golang.org/grpc/metadata"
)

var _ Backend = new(embedBackend)
var _ pb.Backend_PredictStreamServer = new(embedBackendServerStream)

type embedBackend struct {
	s *server
}

func (e *embedBackend) IsBusy() bool {
	return e.s.llm.Busy()
}

func (e *embedBackend) HealthCheck(ctx context.Context) (bool, error) {
	return true, nil
}

func (e *embedBackend) Embeddings(ctx context.Context, in *pb.PredictOptions, opts ...grpc.CallOption) (*pb.EmbeddingResult, error) {
	return e.s.Embedding(ctx, in)
}

func (e *embedBackend) Predict(ctx context.Context, in *pb.PredictOptions, opts ...grpc.CallOption) (*pb.Reply, error) {
	return e.s.Predict(ctx, in)
}

func (e *embedBackend) LoadModel(ctx context.Context, in *pb.ModelOptions, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.LoadModel(ctx, in)
}

func (e *embedBackend) PredictStream(ctx context.Context, in *pb.PredictOptions, f func(reply *pb.Reply), opts ...grpc.CallOption) error {
	bs := &embedBackendServerStream{
		ctx: ctx,
		fn:  f,
	}
	return e.s.PredictStream(in, bs)
}

func (e *embedBackend) GenerateImage(ctx context.Context, in *pb.GenerateImageRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.GenerateImage(ctx, in)
}

func (e *embedBackend) GenerateVideo(ctx context.Context, in *pb.GenerateVideoRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.GenerateVideo(ctx, in)
}

func (e *embedBackend) TTS(ctx context.Context, in *pb.TTSRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.TTS(ctx, in)
}

func (e *embedBackend) TTSStream(ctx context.Context, in *pb.TTSRequest, f func(reply *pb.Reply), opts ...grpc.CallOption) error {
	bs := &embedBackendServerStream{
		ctx: ctx,
		fn:  f,
	}
	return e.s.TTSStream(in, bs)
}

func (e *embedBackend) SoundGeneration(ctx context.Context, in *pb.SoundGenerationRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.SoundGeneration(ctx, in)
}

func (e *embedBackend) Detect(ctx context.Context, in *pb.DetectOptions, opts ...grpc.CallOption) (*pb.DetectResponse, error) {
	return e.s.Detect(ctx, in)
}

func (e *embedBackend) Depth(ctx context.Context, in *pb.DepthRequest, opts ...grpc.CallOption) (*pb.DepthResponse, error) {
	return e.s.Depth(ctx, in)
}

func (e *embedBackend) FaceVerify(ctx context.Context, in *pb.FaceVerifyRequest, opts ...grpc.CallOption) (*pb.FaceVerifyResponse, error) {
	return e.s.FaceVerify(ctx, in)
}

func (e *embedBackend) FaceAnalyze(ctx context.Context, in *pb.FaceAnalyzeRequest, opts ...grpc.CallOption) (*pb.FaceAnalyzeResponse, error) {
	return e.s.FaceAnalyze(ctx, in)
}

func (e *embedBackend) VoiceVerify(ctx context.Context, in *pb.VoiceVerifyRequest, opts ...grpc.CallOption) (*pb.VoiceVerifyResponse, error) {
	return e.s.VoiceVerify(ctx, in)
}

func (e *embedBackend) VoiceAnalyze(ctx context.Context, in *pb.VoiceAnalyzeRequest, opts ...grpc.CallOption) (*pb.VoiceAnalyzeResponse, error) {
	return e.s.VoiceAnalyze(ctx, in)
}

func (e *embedBackend) VoiceEmbed(ctx context.Context, in *pb.VoiceEmbedRequest, opts ...grpc.CallOption) (*pb.VoiceEmbedResponse, error) {
	return e.s.VoiceEmbed(ctx, in)
}

func (e *embedBackend) AudioTranscription(ctx context.Context, in *pb.TranscriptRequest, opts ...grpc.CallOption) (*pb.TranscriptResult, error) {
	return e.s.AudioTranscription(ctx, in)
}

func (e *embedBackend) AudioTranscriptionStream(ctx context.Context, in *pb.TranscriptRequest, f func(chunk *pb.TranscriptStreamResponse), opts ...grpc.CallOption) error {
	bs := &embedBackendAudioTranscriptionStream{
		ctx: ctx,
		fn:  f,
	}
	return e.s.AudioTranscriptionStream(in, bs)
}

func (e *embedBackend) TokenizeString(ctx context.Context, in *pb.PredictOptions, opts ...grpc.CallOption) (*pb.TokenizationResponse, error) {
	return e.s.TokenizeString(ctx, in)
}

func (e *embedBackend) Status(ctx context.Context) (*pb.StatusResponse, error) {
	return e.s.Status(ctx, &pb.HealthMessage{})
}

func (e *embedBackend) StoresSet(ctx context.Context, in *pb.StoresSetOptions, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.StoresSet(ctx, in)
}

func (e *embedBackend) StoresDelete(ctx context.Context, in *pb.StoresDeleteOptions, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.StoresDelete(ctx, in)
}

func (e *embedBackend) StoresGet(ctx context.Context, in *pb.StoresGetOptions, opts ...grpc.CallOption) (*pb.StoresGetResult, error) {
	return e.s.StoresGet(ctx, in)
}

func (e *embedBackend) StoresFind(ctx context.Context, in *pb.StoresFindOptions, opts ...grpc.CallOption) (*pb.StoresFindResult, error) {
	return e.s.StoresFind(ctx, in)
}

func (e *embedBackend) Rerank(ctx context.Context, in *pb.RerankRequest, opts ...grpc.CallOption) (*pb.RerankResult, error) {
	return e.s.Rerank(ctx, in)
}

func (e *embedBackend) TokenClassify(ctx context.Context, in *pb.TokenClassifyRequest, opts ...grpc.CallOption) (*pb.TokenClassifyResponse, error) {
	return e.s.TokenClassify(ctx, in)
}

func (e *embedBackend) Score(ctx context.Context, in *pb.ScoreRequest, opts ...grpc.CallOption) (*pb.ScoreResponse, error) {
	return e.s.Score(ctx, in)
}

func (e *embedBackend) VAD(ctx context.Context, in *pb.VADRequest, opts ...grpc.CallOption) (*pb.VADResponse, error) {
	return e.s.VAD(ctx, in)
}

func (e *embedBackend) Diarize(ctx context.Context, in *pb.DiarizeRequest, opts ...grpc.CallOption) (*pb.DiarizeResponse, error) {
	return e.s.Diarize(ctx, in)
}

func (e *embedBackend) SoundDetection(ctx context.Context, in *pb.SoundDetectionRequest, opts ...grpc.CallOption) (*pb.SoundDetectionResponse, error) {
	return e.s.SoundDetection(ctx, in)
}

func (e *embedBackend) AudioEncode(ctx context.Context, in *pb.AudioEncodeRequest, opts ...grpc.CallOption) (*pb.AudioEncodeResult, error) {
	return e.s.AudioEncode(ctx, in)
}

func (e *embedBackend) AudioDecode(ctx context.Context, in *pb.AudioDecodeRequest, opts ...grpc.CallOption) (*pb.AudioDecodeResult, error) {
	return e.s.AudioDecode(ctx, in)
}

func (e *embedBackend) AudioTransform(ctx context.Context, in *pb.AudioTransformRequest, opts ...grpc.CallOption) (*pb.AudioTransformResult, error) {
	return e.s.AudioTransform(ctx, in)
}

func (e *embedBackend) AudioTransformStream(ctx context.Context, opts ...grpc.CallOption) (AudioTransformStreamClient, error) {
	// In-process bidi stream is two channels paired with two facades:
	// the server side reads requests / writes responses; the client side
	// is its mirror.
	reqs := make(chan *pb.AudioTransformFrameRequest, 4)
	resps := make(chan *pb.AudioTransformFrameResponse, 4)
	srvDone := make(chan error, 1)

	server := &embedBackendAudioTransformStream{
		ctx:   ctx,
		reqs:  reqs,
		resps: resps,
	}

	go func() {
		err := e.s.AudioTransformStream(server)
		// Backend has finished: no more responses will arrive.
		close(resps)
		srvDone <- err
	}()

	return &embedBackendAudioTransformStreamClient{
		ctx:     ctx,
		reqs:    reqs,
		resps:   resps,
		srvDone: srvDone,
	}, nil
}

func (e *embedBackend) Forward(ctx context.Context, opts ...grpc.CallOption) (ForwardClient, error) {
	reqs := make(chan *pb.ForwardRequest, 8)
	resps := make(chan *pb.ForwardReply, 8)
	srvDone := make(chan error, 1)

	server := &embedBackendForwardStream{ctx: ctx, reqs: reqs, resps: resps}

	go func() {
		err := e.s.Forward(server)
		close(resps)
		srvDone <- err
	}()

	return &embedBackendForwardStreamClient{
		ctx:     ctx,
		reqs:    reqs,
		resps:   resps,
		srvDone: srvDone,
	}, nil
}

func (e *embedBackend) AudioToAudioStream(ctx context.Context, opts ...grpc.CallOption) (AudioToAudioStreamClient, error) {
	reqs := make(chan *pb.AudioToAudioRequest, 8)
	resps := make(chan *pb.AudioToAudioResponse, 8)
	srvDone := make(chan error, 1)

	server := &embedBackendAudioToAudioStream{
		ctx:   ctx,
		reqs:  reqs,
		resps: resps,
	}

	go func() {
		err := e.s.AudioToAudioStream(server)
		close(resps)
		srvDone <- err
	}()

	return &embedBackendAudioToAudioStreamClient{
		ctx:     ctx,
		reqs:    reqs,
		resps:   resps,
		srvDone: srvDone,
	}, nil
}

func (e *embedBackend) ModelMetadata(ctx context.Context, in *pb.ModelOptions, opts ...grpc.CallOption) (*pb.ModelMetadataResponse, error) {
	return e.s.ModelMetadata(ctx, in)
}

func (e *embedBackend) GetTokenMetrics(ctx context.Context, in *pb.MetricsRequest, opts ...grpc.CallOption) (*pb.MetricsResponse, error) {
	return e.s.GetMetrics(ctx, in)
}

func (e *embedBackend) StartFineTune(ctx context.Context, in *pb.FineTuneRequest, opts ...grpc.CallOption) (*pb.FineTuneJobResult, error) {
	return e.s.StartFineTune(ctx, in)
}

func (e *embedBackend) FineTuneProgress(ctx context.Context, in *pb.FineTuneProgressRequest, f func(update *pb.FineTuneProgressUpdate), opts ...grpc.CallOption) error {
	bs := &embedBackendFineTuneProgressStream{
		ctx: ctx,
		fn:  f,
	}
	return e.s.FineTuneProgress(in, bs)
}

func (e *embedBackend) StopFineTune(ctx context.Context, in *pb.FineTuneStopRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.StopFineTune(ctx, in)
}

func (e *embedBackend) ListCheckpoints(ctx context.Context, in *pb.ListCheckpointsRequest, opts ...grpc.CallOption) (*pb.ListCheckpointsResponse, error) {
	return e.s.ListCheckpoints(ctx, in)
}

func (e *embedBackend) ExportModel(ctx context.Context, in *pb.ExportModelRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.ExportModel(ctx, in)
}

func (e *embedBackend) StartQuantization(ctx context.Context, in *pb.QuantizationRequest, opts ...grpc.CallOption) (*pb.QuantizationJobResult, error) {
	return e.s.StartQuantization(ctx, in)
}

func (e *embedBackend) QuantizationProgress(ctx context.Context, in *pb.QuantizationProgressRequest, f func(update *pb.QuantizationProgressUpdate), opts ...grpc.CallOption) error {
	bs := &embedBackendQuantizationProgressStream{
		ctx: ctx,
		fn:  f,
	}
	return e.s.QuantizationProgress(in, bs)
}

func (e *embedBackend) StopQuantization(ctx context.Context, in *pb.QuantizationStopRequest, opts ...grpc.CallOption) (*pb.Result, error) {
	return e.s.StopQuantization(ctx, in)
}

func (e *embedBackend) Free(ctx context.Context) error {
	_, err := e.s.Free(ctx, &pb.HealthMessage{})
	return err
}

var _ pb.Backend_AudioTransformStreamServer = new(embedBackendAudioTransformStream)
var _ AudioTransformStreamClient = new(embedBackendAudioTransformStreamClient)
var _ pb.Backend_AudioToAudioStreamServer = new(embedBackendAudioToAudioStream)
var _ AudioToAudioStreamClient = new(embedBackendAudioToAudioStreamClient)

// embedBackendAudioTransformStream is the server side of an in-process bidi
// stream. The hosted server reads requests from `reqs` (closed by client when
// done sending) and writes responses to `resps`.
type embedBackendAudioTransformStream struct {
	ctx   context.Context
	reqs  <-chan *pb.AudioTransformFrameRequest
	resps chan<- *pb.AudioTransformFrameResponse
}

func (e *embedBackendAudioTransformStream) Send(resp *pb.AudioTransformFrameResponse) error {
	select {
	case e.resps <- resp:
		return nil
	case <-e.ctx.Done():
		return e.ctx.Err()
	}
}

func (e *embedBackendAudioTransformStream) Recv() (*pb.AudioTransformFrameRequest, error) {
	select {
	case req, ok := <-e.reqs:
		if !ok {
			return nil, io.EOF
		}
		return req, nil
	case <-e.ctx.Done():
		return nil, e.ctx.Err()
	}
}

func (e *embedBackendAudioTransformStream) SetHeader(md metadata.MD) error  { return nil }
func (e *embedBackendAudioTransformStream) SendHeader(md metadata.MD) error { return nil }
func (e *embedBackendAudioTransformStream) SetTrailer(md metadata.MD)       {}
func (e *embedBackendAudioTransformStream) Context() context.Context        { return e.ctx }
func (e *embedBackendAudioTransformStream) SendMsg(m any) error {
	if x, ok := m.(*pb.AudioTransformFrameResponse); ok {
		return e.Send(x)
	}
	return nil
}
func (e *embedBackendAudioTransformStream) RecvMsg(m any) error {
	// gRPC bidi streaming uses Recv() directly; RecvMsg is unused on this path.
	return nil
}

// embedBackendAudioTransformStreamClient is the caller-facing side. It
// mirrors the server-side stream over the same channels.
type embedBackendAudioTransformStreamClient struct {
	ctx       context.Context
	reqs      chan<- *pb.AudioTransformFrameRequest
	resps     <-chan *pb.AudioTransformFrameResponse
	srvDone   <-chan error
	closeOnce bool
}

func (e *embedBackendAudioTransformStreamClient) Send(req *pb.AudioTransformFrameRequest) error {
	select {
	case e.reqs <- req:
		return nil
	case <-e.ctx.Done():
		return e.ctx.Err()
	}
}

func (e *embedBackendAudioTransformStreamClient) Recv() (*pb.AudioTransformFrameResponse, error) {
	select {
	case resp, ok := <-e.resps:
		if !ok {
			// Server-side finished. Surface its terminal error if any.
			select {
			case err := <-e.srvDone:
				if err != nil {
					return nil, err
				}
			default:
			}
			return nil, io.EOF
		}
		return resp, nil
	case <-e.ctx.Done():
		return nil, e.ctx.Err()
	}
}

func (e *embedBackendAudioTransformStreamClient) CloseSend() error {
	if e.closeOnce {
		return nil
	}
	e.closeOnce = true
	close(e.reqs)
	return nil
}

func (e *embedBackendAudioTransformStreamClient) Context() context.Context { return e.ctx }

// embedBackendAudioToAudioStream is the in-process server-side handle for
// the bidirectional any-to-any audio RPC. It mirrors
// embedBackendAudioTransformStream: the hosted server reads requests from
// `reqs` (closed by the client when done sending) and writes responses to
// `resps`.
type embedBackendAudioToAudioStream struct {
	ctx   context.Context
	reqs  <-chan *pb.AudioToAudioRequest
	resps chan<- *pb.AudioToAudioResponse
}

func (e *embedBackendAudioToAudioStream) Send(resp *pb.AudioToAudioResponse) error {
	select {
	case e.resps <- resp:
		return nil
	case <-e.ctx.Done():
		return e.ctx.Err()
	}
}

func (e *embedBackendAudioToAudioStream) Recv() (*pb.AudioToAudioRequest, error) {
	select {
	case req, ok := <-e.reqs:
		if !ok {
			return nil, io.EOF
		}
		return req, nil
	case <-e.ctx.Done():
		return nil, e.ctx.Err()
	}
}

func (e *embedBackendAudioToAudioStream) SetHeader(md metadata.MD) error  { return nil }
func (e *embedBackendAudioToAudioStream) SendHeader(md metadata.MD) error { return nil }
func (e *embedBackendAudioToAudioStream) SetTrailer(md metadata.MD)       {}
func (e *embedBackendAudioToAudioStream) Context() context.Context        { return e.ctx }
func (e *embedBackendAudioToAudioStream) SendMsg(m any) error {
	if x, ok := m.(*pb.AudioToAudioResponse); ok {
		return e.Send(x)
	}
	return nil
}
func (e *embedBackendAudioToAudioStream) RecvMsg(m any) error { return nil }

type embedBackendAudioToAudioStreamClient struct {
	ctx       context.Context
	reqs      chan<- *pb.AudioToAudioRequest
	resps     <-chan *pb.AudioToAudioResponse
	srvDone   <-chan error
	closeOnce bool
}

func (e *embedBackendAudioToAudioStreamClient) Send(req *pb.AudioToAudioRequest) error {
	select {
	case e.reqs <- req:
		return nil
	case <-e.ctx.Done():
		return e.ctx.Err()
	}
}

func (e *embedBackendAudioToAudioStreamClient) Recv() (*pb.AudioToAudioResponse, error) {
	select {
	case resp, ok := <-e.resps:
		if !ok {
			// Server goroutine writes to srvDone immediately after closing
			// resps; block (cap with ctx) so we don't race past a real error.
			select {
			case err := <-e.srvDone:
				if err != nil {
					return nil, err
				}
			case <-e.ctx.Done():
				return nil, e.ctx.Err()
			}
			return nil, io.EOF
		}
		return resp, nil
	case <-e.ctx.Done():
		return nil, e.ctx.Err()
	}
}

func (e *embedBackendAudioToAudioStreamClient) CloseSend() error {
	if e.closeOnce {
		return nil
	}
	e.closeOnce = true
	close(e.reqs)
	return nil
}

func (e *embedBackendAudioToAudioStreamClient) Context() context.Context { return e.ctx }

var _ pb.Backend_AudioTranscriptionStreamServer = new(embedBackendAudioTranscriptionStream)

type embedBackendAudioTranscriptionStream struct {
	ctx context.Context
	fn  func(chunk *pb.TranscriptStreamResponse)
}

func (e *embedBackendAudioTranscriptionStream) Send(chunk *pb.TranscriptStreamResponse) error {
	e.fn(chunk)
	return nil
}

func (e *embedBackendAudioTranscriptionStream) SetHeader(md metadata.MD) error {
	return nil
}

func (e *embedBackendAudioTranscriptionStream) SendHeader(md metadata.MD) error {
	return nil
}

func (e *embedBackendAudioTranscriptionStream) SetTrailer(md metadata.MD) {
}

func (e *embedBackendAudioTranscriptionStream) Context() context.Context {
	return e.ctx
}

func (e *embedBackendAudioTranscriptionStream) SendMsg(m any) error {
	if x, ok := m.(*pb.TranscriptStreamResponse); ok {
		return e.Send(x)
	}
	return nil
}

func (e *embedBackendAudioTranscriptionStream) RecvMsg(m any) error {
	return nil
}

var _ pb.Backend_FineTuneProgressServer = new(embedBackendFineTuneProgressStream)
|
|
|
|
type embedBackendFineTuneProgressStream struct {
|
|
ctx context.Context
|
|
fn func(update *pb.FineTuneProgressUpdate)
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) Send(update *pb.FineTuneProgressUpdate) error {
|
|
e.fn(update)
|
|
return nil
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) SetHeader(md metadata.MD) error {
|
|
return nil
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) SendHeader(md metadata.MD) error {
|
|
return nil
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) SetTrailer(md metadata.MD) {
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) Context() context.Context {
|
|
return e.ctx
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) SendMsg(m any) error {
|
|
if x, ok := m.(*pb.FineTuneProgressUpdate); ok {
|
|
return e.Send(x)
|
|
}
|
|
return nil
|
|
}
|
|
|
|
func (e *embedBackendFineTuneProgressStream) RecvMsg(m any) error {
|
|
return nil
|
|
}

var _ pb.Backend_QuantizationProgressServer = new(embedBackendQuantizationProgressStream)

type embedBackendQuantizationProgressStream struct {
	ctx context.Context
	fn  func(update *pb.QuantizationProgressUpdate)
}

func (e *embedBackendQuantizationProgressStream) Send(update *pb.QuantizationProgressUpdate) error {
	e.fn(update)
	return nil
}

func (e *embedBackendQuantizationProgressStream) SetHeader(md metadata.MD) error {
	return nil
}

func (e *embedBackendQuantizationProgressStream) SendHeader(md metadata.MD) error {
	return nil
}

func (e *embedBackendQuantizationProgressStream) SetTrailer(md metadata.MD) {
}

func (e *embedBackendQuantizationProgressStream) Context() context.Context {
	return e.ctx
}

func (e *embedBackendQuantizationProgressStream) SendMsg(m any) error {
	if x, ok := m.(*pb.QuantizationProgressUpdate); ok {
		return e.Send(x)
	}
	return nil
}

func (e *embedBackendQuantizationProgressStream) RecvMsg(m any) error {
	return nil
}

type embedBackendServerStream struct {
	ctx context.Context
	fn  func(reply *pb.Reply)
}

func (e *embedBackendServerStream) Send(reply *pb.Reply) error {
	e.fn(reply)
	return nil
}

func (e *embedBackendServerStream) SetHeader(md metadata.MD) error {
	return nil
}

func (e *embedBackendServerStream) SendHeader(md metadata.MD) error {
	return nil
}

func (e *embedBackendServerStream) SetTrailer(md metadata.MD) {
}

func (e *embedBackendServerStream) Context() context.Context {
	return e.ctx
}

func (e *embedBackendServerStream) SendMsg(m any) error {
	if x, ok := m.(*pb.Reply); ok {
		return e.Send(x)
	}
	return nil
}

func (e *embedBackendServerStream) RecvMsg(m any) error {
	return nil
}

var _ pb.Backend_ForwardServer = new(embedBackendForwardStream)
var _ ForwardClient = new(embedBackendForwardStreamClient)

// embedBackendForwardStream is the server-side handle for an in-process
// Forward bidi stream. The hosted backend reads requests from `reqs`
// (closed by the client when done sending) and writes replies to
// `resps`.
type embedBackendForwardStream struct {
	ctx   context.Context
	reqs  <-chan *pb.ForwardRequest
	resps chan<- *pb.ForwardReply
}

func (e *embedBackendForwardStream) Send(resp *pb.ForwardReply) error {
	select {
	case e.resps <- resp:
		return nil
	case <-e.ctx.Done():
		return e.ctx.Err()
	}
}

func (e *embedBackendForwardStream) Recv() (*pb.ForwardRequest, error) {
	select {
	case req, ok := <-e.reqs:
		if !ok {
			return nil, io.EOF
		}
		return req, nil
	case <-e.ctx.Done():
		return nil, e.ctx.Err()
	}
}

func (e *embedBackendForwardStream) SetHeader(md metadata.MD) error  { return nil }
func (e *embedBackendForwardStream) SendHeader(md metadata.MD) error { return nil }
func (e *embedBackendForwardStream) SetTrailer(md metadata.MD)       {}
func (e *embedBackendForwardStream) Context() context.Context        { return e.ctx }
func (e *embedBackendForwardStream) SendMsg(m any) error {
	if x, ok := m.(*pb.ForwardReply); ok {
		return e.Send(x)
	}
	return nil
}
func (e *embedBackendForwardStream) RecvMsg(m any) error { return nil }

// embedBackendForwardStreamClient is the caller-facing side. Mirrors
// the server-side stream over the same channels.
type embedBackendForwardStreamClient struct {
	ctx     context.Context
	reqs    chan<- *pb.ForwardRequest
	resps   <-chan *pb.ForwardReply
	srvDone <-chan error
	once    sync.Once
}

func (e *embedBackendForwardStreamClient) Send(req *pb.ForwardRequest) error {
	select {
	case e.reqs <- req:
		return nil
	case <-e.ctx.Done():
		return e.ctx.Err()
	}
}

func (e *embedBackendForwardStreamClient) Recv() (*pb.ForwardReply, error) {
	select {
	case resp, ok := <-e.resps:
		if !ok {
			// resps is closed. Return the server's error if one is already
			// waiting on srvDone; otherwise report a clean end of stream.
			// The check is non-blocking.
			select {
			case err := <-e.srvDone:
				if err != nil {
					return nil, err
				}
			default:
			}
			return nil, io.EOF
		}
		return resp, nil
	case <-e.ctx.Done():
		return nil, e.ctx.Err()
	}
}

// CloseSend closes reqs at most once, which the server side observes
// as io.EOF from Recv.
func (e *embedBackendForwardStreamClient) CloseSend() error {
	e.once.Do(func() { close(e.reqs) })
	return nil
}

func (e *embedBackendForwardStreamClient) Context() context.Context { return e.ctx }
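
// Illustrative sketch, not part of this file: the channel-backed Forward
// pair above can be exercised with a stand-alone program along these lines.
// The request and reply types, serverStream, clientStream, and roundTrip are
// hypothetical stand-ins for pb.ForwardRequest, pb.ForwardReply, and the two
// embed stream types, so the example compiles without the generated proto
// package. The upper-casing "backend" is an assumption made for the demo.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
	"sync"
)

// Hypothetical message types standing in for the generated pb messages.
type request struct{ Payload string }
type reply struct{ Payload string }

// serverStream is the backend-facing end: it receives requests, sends replies.
type serverStream struct {
	ctx   context.Context
	reqs  <-chan *request
	resps chan<- *reply
}

func (s *serverStream) Recv() (*request, error) {
	select {
	case r, ok := <-s.reqs:
		if !ok {
			return nil, io.EOF // the client called CloseSend
		}
		return r, nil
	case <-s.ctx.Done():
		return nil, s.ctx.Err()
	}
}

func (s *serverStream) Send(r *reply) error {
	select {
	case s.resps <- r:
		return nil
	case <-s.ctx.Done():
		return s.ctx.Err()
	}
}

// clientStream is the caller-facing end over the same two channels.
type clientStream struct {
	ctx   context.Context
	reqs  chan<- *request
	resps <-chan *reply
	once  sync.Once
}

func (c *clientStream) Send(r *request) error {
	select {
	case c.reqs <- r:
		return nil
	case <-c.ctx.Done():
		return c.ctx.Err()
	}
}

func (c *clientStream) Recv() (*reply, error) {
	select {
	case r, ok := <-c.resps:
		if !ok {
			return nil, io.EOF // the backend finished and closed resps
		}
		return r, nil
	case <-c.ctx.Done():
		return nil, c.ctx.Err()
	}
}

func (c *clientStream) CloseSend() { c.once.Do(func() { close(c.reqs) }) }

// roundTrip sends every input through a toy backend that upper-cases the
// payload, and collects the replies in order.
func roundTrip(inputs []string) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	reqs, resps := make(chan *request), make(chan *reply)
	srv := &serverStream{ctx: ctx, reqs: reqs, resps: resps}
	cli := &clientStream{ctx: ctx, reqs: reqs, resps: resps}

	go func() { // hosted backend: reply to each request until EOF
		defer close(resps)
		for {
			r, err := srv.Recv()
			if err != nil || srv.Send(&reply{strings.ToUpper(r.Payload)}) != nil {
				return
			}
		}
	}()
	go func() { // sender runs concurrently so unbuffered channels cannot deadlock
		defer cli.CloseSend()
		for _, in := range inputs {
			if cli.Send(&request{in}) != nil {
				return
			}
		}
	}()

	var out []string
	for {
		r, err := cli.Recv()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, r.Payload)
	}
}

func main() {
	out, err := roundTrip([]string{"hello", "world"})
	fmt.Println(out, err) // prints "[HELLO WORLD] <nil>"
}
```

// In this sketch, closing reqs is the only end-of-input signal, and closing
// resps is the only end-of-output signal. That is why CloseSend is guarded
// by sync.Once: a second close of the same channel would panic.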
|