fix(nemo): extract Hypothesis.text for TDT/RNNT ASR models (#10012)

* fix(nemo): extract Hypothesis.text for TDT/RNNT ASR models

CTC models return List[str] from transcribe(), but
TDT/RNNT models (e.g. parakeet-tdt-0.6b-v3) return List[Hypothesis]
where the decoded text lives in the Hypothesis.text attribute.

Previously, results[0] was assigned directly to the protobuf string
field, causing silent empty output for non-CTC models.

Now checks the return type and extracts .text from Hypothesis objects,
with a safe fallback via getattr().

* refactor: simplify Hypothesis text extraction per Copilot review

Use single getattr() call instead of hasattr() + double access,
and return empty string for unknown types instead of str(result)
to avoid leaking internal repr to clients.
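
For illustration, the extraction rule described above can be sketched as a standalone helper. This is a minimal sketch, not the code from the patch: `extract_text` and `FakeHypothesis` are hypothetical names introduced here, and `FakeHypothesis` only mimics the `.text` attribute of a NeMo Hypothesis so the example runs without NeMo installed.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeHypothesis:
    """Hypothetical stand-in for a NeMo Hypothesis; only .text is modeled."""
    text: Optional[str] = None


def extract_text(result: Any) -> str:
    """Return the transcript for one transcribe() result, or "" if unknown."""
    # Plain strings are passed through unchanged.
    if isinstance(result, str):
        return result
    # Hypothesis-like objects expose the decoded text as .text.
    # A missing or None attribute falls back to "" rather than str(result),
    # so no internal repr is ever returned to the caller.
    return getattr(result, "text", None) or ""


if __name__ == "__main__":
    for candidate in ("hello world",
                      FakeHypothesis(text="hello world"),
                      FakeHypothesis(text=None),
                      object()):
        print(repr(extract_text(candidate)))
```

The last two cases both yield an empty string, which matches the stated intent of returning nothing for unknown types instead of a stringified object.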
番茄摔成番茄酱
2026-05-27 04:35:23 +08:00
committed by GitHub
parent 4e5ec6f67b
commit df7623fd87

@@ -99,8 +99,15 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
         if not results or len(results) == 0:
             return backend_pb2.TranscriptResult(segments=[], text="")
-        # Get the transcript text from the first result
-        text = results[0]
+        # Get the transcript text from the first result.
+        # CTC models return List[str], TDT/RNNT models return List[Hypothesis]
+        # where the actual text lives in Hypothesis.text.
+        result = results[0]
+        if isinstance(result, str):
+            text = result
+        else:
+            text = getattr(result, 'text', None) or ""
         if text:
             # Create a single segment with the full transcription
             result_segments.append(backend_pb2.TranscriptSegment(