Fix NameError for Cache in WrappedMiniMaxAttention

Use a string annotation for the Cache type, since it exists only in type
stubs and not in the actual mlx_lm package at runtime. A quoted annotation
is never evaluated when the class is defined, so the missing name no longer
raises a NameError at import time.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alex Cheema
2026-02-03 19:15:50 -08:00
parent a54ba12dee
commit cd9f3182d9


@@ -635,7 +635,7 @@ class WrappedMiniMaxAttention(CustomMlxLayer):
         self,
         x: mx.array,
         mask: mx.array | None = None,
-        cache: Cache | None = None,
+        cache: "Cache | None" = None,
     ) -> mx.array:
         batch_dim, seq_dim, _ = x.shape
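
For reference, a minimal sketch of the general pattern this change relies on (the stub import path and the trimmed signature are illustrative, not taken from the actual codebase): a quoted annotation is stored as a plain string and never evaluated at runtime, so the annotated name only needs to resolve for static type checkers, typically via a typing.TYPE_CHECKING guard.

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; nothing is imported at runtime,
    # so a name that lives purely in type stubs is fine here.
    # (Illustrative import path, not the real one.)
    from mlx_lm_stubs import Cache


class WrappedMiniMaxAttention:
    def __call__(
        self,
        x,
        mask=None,
        cache: "Cache | None" = None,  # quoted, so never evaluated: no NameError
    ):
        # __annotations__ keeps the raw string "Cache | None"; Python only
        # resolves the name if something calls typing.get_type_hints().
        return x

Defining and calling this class works without Cache existing anywhere at runtime; only a type checker (or an explicit get_type_hints call) ever tries to resolve the annotation.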