ollama/model
jmorganca c330ea33ed qwen3next: handle mixed recurrent batches
Allow mixed token-count batches by tracking per-seq indices
and falling back to per-seq recurrent processing when layouts
differ.

Add per-slot conv/delta state access with checkpoint capture,
relax attention layout handling, and reuse projections in mixed
batches to reduce overhead.
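
The idea of tracking per-seq indices and falling back when layouts differ can be illustrated with a minimal sketch. This is not the code from this commit: `groupBySeq` and `uniformLayout` are hypothetical helper names, and "layout" is assumed here to mean only that every sequence contributes the same number of tokens to the batch.

```go
package main

import "fmt"

// groupBySeq (hypothetical helper) maps each sequence ID to the batch
// positions of its tokens. order lists sequence IDs by first appearance,
// so iteration is deterministic.
func groupBySeq(seqIDs []int) (idx map[int][]int, order []int) {
	idx = make(map[int][]int)
	for pos, id := range seqIDs {
		if _, seen := idx[id]; !seen {
			order = append(order, id)
		}
		idx[id] = append(idx[id], pos)
	}
	return idx, order
}

// uniformLayout (hypothetical helper) reports whether every sequence
// contributes the same number of tokens. This is an assumed, simplified
// notion of "layout".
func uniformLayout(idx map[int][]int) bool {
	n := -1
	for _, positions := range idx {
		if n == -1 {
			n = len(positions)
		} else if len(positions) != n {
			return false
		}
	}
	return true
}

func main() {
	// One entry per token in the batch: which sequence the token belongs to.
	seqIDs := []int{0, 0, 0, 1, 2, 2}
	idx, order := groupBySeq(seqIDs)

	if uniformLayout(idx) {
		fmt.Println("uniform layout: process all sequences in one batched pass")
		return
	}
	// Mixed token counts: process each sequence's tokens separately.
	for _, id := range order {
		fmt.Printf("seq %d: per-seq pass over positions %v\n", id, idx[id])
	}
}
```

In this sketch the per-seq path is taken only when token counts differ, so uniform batches keep the single batched pass.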
2026-02-05 11:50:00 -08:00