ollama/model/models
jmorganca c330ea33ed qwen3next: handle mixed recurrent batches
Allow mixed token-count batches by tracking per-seq indices
and falling back to per-seq recurrent processing when layouts
differ.

Add per-slot conv/delta state access with checkpoint capture,
relax attention layout handling, and reuse projections in mixed
batches to reduce overhead.
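
The fallback described in the commit message can be illustrated with a minimal Go sketch. It is not the code from this commit: the names `seqLayout`, `buildLayout`, `uniform`, and `plan` are hypothetical, and the sketch assumes that "layouts differ" means the sequences in a batch contribute different numbers of tokens. It only shows the bookkeeping: record which batch positions belong to each sequence, then choose either one dense pass or one pass per sequence.

```go
package main

// seqLayout groups batch positions by sequence ID. All names here are
// hypothetical and are not taken from the qwen3next source.
type seqLayout struct {
	order   []int         // sequence IDs in order of first appearance
	indices map[int][]int // sequence ID -> positions of its tokens in the batch
}

// buildLayout records, for every sequence, which batch positions hold its tokens.
func buildLayout(seqIDs []int) seqLayout {
	l := seqLayout{indices: make(map[int][]int)}
	for pos, id := range seqIDs {
		if _, seen := l.indices[id]; !seen {
			l.order = append(l.order, id)
		}
		l.indices[id] = append(l.indices[id], pos)
	}
	return l
}

// uniform reports whether every sequence contributes the same number of
// tokens (assumed here to be the condition for a single dense pass).
func (l seqLayout) uniform() bool {
	if len(l.order) == 0 {
		return true
	}
	n := len(l.indices[l.order[0]])
	for _, id := range l.order[1:] {
		if len(l.indices[id]) != n {
			return false
		}
	}
	return true
}

// plan returns the groups of batch positions to feed through the recurrent
// step: one group covering the whole batch when the layout is uniform,
// otherwise one group per sequence (the fallback path).
func plan(seqIDs []int) [][]int {
	l := buildLayout(seqIDs)
	if l.uniform() {
		all := make([]int, len(seqIDs))
		for i := range all {
			all[i] = i
		}
		return [][]int{all}
	}
	groups := make([][]int, 0, len(l.order))
	for _, id := range l.order {
		groups = append(groups, l.indices[id])
	}
	return groups
}
```

With this sketch, `plan([]int{0, 0, 1, 1})` returns a single group covering all four positions, while `plan([]int{0, 0, 0, 1})` falls back to two groups, `[0 1 2]` and `[3]`. Keeping the per-sequence indices is what would let a caller gather each sequence's tokens and scatter the results back to their original batch positions.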
2026-02-05 11:50:00 -08:00