mirror of
https://github.com/ollama/ollama.git
synced 2026-06-03 13:59:06 -04:00
This reverts commit 98e26b8c37.

The DFlash integration is too invasive to keep at this stage: it
threads DFlash-specific logic through the pipeline, base model
interfaces, and the cache layer. The recurrent cache now also
contains qwen3.5-specific code. Revert it now and reintroduce
the self-contained, generally-useful pieces (YaRN RoPE DRY-out, draft
architecture autodetection, gated-delta fp32 state) as separate
follow-up commits.