Replace binary low VRAM mode with tiered VRAM thresholds that set default context lengths for all models:

- < 24 GiB VRAM: 4,096 context
- 24-48 GiB VRAM: 32,768 context
- >= 48 GiB VRAM: 262,144 context
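
A minimal Go sketch of the tiering logic described above. The function name `defaultContextLength`, its signature, and its placement are assumptions for illustration only, not the actual Ollama implementation:

```go
package main

import "fmt"

const GiB = 1 << 30

// defaultContextLength maps total available VRAM to a default context
// length using the three tiers from the commit description.
// (Hypothetical helper; names and signature are assumptions.)
func defaultContextLength(totalVRAM uint64) int {
	switch {
	case totalVRAM < 24*GiB:
		return 4096 // < 24 GiB: low-VRAM tier
	case totalVRAM < 48*GiB:
		return 32768 // 24-48 GiB: mid tier
	default:
		return 262144 // >= 48 GiB: high tier
	}
}

func main() {
	for _, vram := range []uint64{16 * GiB, 32 * GiB, 80 * GiB} {
		fmt.Printf("%3d GiB VRAM -> default context %d\n",
			vram/GiB, defaultContextLength(vram))
	}
}
```

Compared with a binary low-VRAM flag, a tiered mapping like this lets machines with mid-range GPUs get a larger default context without opting every model into the maximum.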