mirror of
https://github.com/ollama/ollama.git
synced 2025-12-24 08:10:54 -05:00
Compare commits
base: mirror:timeout
...
compare: mirror:mattw/quantcontext
2 Commits
timeout
...
mattw/quan
| Author | SHA1 | Message | Date |
|---|---|---|---|
| | fed3843be2 | update to resolve jmorganca comments (Signed-off-by: Matt Williams <m@technovangelist.com>) | |
| | 01d4047ed3 | add faq about quant and context (Signed-off-by: Matt Williams <m@technovangelist.com>) | |
1 changed file with 23 additions and 0 deletions
docs/faq.md
@@ -112,3 +112,26 @@ This can impact both installing Ollama, as well as downloading models.
Open `Control Panel > Networking and Internet > View network status and tasks` and click on `Change adapter settings` on the left panel. Find the `vEthernet (WSL)` adapter, right-click and select `Properties`.
Click on `Configure` and open the `Advanced` tab. Search through each of the properties until you find `Large Send Offload Version 2 (IPv4)` and `Large Send Offload Version 2 (IPv6)`. *Disable* both of these properties.

## What does the q in the model tag mean? What is quantization?

When you pull a model without specifying a tag, Ollama pulls the q4_0 quantization of that model. You can verify this on the model's tags page: on https://ollama.ai/library/llama2/tags, the hash for the `latest` tag matches the hash for the `7b` tag.

Looking at that page for any model, you can see several quantization options available. Quantization is a method of compression that allows the model to fit in less space and thus use less RAM and VRAM on your machine.

At a high level, a model is made of an enormous collection of nodes that determine how to generate text. These nodes are connected at different levels, and each connection has a weight. The training process adjusts these weights so that the model outputs the right text as reliably as possible.

Most of the source models that we use start with weights that are 32-bit floating-point numbers. Those weights, together with another set of values called biases, make up the parameters. So a source model with 7 billion parameters has 7 billion 32-bit floating-point numbers, plus a description of all the nodes and more. At 4 bytes per parameter, that adds up to needing at least 28 gigabytes of memory to load one of those source models.

Quantization turns those 32-bit floating-point weights into much smaller integers. The number next to the q indicates the bit size of the weights, so a q4 model has had its 32-bit floats converted into 4-bit integers. A 4-bit quantization takes up the space of 7 billion 4-bit integers, plus a little overhead. That comes out to almost 4 gigabytes. There is some loss of information in going from 28 GB to about 4 GB, but in most cases it isn't really noticeable. In fact, even the 2-bit quantization, which fits in less than 3 GB, can be very useful.
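
The size figures above follow from simple arithmetic: parameters times bits per weight, divided by 8 bits per byte. The following is a minimal sketch of that calculation, assuming 1 GB = 10^9 bytes and ignoring the overhead mentioned above; the function name `model_size_gb` is hypothetical and not part of Ollama.

```python
def model_size_gb(parameters: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes).

    Ignores per-model overhead such as metadata, so real files are
    somewhat larger than this estimate.
    """
    bytes_total = parameters * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 1e9


if __name__ == "__main__":
    # Estimate a 7-billion-parameter model at several bit widths.
    for label, bits in [("fp32", 32), ("fp16", 16), ("q4", 4), ("q2", 2)]:
        print(f"{label}: {model_size_gb(7e9, bits):.2f} GB")
```

For a 7 billion parameter model this gives 28 GB at 32 bits and 3.5 GB at 4 bits, before overhead.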

There are three major sets of quantizations you will see in the Ollama Library of models: **fp16**; models with just a q and a number, like **q4_0**; and models with a **K** in the tag. The **fp16** model has been converted and quantized from the 32-bit source to 16-bit. It is about half the size of the 32-bit source model and is the largest quantization we deliver in the library. The **q4_0**, **q4_1**, **q5_0**, etc. models use the two original quantization methods.

The models with a **K** are often referred to as K Quants. This method produces models of similar quality that are smaller than those made with the original methods. Essentially, it finds clusters of weights and quantizes those together, allowing for higher precision at the same bit sizes as the regular quantization options. But this requires a set of maps that the model uses to figure out the original values, and using them has a computational cost. You may see some impact on the speed of models with K quants compared to the regular quantizations.
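
To make the idea of quantizing groups of weights together more concrete, here is a deliberately simplified sketch. It is an assumption-laden illustration, not the format Ollama ships: it splits the weights into fixed-size blocks, stores one floating-point scale per block, and rounds each weight to a small signed integer. The block size of 32, the symmetric rounding scheme, and all function names are hypothetical choices made for this example, and the sketch does not reproduce either the original or the K quant methods.

```python
def quantize_block(weights, bits=4):
    """Quantize one block of floats to signed integers sharing one scale."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit integers
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest else 1.0
    quants = [round(w / scale) for w in weights]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from a block's scale and integers."""
    return [q * scale for q in quants]


def quantize(weights, block_size=32, bits=4):
    """Split weights into blocks and quantize each block independently."""
    return [
        quantize_block(weights[i:i + block_size], bits)
        for i in range(0, len(weights), block_size)
    ]


if __name__ == "__main__":
    weights = [0.1 * i - 1.5 for i in range(64)]
    blocks = quantize(weights)
    restored = [w for scale, q in blocks for w in dequantize_block(scale, q)]
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{len(blocks)} blocks, worst rounding error {worst:.4f}")
```

The per-block scale plays the role of the extra information needed to recover approximate original values. Storing and applying it is the kind of overhead the paragraph above refers to.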

## What is context, can I increase it, and why doesn't every model support a huge context?

Context refers to the size of the input you can send to a model and still get sensible output back. Many models have a context size of 2048 tokens. It's sometimes possible to give a model more using the **num_ctx** parameter, but the answers start to degrade. This is because half of the context is "freed" up to make room for more. Newer models have been able to increase that context size using different methods. This increase in context size results in a corresponding increase in memory required, sometimes by orders of magnitude.

> [!WARNING]
> Currently, over-allocating context size may result in model quality or stability issues.
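
As a rough illustration of setting **num_ctx**, the sketch below builds a request for Ollama's `/api/generate` endpoint with `num_ctx` passed under `options`. It assumes a local server on the default address `http://localhost:11434` and uses `llama2` and a context size of 4096 purely as example values; the helper names `build_generate_request` and `send` are hypothetical.

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_ctx, host="http://localhost:11434"):
    """Build a POST request for /api/generate with a custom context size."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,                    # ask for a single JSON reply
        "options": {"num_ctx": num_ctx},    # context size in tokens
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Ollama server and return the text."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a server running, `send(build_generate_request("llama2", "Why is the sky blue?", 4096))` would return the generated text. Keep the warning above in mind when choosing a value larger than the model's default.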