Mirror of https://github.com/ollama/ollama.git (synced 2025-05-10 18:06:33 +02:00)
Revert "increase default context length to 4096 (#10364)"
This reverts commit 424f648632.

parent 5cfc1c39f3
commit dd93e1af85
7 changed files with 12 additions and 49 deletions
````diff
@@ -20,7 +20,7 @@ Please refer to the [GPU docs](./gpu.md).
 
 ## How can I specify the context window size?
 
-By default, Ollama uses a context window size of 4096 tokens, unless you have a single GPU with <= 4 GB of VRAM, in which case it will default to 2048 tokens.
+By default, Ollama uses a context window size of 2048 tokens.
 
 This can be overridden with the `OLLAMA_CONTEXT_LENGTH` environment variable. For example, to set the default context window to 8K, use:
 
@@ -31,7 +31,7 @@ OLLAMA_CONTEXT_LENGTH=8192 ollama serve
 To change this when using `ollama run`, use `/set parameter`:
 
 ```shell
-/set parameter num_ctx 8192
+/set parameter num_ctx 4096
 ```
 
 When using the API, specify the `num_ctx` parameter:
@@ -41,7 +41,7 @@ curl http://localhost:11434/api/generate -d '{
   "model": "llama3.2",
   "prompt": "Why is the sky blue?",
   "options": {
-    "num_ctx": 8192
+    "num_ctx": 4096
   }
 }'
 ```
````
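For a server-wide override that survives restarts, `OLLAMA_CONTEXT_LENGTH` can live in the service definition rather than the shell. A minimal sketch for a systemd-managed install, assuming the unit is named `ollama.service` (neither the unit name nor the 8192 value comes from this diff):

```shell
# Open a drop-in override for the service (assumed name: ollama.service).
sudo systemctl edit ollama.service

# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_CONTEXT_LENGTH=8192"

# Restart so the new default context length takes effect.
sudo systemctl restart ollama.service
```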
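Note that `/set parameter num_ctx` only affects the current `ollama run` session. A sketch of persisting the setting with `/save`, assuming the `llama3.2` model from the examples is already pulled; the saved name `llama3.2-ctx4k` is hypothetical:

```shell
ollama run llama3.2
# Inside the interactive session:
>>> /set parameter num_ctx 4096
>>> /save llama3.2-ctx4k
>>> /bye

# Later sessions start with the 4096-token window already applied:
ollama run llama3.2-ctx4k
```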
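The `options` object is per-request, so the same `num_ctx` override carries over to the other generation endpoints as well. A sketch against the chat endpoint, assuming a local server on the default port 11434 with `llama3.2` pulled:

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "options": {
    "num_ctx": 4096
  }
}'
```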