Mirror of https://github.com/ollama/ollama.git (synced 2025-05-10 18:06:33 +02:00)
docs: improve syntax highlighting in code blocks (#8854)

parent abb8dd57f8
commit b901a712c6

16 changed files with 158 additions and 127 deletions
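Most hunks below make the same kind of edit: a fenced code block's info string is added, changed, or removed so the rendered docs get correct syntax highlighting. Shell commands move to ```shell fences, HTTP route and Modelfile snippets drop their labels, and a few terminal transcripts are split into a command block plus a quoted **Output** line. As a condensed, illustrative sketch (taken from the README hunks that follow, not an additional change in this commit):

@@ -42,7 +42,7 @@
 To run and chat with [Llama 3.2](https://ollama.com/library/llama3.2):

-```
+```shell
 ollama run llama3.2
 ```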
README.md (44 lines changed)
@@ -18,7 +18,7 @@ Get up and running with large language models.
 ### Linux

-```
+```shell
 curl -fsSL https://ollama.com/install.sh | sh
 ```

@@ -42,7 +42,7 @@ The official [Ollama Docker image](https://hub.docker.com/r/ollama/ollama) `olla
 To run and chat with [Llama 3.2](https://ollama.com/library/llama3.2):

-```
+```shell
 ollama run llama3.2
 ```

@@ -92,13 +92,13 @@ Ollama supports importing GGUF models in the Modelfile:
 2. Create the model in Ollama

-```
+```shell
 ollama create example -f Modelfile
 ```

 3. Run the model

-```
+```shell
 ollama run example
 ```

@@ -110,7 +110,7 @@ See the [guide](docs/import.md) on importing models for more information.
 Models from the Ollama library can be customized with a prompt. For example, to customize the `llama3.2` model:

-```
+```shell
 ollama pull llama3.2
 ```

@@ -145,13 +145,13 @@ For more information on working with a Modelfile, see the [Modelfile](docs/model
 `ollama create` is used to create a model from a Modelfile.

-```
+```shell
 ollama create mymodel -f ./Modelfile
 ```

 ### Pull a model

-```
+```shell
 ollama pull llama3.2
 ```

@@ -159,13 +159,13 @@ ollama pull llama3.2
 ### Remove a model

-```
+```shell
 ollama rm llama3.2
 ```

 ### Copy a model

-```
+```shell
 ollama cp llama3.2 my-model
 ```

@@ -184,37 +184,39 @@ I'm a basic program that prints the famous "Hello, world!" message to the consol
 ```
 ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
-The image features a yellow smiley face, which is likely the central focus of the picture.
 ```

+> **Output**: The image features a yellow smiley face, which is likely the central focus of the picture.

 ### Pass the prompt as an argument

+```shell
+ollama run llama3.2 "Summarize this file: $(cat README.md)"
 ```
-$ ollama run llama3.2 "Summarize this file: $(cat README.md)"
-Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
-```
+
+> **Output**: Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.

 ### Show model information

-```
+```shell
 ollama show llama3.2
 ```

 ### List models on your computer

-```
+```shell
 ollama list
 ```

 ### List which models are currently loaded

-```
+```shell
 ollama ps
 ```

 ### Stop a model which is currently running

-```
+```shell
 ollama stop llama3.2
 ```

@@ -230,13 +232,13 @@ See the [developer guide](https://github.com/ollama/ollama/blob/main/docs/develo
 Next, start the server:

-```
+```shell
 ./ollama serve
 ```

 Finally, in a separate shell, run a model:

-```
+```shell
 ./ollama run llama3.2
 ```

@@ -246,7 +248,7 @@ Ollama has a REST API for running and managing models.
 ### Generate a response

-```
+```shell
 curl http://localhost:11434/api/generate -d '{
 "model": "llama3.2",
 "prompt":"Why is the sky blue?"

@@ -255,7 +257,7 @@ curl http://localhost:11434/api/generate -d '{
 ### Chat with a model

-```
+```shell
 curl http://localhost:11434/api/chat -d '{
 "model": "llama3.2",
 "messages": [

@@ -2,9 +2,10 @@
 Run the examples in this directory with:

-```
+```shell
 go run example_name/main.go
 ```

 ## Chat - Chat with a model
 - [chat/main.go](chat/main.go)

@@ -17,6 +17,6 @@ If you want to build the installer, you'll need to install
 In the top directory of this repo, run the following powershell script
 to build the ollama CLI, ollama app, and ollama installer.

-```
+```powershell
 powershell -ExecutionPolicy Bypass -File .\scripts\build_windows.ps1
 ```

docs/api.md (33 lines changed)
@@ -31,7 +31,7 @@ Certain endpoints stream responses as JSON objects. Streaming can be disabled by
 ## Generate a completion

-```shell
+```
 POST /api/generate
 ```

@@ -485,7 +485,7 @@ A single JSON object is returned:
 ## Generate a chat completion

-```shell
+```
 POST /api/chat
 ```

@@ -878,6 +878,7 @@ curl http://localhost:11434/api/chat -d '{
 ```

 ##### Response

 ```json
 {
 "model": "llama3.2",

@@ -924,7 +925,7 @@ A single JSON object is returned:
 ## Create a Model

-```shell
+```
 POST /api/create
 ```

@@ -1020,7 +1021,7 @@ curl http://localhost:11434/api/create -d '{
 A stream of JSON objects is returned:

-```
+```json
 {"status":"quantizing F16 model to Q4_K_M"}
 {"status":"creating new layer sha256:667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29"}
 {"status":"using existing layer sha256:11ce4ee3e170f6adebac9a991c22e22ab3f8530e154ee669954c4bc73061c258"}

@@ -1051,7 +1052,7 @@ curl http://localhost:11434/api/create -d '{
 A stream of JSON objects is returned:

-```
+```json
 {"status":"parsing GGUF"}
 {"status":"using existing layer sha256:432f310a77f4650a88d0fd59ecdd7cebed8d684bafea53cbff0473542964f0c3"}
 {"status":"writing manifest"}

@@ -1118,7 +1119,7 @@ Return 200 OK if the blob exists, 404 Not Found if it does not.
 ## Push a Blob

-```shell
+```
 POST /api/blobs/:digest
 ```

@@ -1142,7 +1143,7 @@ Return 201 Created if the blob was successfully created, 400 Bad Request if the
 ## List Local Models

-```shell
+```
 GET /api/tags
 ```

@@ -1195,7 +1196,7 @@ A single JSON object will be returned.
 ## Show Model Information

-```shell
+```
 POST /api/show
 ```

@@ -1261,7 +1262,7 @@ curl http://localhost:11434/api/show -d '{
 ## Copy a Model

-```shell
+```
 POST /api/copy
 ```

@@ -1284,7 +1285,7 @@ Returns a 200 OK if successful, or a 404 Not Found if the source model doesn't e
 ## Delete a Model

-```shell
+```
 DELETE /api/delete
 ```

@@ -1310,7 +1311,7 @@ Returns a 200 OK if successful, 404 Not Found if the model to be deleted doesn't
 ## Pull a Model

-```shell
+```
 POST /api/pull
 ```

@@ -1382,7 +1383,7 @@ if `stream` is set to false, then the response is a single JSON object:
 ## Push a Model

-```shell
+```
 POST /api/push
 ```

@@ -1447,7 +1448,7 @@ If `stream` is set to `false`, then the response is a single JSON object:
 ## Generate Embeddings

-```shell
+```
 POST /api/embed
 ```

@@ -1515,7 +1516,7 @@ curl http://localhost:11434/api/embed -d '{
 ```

 ## List Running Models
-```shell
+```
 GET /api/ps
 ```

@@ -1562,7 +1563,7 @@ A single JSON object will be returned.
 > Note: this endpoint has been superseded by `/api/embed`

-```shell
+```
 POST /api/embeddings
 ```

@@ -1602,7 +1603,7 @@ curl http://localhost:11434/api/embeddings -d '{
 ## Version

-```shell
+```
 GET /api/version
 ```

@@ -7,7 +7,7 @@ Install prerequisites:
 Then build and run Ollama from the root directory of the repository:

-```
+```shell
 go run . serve
 ```

@@ -23,14 +23,14 @@ Install prerequisites:
 Then, configure and build the project:

-```
+```shell
 cmake -B build
 cmake --build build
 ```

 Lastly, run Ollama:

-```
+```shell
 go run . serve
 ```

@@ -57,14 +57,14 @@ Install prerequisites:
 Then, configure and build the project:

-```
+```shell
 cmake -B build
 cmake --build build --config Release
 ```

 Lastly, run Ollama:

-```
+```shell
 go run . serve
 ```

@@ -88,26 +88,26 @@ Install prerequisites:
 Then, configure and build the project:

-```
+```shell
 cmake -B build
 cmake --build build
 ```

 Lastly, run Ollama:

-```
+```shell
 go run . serve
 ```

 ## Docker

-```
+```shell
 docker build .
 ```

 ### ROCm

-```
+```shell
 docker build --build-arg FLAVOR=rocm .
 ```

@@ -115,7 +115,7 @@ docker build --build-arg FLAVOR=rocm .
 To run tests, use `go test`:

-```
+```shell
 go test ./...
 ```

@@ -2,7 +2,7 @@
 ### CPU only

-```bash
+```shell
 docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
 ```

@@ -11,42 +11,46 @@ Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-
 #### Install with Apt
 1. Configure the repository
-```bash
+```shell
 curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
 | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
 curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
 | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
 | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
 sudo apt-get update
 ```

 2. Install the NVIDIA Container Toolkit packages
-```bash
+```shell
 sudo apt-get install -y nvidia-container-toolkit
 ```

 #### Install with Yum or Dnf
 1. Configure the repository

-```bash
+```shell
 curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
 | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
 ```

 2. Install the NVIDIA Container Toolkit packages

-```bash
+```shell
 sudo yum install -y nvidia-container-toolkit
 ```

 #### Configure Docker to use Nvidia driver
-```
+```shell
 sudo nvidia-ctk runtime configure --runtime=docker
 sudo systemctl restart docker
 ```

 #### Start the container

-```bash
+```shell
 docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
 ```

@@ -57,7 +61,7 @@ docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ol
 To run Ollama using Docker with AMD GPUs, use the `rocm` tag and the following command:

-```
+```shell
 docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
 ```

@@ -65,7 +69,7 @@ docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 114
 Now you can run a model:

-```
+```shell
 docker exec -it ollama ollama run llama3.2
 ```

docs/faq.md (18 lines changed)
@@ -24,7 +24,7 @@ By default, Ollama uses a context window size of 2048 tokens.
 To change this when using `ollama run`, use `/set parameter`:

-```
+```shell
 /set parameter num_ctx 4096
 ```

@@ -46,10 +46,15 @@ Use the `ollama ps` command to see what models are currently loaded into memory.
 ```shell
 ollama ps
-NAME          ID              SIZE     PROCESSOR    UNTIL
-llama3:70b    bcfb190ca3a7    42 GB    100% GPU     4 minutes from now
 ```

+> **Output**:
+>
+> ```
+> NAME          ID              SIZE     PROCESSOR    UNTIL
+> llama3:70b    bcfb190ca3a7    42 GB    100% GPU     4 minutes from now
+> ```

 The `Processor` column will show which memory the model was loaded in to:
 * `100% GPU` means the model was loaded entirely into the GPU
 * `100% CPU` means the model was loaded entirely in system memory

@@ -88,7 +93,7 @@ If Ollama is run as a systemd service, environment variables should be set using
 4. Reload `systemd` and restart Ollama:

-```bash
+```shell
 systemctl daemon-reload
 systemctl restart ollama
 ```

@@ -221,16 +226,19 @@ properties.
 If you are using the API you can preload a model by sending the Ollama server an empty request. This works with both the `/api/generate` and `/api/chat` API endpoints.

 To preload the mistral model using the generate endpoint, use:

 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "mistral"}'
 ```

 To use the chat completions endpoint, use:

 ```shell
 curl http://localhost:11434/api/chat -d '{"model": "mistral"}'
 ```

 To preload a model using the CLI, use the command:

 ```shell
 ollama run llama3.2 ""
 ```

@@ -250,11 +258,13 @@ If you're using the API, use the `keep_alive` parameter with the `/api/generate`
 * '0' which will unload the model immediately after generating a response

 For example, to preload a model and leave it in memory use:

 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "keep_alive": -1}'
 ```

 To unload the model and free up memory use:

 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "keep_alive": 0}'
 ```

@@ -20,13 +20,13 @@ Make sure that you use the same base model in the `FROM` command as you used to
 Now run `ollama create` from the directory where the `Modelfile` was created:

-```bash
+```shell
 ollama create my-model
 ```

 Lastly, test the model:

-```bash
+```shell
 ollama run my-model
 ```

@@ -119,7 +119,7 @@ sudo systemctl status ollama
 To customize the installation of Ollama, you can edit the systemd service file or the environment variables by running:

-```
+```shell
 sudo systemctl edit ollama
 ```

@@ -28,7 +28,7 @@ A model file is the blueprint to create and share models with Ollama.
 The format of the `Modelfile`:

-```modelfile
+```
 # comment
 INSTRUCTION arguments
 ```

@@ -49,7 +49,7 @@ INSTRUCTION arguments
 An example of a `Modelfile` creating a mario blueprint:

-```modelfile
+```
 FROM llama3.2
 # sets the temperature to 1 [higher is more creative, lower is more coherent]
 PARAMETER temperature 1

@@ -69,24 +69,30 @@ To use this:
 To view the Modelfile of a given model, use the `ollama show --modelfile` command.

-```bash
-> ollama show --modelfile llama3.2
-# Modelfile generated by "ollama show"
-# To build a new Modelfile based on this one, replace the FROM line with:
-# FROM llama3.2:latest
-FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
-TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
-
-{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
-
-{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
-
-{{ .Response }}<|eot_id|>"""
-PARAMETER stop "<|start_header_id|>"
-PARAMETER stop "<|end_header_id|>"
-PARAMETER stop "<|eot_id|>"
-PARAMETER stop "<|reserved_special_token"
-```
+```shell
+ollama show --modelfile llama3.2
+```
+
+> **Output**:
+>
+> ```
+> # Modelfile generated by "ollama show"
+> # To build a new Modelfile based on this one, replace the FROM line with:
+> # FROM llama3.2:latest
+> FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
+> TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
+>
+> {{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
+>
+> {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
+>
+> {{ .Response }}<|eot_id|>"""
+> PARAMETER stop "<|start_header_id|>"
+> PARAMETER stop "<|end_header_id|>"
+> PARAMETER stop "<|eot_id|>"
+> PARAMETER stop "<|reserved_special_token"
+> ```

 ## Instructions

@@ -94,13 +100,13 @@ To view the Modelfile of a given model, use the `ollama show --modelfile` comman
 The `FROM` instruction defines the base model to use when creating a model.

-```modelfile
+```
 FROM <model name>:<tag>
 ```

 #### Build from existing model

-```modelfile
+```
 FROM llama3.2
 ```

@@ -111,7 +117,7 @@ Additional models can be found at:
 #### Build from a Safetensors model

-```modelfile
+```
 FROM <model directory>
 ```

@@ -125,7 +131,7 @@ Currently supported model architectures:
 #### Build from a GGUF file

-```modelfile
+```
 FROM ./ollama-model.gguf
 ```

@@ -136,7 +142,7 @@ The GGUF file location should be specified as an absolute path or relative to th
 The `PARAMETER` instruction defines a parameter that can be set when the model is run.

-```modelfile
+```
 PARAMETER <parameter> <parametervalue>
 ```

@@ -183,7 +189,7 @@ TEMPLATE """{{ if .System }}<|im_start|>system
 The `SYSTEM` instruction specifies the system message to be used in the template, if applicable.

-```modelfile
+```
 SYSTEM """<system message>"""
 ```

@@ -193,7 +199,7 @@ The `ADAPTER` instruction specifies a fine tuned LoRA adapter that should apply
 #### Safetensor adapter

-```modelfile
+```
 ADAPTER <path to safetensor adapter>
 ```

@@ -204,7 +210,7 @@ Currently supported Safetensor adapters:
 #### GGUF adapter

-```modelfile
+```
 ADAPTER ./ollama-lora.gguf
 ```

@@ -212,7 +218,7 @@ ADAPTER ./ollama-lora.gguf
 The `LICENSE` instruction allows you to specify the legal license under which the model used with this Modelfile is shared or distributed.

-```modelfile
+```
 LICENSE """
 <license text>
 """

@@ -222,7 +228,7 @@ LICENSE """
 The `MESSAGE` instruction allows you to specify a message history for the model to use when responding. Use multiple iterations of the MESSAGE command to build up a conversation which will guide the model to answer in a similar way.

-```modelfile
+```
 MESSAGE <role> <message>
 ```

@@ -237,7 +243,7 @@ MESSAGE <role> <message>
 #### Example conversation

-```modelfile
+```
 MESSAGE user Is Toronto in Canada?
 MESSAGE assistant yes
 MESSAGE user Is Sacramento in Canada?

@@ -1,6 +1,7 @@
 # OpenAI compatibility

-> **Note:** OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).
+> [!NOTE]
+> OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).

 Ollama provides experimental compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.

@@ -59,8 +60,10 @@ embeddings = client.embeddings.create(
 input=["why is the sky blue?", "why is the grass green?"],
 )
 ```

 #### Structured outputs
-```py
+```python
 from pydantic import BaseModel
 from openai import OpenAI

@@ -144,7 +147,7 @@ const embedding = await openai.embeddings.create({
 ### `curl`

-``` shell
+```shell
 curl http://localhost:11434/v1/chat/completions \
 -H "Content-Type: application/json" \
 -d '{

@@ -319,7 +322,7 @@ ollama pull llama3.2
 For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:

-```
+```shell
 ollama cp llama3.2 gpt-3.5-turbo
 ```

@@ -343,7 +346,7 @@ curl http://localhost:11434/v1/chat/completions \
 The OpenAI API does not have a way of setting the context size for a model. If you need to change the context size, create a `Modelfile` which looks like:

-```modelfile
+```
 FROM <some model>
 PARAMETER num_ctx <context size>
 ```

@@ -17,6 +17,7 @@ When you run Ollama in a **container**, the logs go to stdout/stderr in the cont
 ```shell
 docker logs <container-name>
 ```

 (Use `docker ps` to find the container name)

 If manually running `ollama serve` in a terminal, the logs will be on that terminal.

@@ -28,6 +29,7 @@ When you run Ollama on **Windows**, there are a few different locations. You can
 - `explorer %TEMP%` where temporary executable files are stored in one or more `ollama*` directories

 To enable additional debug logging to help troubleshoot problems, first **Quit the running app from the tray menu** then in a powershell terminal

 ```powershell
 $env:OLLAMA_DEBUG="1"
 & "ollama app.exe"

@@ -49,12 +51,13 @@ Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]
 You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass autodetection, so for example, if you have a CUDA card, but want to force the CPU LLM library with AVX2 vector support, use:

-```
+```shell
 OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve
 ```

 You can see what features your CPU has with the following.
-```
+```shell
 cat /proc/cpuinfo| grep flags | head -1
 ```

@@ -62,8 +65,8 @@ cat /proc/cpuinfo| grep flags | head -1
 If you run into problems on Linux and want to install an older version, or you'd like to try out a pre-release before it's officially released, you can tell the install script which version to install.

-```sh
-curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="0.1.29" sh
+```shell
+curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.5.7 sh
 ```

 ## Linux tmp noexec

@@ -47,6 +47,7 @@ If Ollama is already running, Quit the tray application and relaunch it from the
 ## API Access

 Here's a quick example showing API access from `powershell`

 ```powershell
 (Invoke-WebRequest -method POST -Body '{"model":"llama3.2", "prompt":"Why is the sky blue?", "stream": false}' -uri http://localhost:11434/api/generate ).Content | ConvertFrom-json
 ```

@@ -8,7 +8,7 @@ Ollama vendors [llama.cpp](https://github.com/ggerganov/llama.cpp/) and [ggml](h
 If you update the vendoring code, start by running the following command to establish the tracking llama.cpp repo in the `./vendor/` directory.

-```
+```shell
 make -f Makefile.sync apply-patches
 ```

@@ -22,7 +22,7 @@ When updating to a newer base commit, the existing patches may not apply cleanly
 Start by applying the patches. If any of the patches have conflicts, the `git am` will stop at the first failure.

-```
+```shell
 make -f Makefile.sync apply-patches
 ```

@@ -30,7 +30,7 @@ If there are conflicts, you will see an error message. Resolve the conflicts in
 Once all patches are applied, commit the changes to the tracking repository.

-```
+```shell
 make -f Makefile.sync format-patches sync
 ```

@@ -38,13 +38,13 @@ make -f Makefile.sync format-patches sync
 When working on new fixes or features that impact vendored code, use the following model. First get a clean tracking repo with all current patches applied:

-```
+```shell
 make -f Makefile.sync clean apply-patches
 ```

 Iterate until you're ready to submit PRs. Once your code is ready, commit a change in the `./vendor/` directory, then generate the patches for ollama with

-```
+```shell
 make -f Makefile.sync format-patches
 ```

@@ -4,18 +4,18 @@
 A minimial runner for loading a model and running inference via a http web server.

-```
+```shell
 ./runner -model <model binary>
 ```

 ### Completion

-```
+```shell
 curl -X POST -H "Content-Type: application/json" -d '{"prompt": "hi"}' http://localhost:8080/completion
 ```

 ### Embeddings

-```
+```shell
 curl -X POST -H "Content-Type: application/json" -d '{"prompt": "turn me into an embedding"}' http://localhost:8080/embedding
 ```

@@ -6,14 +6,14 @@ This app builds upon Ollama to provide a desktop experience for running models.
 First, build the `ollama` binary:

-```
+```shell
 cd ..
 go build .
 ```

 Then run the desktop app with `npm start`:

-```
+```shell
 cd macapp
 npm install
 npm start