Commit graph

62 commits

Author SHA1 Message Date
Jeffrey Morgan
fa9973cd7f
api: remove unused sampling parameters (#10581) 2025-05-08 08:31:08 -07:00
Daniel Hiltgen
424810450f
Move quantization to new backend (#10363)
* Move quantization logic to GGML via new backend

This moves the model aware logic to Go code and calls GGMLs quantization code for model creation.

* Remove "add model quantizations"

This is no longer needed now that quantization is implemented in Go+GGML code directly.
2025-05-06 11:20:48 -07:00
Jeffrey Morgan
3b2d2c8326
api: remove unused or unsupported api options (#10574)
Some options listed in api/types.go are not supported in
newer models, or have been deprecated in the past. This is
the first of a series of PRs to clean up the API options
2025-05-05 14:54:40 -07:00
Michael Yang
d931ee8f22
create blobs in parallel (#10135)
* default max term height
* error on out of tree files
2025-05-05 11:59:26 -07:00
Michael Yang
16fca86c4a digest files in parallel 2025-04-07 09:46:31 -07:00
Bruce MacDonald
6bd0a983cd model: support for mistral-small in the ollama runner
Mistral is a popular research lab making open source models. This updates
the forward pass of llama architecture models to support both llama models
and mistral models by accounting for additional metadata present in mistral
models, and finding the correct dimensions for the output projection.
2025-04-03 16:57:36 -07:00
Parth Sareen
00ebda8cc4
Revert "parser: remove role validation from Modelfile parser" (#9917)
This reverts commit ffbfe833da.
2025-03-21 12:38:09 -07:00
rylativity
ffbfe833da
parser: remove role validation from Modelfile parser (#9874)
* updates parser/parser.go to allow arbitrary roles in Modelfile MESSAGE blocks
2025-03-20 13:11:17 -07:00
Michael Yang
58245413f4
next ollama runner (#7913)
feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`
- `ml.Tensor` defines the interface for a tensor and tensor operations

This is the first implementation of the new engine. Follow up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-02-13 16:31:21 -08:00
frob
294b6f5a22
docs: remove tfs_z option from documentation (#8515) 2025-01-21 09:28:59 -08:00
Jeffrey Morgan
42cf4db601
parser: fix parsing Modelfiles with multiple FROM commands (#8449) 2025-01-16 00:14:04 -08:00
Patrick Devine
2539f2dbf9
Fix absolute path names + gguf detection (#8428) 2025-01-14 19:01:24 -08:00
Patrick Devine
32bd37adf8
make the modelfile path relative for ollama create (#8380) 2025-01-10 16:14:08 -08:00
Jeffrey Morgan
1deafd8254
llama: update vendored code to commit 46e3556 (#8308) 2025-01-08 11:22:01 -08:00
Patrick Devine
86a622cbdc
Update the /api/create endpoint to use JSON (#7935)
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
2024-12-31 18:02:30 -08:00
Stefan Weil
abfdc4710f
all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
Patrick Devine
4efb98cb4f
add line numbers for parser errors (#7326) 2024-11-14 13:59:44 -08:00
Jesse Gross
a909417602 runner.go: Remove unused arguments
Now that server.cpp is gone, we don't need to keep passing arguments
that were only ignored and only kept for compatibility.
2024-11-06 13:32:18 -08:00
Michael Yang
b732beba6a lint 2024-08-01 17:06:06 -07:00
Tibor Schmidt
f3d7a481b7
feat: add support for min_p (resolve #1142) (#1825) 2024-07-27 14:37:40 -07:00
Josh Yan
7e571f95f0 trimspace test case 2024-07-01 11:07:48 -07:00
Josh Yan
26e4e66faf updated parsefile test 2024-07-01 09:43:49 -07:00
Josh Yan
9bd00041fa trim all params 2024-06-27 11:18:38 -07:00
Josh Yan
4e986a823c unquote, trimp space 2024-06-27 10:59:15 -07:00
Michael Yang
d528e1af75 fix utf16 for multibyte runes 2024-06-13 13:07:42 -07:00
Michael Yang
cd234ce22c parser: add test for multibyte runes 2024-06-13 13:07:42 -07:00
Michael Yang
20b9f8e6f4 Revert "proper utf16 support"
This reverts commit 66ab48772f.

this change broke utf-8 scanning of multi-byte runes
2024-06-13 10:22:16 -07:00
Michael Yang
66ab48772f proper utf16 support 2024-06-05 13:11:50 -07:00
Michael Yang
e40145a39d lint 2024-06-04 11:13:30 -07:00
Patrick Devine
ccdf0b2a44
Move the parser back + handle utf16 files (#4533) 2024-05-20 11:26:45 -07:00
Michael Yang
119589fcb3 rename parser to model/file 2024-05-01 09:53:50 -07:00
Michael Yang
bd8eed57fc fix parser name 2024-05-01 09:52:54 -07:00
Michael Yang
9cf0f2e973 use parser.Format instead of templating modelfile 2024-05-01 09:52:54 -07:00
Michael Yang
176ad3aa6e parser: add commands format 2024-05-01 09:52:54 -07:00
Michael Yang
4d08363580 comments 2024-05-01 09:52:54 -07:00
Michael Yang
8907bf51d2 fix multiline 2024-05-01 09:52:54 -07:00
Michael Yang
abe614c705 tests 2024-05-01 09:52:54 -07:00
Michael Yang
238715037d linting 2024-05-01 09:52:54 -07:00
Michael Yang
c0a00f68ae refactor modelfile parser 2024-05-01 09:52:54 -07:00
Patrick Devine
7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Daniel Hiltgen
fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Patrick Devine
238ac5e765
Add unit tests for Parser (#1815) 2024-01-05 14:04:31 -08:00
Michael Yang
38fe1a368b fix: trim space in modelfile fields 2023-12-05 11:57:29 -08:00
Bruce MacDonald
a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Michael Yang
6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang
21e6197c0b
Merge pull request #322 from jmorganca/no-comment-warning
no warning on comments
2023-08-10 16:24:41 -07:00
Michael Yang
20bf000e55 no warning on comments 2023-08-10 16:22:38 -07:00
Michael Yang
40d0c4a1dc length check for parameters 2023-08-10 16:09:02 -07:00
Michael Yang
6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Bruce MacDonald
a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00