Michael Yang
16fca86c4a
digest files in parallel
2025-04-07 09:46:31 -07:00
Bruce MacDonald
6bd0a983cd
model: support for mistral-small in the ollama runner
Mistral is a popular research lab making open source models. This updates
the forward pass of llama architecture models to support both llama models
and mistral models by accounting for additional metadata present in mistral
models, and finding the correct dimensions for the output projection.
2025-04-03 16:57:36 -07:00
Parth Sareen
00ebda8cc4
Revert "parser: remove role validation from Modelfile parser" ( #9917 )
This reverts commit ffbfe833da.
2025-03-21 12:38:09 -07:00
rylativity
ffbfe833da
parser: remove role validation from Modelfile parser ( #9874 )
* updates parser/parser.go to allow arbitrary roles in Modelfile MESSAGE blocks
2025-03-20 13:11:17 -07:00
Michael Yang
58245413f4
next ollama runner ( #7913 )
feat: add new Ollama engine using ggml through cgo
This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this.
- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`
- `ml.Tensor` defines the interface for a tensor and tensor operations
This is the first implementation of the new engine. Follow up PRs will implement more features:
- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-02-13 16:31:21 -08:00
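The three interfaces named in the commit above can be pictured with a minimal Go sketch. Only the names `Model`, `Backend`, `Tensor` and the `Forward` method come from the commit message; every method signature and the toy types (`sliceTensor`, `mapBackend`, `biasModel`) are assumptions made for illustration and are not the definitions in `model/model.go` or `ml/backend.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// Tensor is a hypothetical, heavily reduced stand-in for ml.Tensor.
type Tensor interface {
	Shape() []int
	Add(other Tensor) Tensor
}

// Backend is a hypothetical stand-in for ml.Backend: it hands loaded
// tensors to models by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a hypothetical stand-in for model.Model: an architecture
// implements its forward propagation in Forward.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy 1-D tensor backed by a Go slice.
type sliceTensor []float32

func (t sliceTensor) Shape() []int { return []int{len(t)} }

// Add returns the element-wise sum. The toy assumes both operands are
// sliceTensor values of equal length.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.(sliceTensor)
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy architecture whose forward pass adds a bias tensor.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias := b.Get("bias")
	if bias == nil {
		return nil, errors.New(`tensor "bias" not loaded`)
	}
	return input.Add(bias), nil
}

func main() {
	backend := mapBackend{"bias": sliceTensor{1, 2}}
	var m Model = biasModel{}
	out, err := m.Forward(backend, sliceTensor{10, 20})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints [11 22]
}
```

The split mirrors the description in the commit: the model only knows how to compute, the backend only knows how to supply tensors, and the tensor interface carries the operations between them.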
frob
294b6f5a22
docs: remove tfs_z option from documentation ( #8515 )
2025-01-21 09:28:59 -08:00
Jeffrey Morgan
42cf4db601
parser: fix parsing Modelfiles with multiple FROM commands ( #8449 )
2025-01-16 00:14:04 -08:00
Patrick Devine
2539f2dbf9
Fix absolute path names + gguf detection ( #8428 )
2025-01-14 19:01:24 -08:00
Patrick Devine
32bd37adf8
make the modelfile path relative for ollama create ( #8380 )
2025-01-10 16:14:08 -08:00
Jeffrey Morgan
1deafd8254
llama: update vendored code to commit 46e3556 ( #8308 )
2025-01-08 11:22:01 -08:00
Patrick Devine
86a622cbdc
Update the /api/create endpoint to use JSON ( #7935 )
Changes `POST /api/create` to accept JSON instead of a Modelfile.
This is a breaking change.
2024-12-31 18:02:30 -08:00
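To make the `/api/create` change above concrete, here is a rough Go sketch that builds a JSON request body in place of Modelfile text. The commit log does not spell out the request schema, so the struct `createRequest`, the helper `buildCreateBody`, and the field names `model`, `from` and `system` are assumptions for illustration only; check the API documentation for the actual fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createRequest is a hypothetical subset of a JSON create request.
// The field names are assumptions, not taken from the commit log.
type createRequest struct {
	Model  string `json:"model"`
	From   string `json:"from,omitempty"`
	System string `json:"system,omitempty"`
}

// buildCreateBody encodes the request as JSON, which a client would
// send as the body of POST /api/create.
func buildCreateBody(model, from, system string) ([]byte, error) {
	return json.Marshal(createRequest{Model: model, From: from, System: system})
}

func main() {
	body, err := buildCreateBody("mario", "llama3.2", "You are Mario.")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Structured fields let the server validate each value on its own, whereas a Modelfile string has to be parsed first.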
Stefan Weil
abfdc4710f
all: fix typos in documentation, code, and comments ( #7021 )
2024-12-10 12:58:06 -08:00
Patrick Devine
4efb98cb4f
add line numbers for parser errors ( #7326 )
2024-11-14 13:59:44 -08:00
Jesse Gross
a909417602
runner.go: Remove unused arguments
Now that server.cpp is gone, we don't need to keep passing arguments
that were only ignored and only kept for compatibility.
2024-11-06 13:32:18 -08:00
Michael Yang
b732beba6a
lint
2024-08-01 17:06:06 -07:00
Tibor Schmidt
f3d7a481b7
feat: add support for min_p ( resolve #1142 ) ( #1825 )
2024-07-27 14:37:40 -07:00
Josh Yan
7e571f95f0
trimspace test case
2024-07-01 11:07:48 -07:00
Josh Yan
26e4e66faf
updated parsefile test
2024-07-01 09:43:49 -07:00
Josh Yan
9bd00041fa
trim all params
2024-06-27 11:18:38 -07:00
Josh Yan
4e986a823c
unquote, trim space
2024-06-27 10:59:15 -07:00
Michael Yang
d528e1af75
fix utf16 for multibyte runes
2024-06-13 13:07:42 -07:00
Michael Yang
cd234ce22c
parser: add test for multibyte runes
2024-06-13 13:07:42 -07:00
Michael Yang
20b9f8e6f4
Revert "proper utf16 support"
This reverts commit 66ab48772f.
This change broke UTF-8 scanning of multi-byte runes.
2024-06-13 10:22:16 -07:00
Michael Yang
66ab48772f
proper utf16 support
2024-06-05 13:11:50 -07:00
Michael Yang
e40145a39d
lint
2024-06-04 11:13:30 -07:00
Patrick Devine
ccdf0b2a44
Move the parser back + handle utf16 files ( #4533 )
2024-05-20 11:26:45 -07:00
Michael Yang
119589fcb3
rename parser to model/file
2024-05-01 09:53:50 -07:00
Michael Yang
bd8eed57fc
fix parser name
2024-05-01 09:52:54 -07:00
Michael Yang
9cf0f2e973
use parser.Format instead of templating modelfile
2024-05-01 09:52:54 -07:00
Michael Yang
176ad3aa6e
parser: add commands format
2024-05-01 09:52:54 -07:00
Michael Yang
4d08363580
comments
2024-05-01 09:52:54 -07:00
Michael Yang
8907bf51d2
fix multiline
2024-05-01 09:52:54 -07:00
Michael Yang
abe614c705
tests
2024-05-01 09:52:54 -07:00
Michael Yang
238715037d
linting
2024-05-01 09:52:54 -07:00
Michael Yang
c0a00f68ae
refactor modelfile parser
2024-05-01 09:52:54 -07:00
Patrick Devine
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
Daniel Hiltgen
fedd705aea
Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Patrick Devine
238ac5e765
Add unit tests for Parser ( #1815 )
2024-01-05 14:04:31 -08:00
Michael Yang
38fe1a368b
fix: trim space in modelfile fields
2023-12-05 11:57:29 -08:00
Bruce MacDonald
a0c3e989de
deprecate modelfile embed command ( #759 )
2023-10-16 11:07:37 -04:00
Michael Yang
6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang
21e6197c0b
Merge pull request #322 from jmorganca/no-comment-warning
no warning on comments
2023-08-10 16:24:41 -07:00
Michael Yang
20bf000e55
no warning on comments
2023-08-10 16:22:38 -07:00
Michael Yang
40d0c4a1dc
length check for parameters
2023-08-10 16:09:02 -07:00
Michael Yang
6de5d032e1
implement loading ggml lora adapters through the modelfile
2023-08-10 09:23:39 -07:00
Bruce MacDonald
a6f6d18f83
embed text document in modelfile
2023-08-08 11:27:17 -04:00
Michael Yang
9c7f30d31c
use max scan token size to hold large objects
2023-07-28 11:43:31 -07:00
Michael Yang
f5ac8ddfb4
refactor scan multiline for reuse
2023-07-27 11:30:51 -07:00
Michael Yang
24c2c77057
fix multiline string
the data needs to remove the multiline quotes but include the command:
e.g.
TEMPLATE """
my template values
"""
should be
TEMPLATE
my template values
after scanning
2023-07-25 11:51:43 -07:00
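The multiline behaviour described in the commit above (drop the `"""` delimiters, keep the command) can be sketched in Go as follows. `stripMultiline` is a hypothetical helper written for this illustration; it is not the scanner code from the parser, and it assumes the whole command has already been collected into one string.

```go
package main

import (
	"fmt"
	"strings"
)

// stripMultiline removes the triple-quote delimiters from a scanned
// token while keeping the leading command, so that
//
//	TEMPLATE """
//	my template values
//	"""
//
// becomes "TEMPLATE\nmy template values". Tokens without a complete
// pair of delimiters are returned unchanged.
func stripMultiline(token string) string {
	const quote = `"""`
	start := strings.Index(token, quote)
	if start < 0 {
		return token // no multiline string present
	}
	end := strings.LastIndex(token, quote)
	if end == start {
		return token // opening delimiter without a closing one
	}
	command := strings.TrimSpace(token[:start])
	body := strings.Trim(token[start+len(quote):end], "\n")
	return command + "\n" + body
}

func main() {
	in := "TEMPLATE \"\"\"\nmy template values\n\"\"\""
	fmt.Printf("%q\n", stripMultiline(in)) // prints "TEMPLATE\nmy template values"
}
```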
Mohit Gaur
f5f79049c2
Incorporate code review improvements
2023-07-25 22:52:23 +05:30