ollama

mirror of https://github.com/ollama/ollama.git synced 2025-05-11 02:16:36 +02:00

Author	SHA1	Message	Date
Michael Yang	7ba9fa9c7d	fixes for maverick	2025-04-25 16:59:20 -07:00
Michael Yang	8bf11b84c1	chunked attention	2025-04-25 16:59:20 -07:00
Michael Yang	f0c66e6dea	llama4	2025-04-25 16:59:20 -07:00
Michael Yang	dc1e81f027	convert: use -1 for read all	2025-04-25 16:59:01 -07:00
Michael Yang	4892872c18	convert: change to colmajor	2025-04-25 15:27:39 -07:00
Michael Yang	2fec73eef6	fix write gguf padding	2025-04-16 10:24:35 -07:00
Bruce MacDonald	6bd0a983cd	model: support for mistral-small in the ollama runner. Mistral is a popular research lab making open source models. This updates the forward pass of llama architecture models to support both llama models and mistral models by accounting for additional metadata present in mistral models, and finding the correct dimensions for the output projection.	2025-04-03 16:57:36 -07:00
Bruce MacDonald	9876c9faa4	chore(all): replace instances of interface with any (#10067) Both interface{} and any (which is just an alias for interface{} introduced in Go 1.18) represent the empty interface that all types satisfy.	2025-04-02 09:44:27 -07:00
Bruce MacDonald	61a8825216	convert: return name of unsupported architecture (#9862) When a model's architecture cannot be converted return the name of the unsupported arch in the error message.	2025-03-18 10:38:28 -07:00
Patrick Devine	80c7ce381b	fix: change default context size for gemma3 (#9744)	2025-03-13 13:59:19 -07:00
jmorganca	83f0ec8269	all: address linter errors	2025-03-11 14:49:20 -07:00
Michael Yang	63a394068c	use 2d pooling	2025-03-11 14:49:20 -07:00
Patrick Devine	2e54d72fc3	fix gemma3 1b conversion	2025-03-11 14:49:19 -07:00
Michael Yang	6b32a2d549	compat with upstream gguf	2025-03-11 14:49:19 -07:00
Michael Yang	d368c039f0	skip repacking vision tensors	2025-03-11 14:49:19 -07:00
Patrick Devine	9b54267e69	fix configs	2025-03-11 14:49:19 -07:00
Michael Yang	46bb0169c4	update model	2025-03-11 14:49:19 -07:00
Patrick Devine	c62861f4fa	fix conversion	2025-03-11 14:49:18 -07:00
Michael Yang	0df1800436	set non-causal attention	2025-03-11 14:49:18 -07:00
Patrick Devine	631fecc6d9	temporary work around for converting spm	2025-03-11 14:49:18 -07:00
Michael Yang	4b037a97dc	add gemma vision encoder	2025-03-11 14:49:17 -07:00
Patrick Devine	5f74d1fd47	gemma2 impl	2025-03-11 14:35:08 -07:00
Michael Yang	58245413f4	next ollama runner (#7913) feat: add new Ollama engine using ggml through cgo. This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this. - `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go` - `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go` - `ml.Tensor` defines the interface for a tensor and tensor operations. This is the first implementation of the new engine. Follow up PRs will implement more features: - non-greedy sampling (#8410) - integration with Ollama and KV caching (#8301) - more model support (#9080) with more coming soon. Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2025-02-13 16:31:21 -08:00
Josh	93a8daf285	convert: import support for command-r models from safetensors (#6063) Co-authored-by: Patrick Devine <patrick@infrahq.com>	2025-01-15 16:31:22 -08:00
Bruce MacDonald	f6f3713001	convert: qwen2 from safetensors (#8408) Add native support for converting Qwen2 family models (including Qwen2.5) from safetensors to gguf format so we can run it.	2025-01-14 10:34:37 -08:00
Stefan Weil	abfdc4710f	all: fix typos in documentation, code, and comments (#7021)	2024-12-10 12:58:06 -08:00
Michael Yang	4456012956	fix unmarshaling merges	2024-12-04 09:21:56 -08:00
Patrick Devine	c7cb0f0602	image processing for llama3.2 (#6963) Co-authored-by: jmorganca <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me> Co-authored-by: Jesse Gross <jesse@ollama.com>	2024-10-18 16:12:35 -07:00
Patrick Devine	84b84ce2db	catch when model vocab size is set correctly (#6714)	2024-09-09 17:18:54 -07:00
Patrick Devine	608e87bf87	Fix gemma2 2b conversion (#6645)	2024-09-05 17:02:28 -07:00
Michael Yang	9cfd2dd3e3	Merge pull request #6522 from ollama/mxyng/detect-chat detect chat template from configs that contain lists	2024-08-28 11:04:18 -07:00
Patrick Devine	6c1c1ad6a9	throw an error when encountering unsupport tensor sizes (#6538)	2024-08-27 17:54:04 -07:00
Michael Yang	60e47573a6	more tokenizer tests	2024-08-27 14:51:10 -07:00
Michael Yang	eae3af6807	clean up convert tokenizer	2024-08-27 11:11:43 -07:00
Michael Yang	3eb08377f8	detect chat template from configs that contain lists	2024-08-27 10:49:33 -07:00
Patrick Devine	0c819e167b	convert safetensor adapters into GGUF (#6327)	2024-08-23 11:29:56 -07:00
Michael Yang	77903ab8b4	llama3.1	2024-08-21 11:49:31 -07:00
Michael Yang	3546bbd08c	convert gemma2	2024-08-20 17:27:51 -07:00
Michael Yang	5a28b9cf5f	bert	2024-08-20 17:27:34 -07:00
Bruce MacDonald	aec77d6a05	support new "longrope" attention factor	2024-08-12 15:13:29 -07:00
Michael Yang	6ffb5cb017	add conversion for microsoft phi 3 mini/medium 4k, 128	2024-08-12 15:13:29 -07:00
Michael Yang	b732beba6a	lint	2024-08-01 17:06:06 -07:00
Michael Yang	d8e2664c33	convert: fix parse functions	2024-07-31 15:58:55 -07:00
Michael Yang	eafc607abb	convert: only extract large files	2024-07-31 15:58:55 -07:00
Michael Yang	781fc2d576	Update convert/reader_safetensors.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-07-31 15:58:55 -07:00
Michael Yang	df993fa37b	comments	2024-07-31 15:58:55 -07:00
Michael Yang	5e9db9fb0b	refactor convert	2024-07-31 15:58:33 -07:00
Michael Yang	6b252918fb	update convert test to check result data	2024-07-31 10:59:38 -07:00
Jeffrey Morgan	d835368eb8	convert: capture `head_dim` for mistral (#5818)	2024-07-22 16:16:22 -04:00
Michael Yang	e40145a39d	lint	2024-06-04 11:13:30 -07:00

77 commits