ollama/model
Jesse Gross 282bfaaa95 ollamarunner: Use a separate context per multimodal input
Currently there is a single context per sequence, shared by all
multimodal inputs. Since we build a vision encoder graph per
image, with a large number of images we can eventually hit the
maximum number of graph nodes per context.

This change uses a separate context for each image, ensuring
that the available resource limits are consistent.
2025-03-14 15:38:54 -07:00
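
The idea behind the change can be sketched as follows. This is a minimal Go sketch, not the actual ollama ml API: Backend, Context, Compute, and Image are hypothetical stand-ins used only to contrast a single shared per-sequence context with a per-image context.

```go
package sketch

// Context is a hypothetical stand-in for a graph-building context with a
// bounded number of graph nodes.
type Context interface {
	// Compute builds and runs a graph of the given size within this
	// context's node budget.
	Compute(nodes int) error
	Close()
}

// Backend is a hypothetical stand-in for whatever hands out fresh contexts.
type Backend interface {
	NewContext() Context
}

// Image represents one multimodal input and the size of its vision encoder graph.
type Image struct {
	GraphNodes int
}

// encodeAllShared reuses one context for every image in a sequence: with many
// images the accumulated graph nodes can exceed the per-context maximum.
func encodeAllShared(b Backend, images []Image) error {
	ctx := b.NewContext()
	defer ctx.Close()
	for _, img := range images {
		if err := ctx.Compute(img.GraphNodes); err != nil {
			return err
		}
	}
	return nil
}

// encodePerImage creates a fresh context per image, so each vision encoder
// graph sees the same node budget regardless of how many images the sequence has.
func encodePerImage(b Backend, images []Image) error {
	for _, img := range images {
		ctx := b.NewContext()
		err := ctx.Compute(img.GraphNodes)
		ctx.Close()
		if err != nil {
			return err
		}
	}
	return nil
}
```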
imageproc
input ml: Allow models to constrain inputs to a single batch 2025-03-14 15:38:54 -07:00
models ollamarunner: Use a separate context per multimodal input 2025-03-14 15:38:54 -07:00
testdata gemma2 impl 2025-03-11 14:35:08 -07:00
model.go ollamarunner: Use a separate context per multimodal input 2025-03-14 15:38:54 -07:00
model_test.go model: Update encoder cache to use multimodal input processing handler 2025-03-09 17:05:26 -07:00
process_text.go set non-causal attention 2025-03-11 14:49:18 -07:00
process_text_spm.go model: validate left and right pairs before merging them 2025-03-11 14:49:20 -07:00
process_text_spm_test.go model: add more spm tokenizer tests 2025-03-11 14:49:20 -07:00
process_text_test.go model: Don't unconditionally add special tokens 2025-03-06 16:54:16 -08:00