ollama/runner/ollamarunner
Jesse Gross fe623c2cf4 ollamarunner: Multi-modal worst case graph
We currently preallocate compute graph memory for the worst-case
batch of text tokens. This adds support for doing the same for
images.

Note that image models are more complicated than text models in
how they process their inputs, so there may be cases where this
approach isn't completely generic for all models. It does cover all
currently supported models, though.
2025-05-15 13:46:20 -07:00
cache.go ollamarunner: Base cached tokens on current prompt 2025-05-15 13:46:20 -07:00
cache_test.go ollamarunner: Separate text and multimodal graphs 2025-05-15 13:46:20 -07:00
multimodal.go ollamarunner: Multi-modal worst case graph 2025-05-15 13:46:20 -07:00
runner.go ollamarunner: Multi-modal worst case graph 2025-05-15 13:46:20 -07:00