mirror of
https://github.com/ollama/ollama.git
synced 2025-05-11 18:36:41 +02:00
The context (and therefore associated input tensors) was not being properly closed when images were being processed. We were trying to close them but in reality we were closing over an empty list, preventing anything from actually being freed. Fixes #10434 |
||
---|---|---|
.. | ||
common | ||
llamarunner | ||
ollamarunner | ||
README.md | ||
runner.go |
runner
Note: this is a work in progress
A minimial runner for loading a model and running inference via a http web server.
./runner -model <model binary>
Completion
curl -X POST -H "Content-Type: application/json" -d '{"prompt": "hi"}' http://localhost:8080/completion
Embeddings
curl -X POST -H "Content-Type: application/json" -d '{"prompt": "turn me into an embedding"}' http://localhost:8080/embedding