mirror of https://github.com/ollama/ollama.git synced 2025-05-11 18:36:41 +02:00

`runner`

Note: this is a work in progress.

A minimal runner for loading a model and running inference via an HTTP web server.

./runner -model <model binary>

curl -X POST -H "Content-Type: application/json" -d '{"prompt": "hi"}' http://localhost:8080/completion

curl -X POST -H "Content-Type: application/json" -d '{"prompt": "turn me into an embedding"}' http://localhost:8080/embedding