ollama

mirror of https://github.com/ollama/ollama.git synced 2025-05-11 02:16:36 +02:00

Devon Rifkin 7c94471d38 ggml: more accurate estimates for head count array case. Also standardized the approach by always treating `HeadCount()` and `HeadCountKV()` as arrays, filling them with the same value when they're a scalar in the original GGUF.	2025-04-10 16:28:34 -07:00
llm_darwin.go	Optimize container images for startup (#6547)	2024-09-12 12:10:30 -07:00
llm_linux.go	Optimize container images for startup (#6547)	2024-09-12 12:10:30 -07:00
llm_windows.go	runner: Set windows above normal priority (#6905)	2024-09-21 16:54:49 -07:00
memory.go	ggml: more accurate estimates for head count array case	2025-04-10 16:28:34 -07:00
memory_test.go	ggml: Support heterogeneous KV cache layer sizes in memory estimation	2025-03-26 13:16:03 -07:00
server.go	llm: set done reason at server level (#9830)	2025-04-03 10:19:24 -07:00
server_test.go	llm: do not error on "null" format (#8139)	2024-12-17 09:49:37 -08:00
status.go	Improve crash reporting (#7728)	2024-11-19 16:26:57 -08:00