Commit graph

  • 8bfe841ae1
    Recommend manual installs go to /usr/local AJ Jordan 2025-05-07 15:44:35 -04:00
  • 020644a6db
    Add BlazeLama to community section HardCodeDev 2025-05-07 22:47:51 +04:00
  • 3098c8b29b
    CI: trigger downstream release process (#10508) Daniel Hiltgen 2025-05-07 10:35:12 -07:00
  • 5e380c3b42
    sched: fix race leading to orphaned runners (#10599) Daniel Hiltgen 2025-05-07 09:38:17 -07:00
  • 392de84031
    api: remove unused RetrieveModelResponse type (#10603) Jeffrey Morgan 2025-05-06 23:08:03 -07:00
  • 591c3cb281
    api: remove unused type jmorganca 2025-05-06 22:55:01 -07:00
  • a0c08915bd
    docs/distributed_inferencing.md: Fix spelling mistakes and grammar ecyht2 2025-05-07 09:05:55 +08:00
  • af31ccefc0
    fix data race in WriteGGUF (#10598) Daniel Hiltgen 2025-05-06 17:36:38 -07:00
  • fa393554b9
    remove cuda v11 (#10569) Daniel Hiltgen 2025-05-06 17:33:19 -07:00
  • c58cce8e18
    remove cuda v11 Daniel Hiltgen 2025-05-05 09:54:23 -07:00
  • 19f7c50750
    sched: fix race leading to orphaned runners Daniel Hiltgen 2025-05-06 10:24:53 -07:00
  • 881585b8e6
    fix data race in WriteGGUF Daniel Hiltgen 2025-05-06 15:26:27 -07:00
  • 54db9872e7
    Merge branch 'ollama:main' into better_cpu_memory WingsDrafterwork 2025-05-07 00:00:42 +02:00
  • 307e3b3e1d
    readme: add Flufy to community integrations (#9719) Aharon Bensadoun 2025-05-07 00:47:35 +03:00
  • d20d8eace7
    Merge branch 'main' into patch-1 Jeffrey Morgan 2025-05-06 14:46:36 -07:00
  • 4090aca97b
    server: send 405 instead of 404 for unallowed methods (#10275) Devon Rifkin 2025-05-06 14:45:37 -07:00
  • 19dac72b74
    Use system-wide KeepAlive setting for model preloader inactivity timeout Your Name 2025-05-06 22:48:55 +02:00
  • dad09a08f4
    Make CPU model preloading optional with OLLAMA_PRELOAD_CPU_MODEL env var Your Name 2025-05-06 22:29:24 +02:00
  • d36383c00c
    Optimize CPU model performance by implementing model preloading to reduce first token latency Your Name 2025-05-06 22:26:52 +02:00
  • 92ce438de0
    server: remove internal cmd (#10595) Michael Yang 2025-05-06 13:05:01 -07:00
  • 5371e5d59b
    server: remove internal cmd Michael Yang 2025-05-06 12:21:26 -07:00
  • 66bbe45f57
    lint: enable usetesting, disable tenv Michael Yang 2025-05-06 11:51:48 -07:00
  • 811b890687
    fix: stream accumulator exits early Michael Yang 2025-05-06 11:30:10 -07:00
  • 424810450f
    Move quantization to new backend (#10363) Daniel Hiltgen 2025-05-06 11:20:48 -07:00
  • ed9acd023a
    Remove "add model quantizations" Daniel Hiltgen 2025-05-02 15:41:55 -07:00
  • 3b329d9b12
    Move quantization logic to GGML via new backend Daniel Hiltgen 2025-04-18 14:45:12 -07:00
  • 95e744beeb
    discover: fix compiler warnings (#10572) Michael Yang 2025-05-06 10:49:22 -07:00
  • 3131734098
    Update README.md Roger Lee 2025-05-06 17:43:46 +08:00
  • 7fd48b5fb9
    api: remove unused sampling parameters jmorganca 2025-05-05 22:13:24 -07:00
  • 6828e89c28
    Merged with main ecyht2 2025-05-06 08:37:42 +08:00
  • fa6f7ea53f
    discover: fix compiler warnings Michael Yang 2025-05-05 10:55:37 -07:00
  • 3b2d2c8326
    api: remove unused or unsupported api options (#10574) Jeffrey Morgan 2025-05-05 14:54:40 -07:00
  • 6dd9fdbf22
    api: remove unused or unsupported api options jmorganca 2025-05-05 11:50:17 -07:00
  • d931ee8f22
    create blobs in parallel (#10135) Michael Yang 2025-05-05 11:59:26 -07:00
  • 0581130e4f
    Merge 024c358f1d into 7073600797 Matt Rutkowski 2025-05-05 20:55:43 +02:00
  • a0a1fb463a
    build: disable cuda compression jmorganca/cuda-compression-none jmorganca 2025-05-05 10:57:28 -07:00
  • 7073600797
    ggml: Reduce log level of "key not found" Jesse Gross 2025-05-05 10:37:16 -07:00
  • 9d502f29a2
    ggml: Reduce log level of "key not found" Jesse Gross 2025-05-05 10:37:16 -07:00
  • b1c40138da
    win: lint fix (#10571) Daniel Hiltgen 2025-05-05 11:08:12 -07:00
  • 50e6866313
    win: lint fix Daniel Hiltgen 2025-05-05 10:55:48 -07:00
  • 17466217e5
    Hide empty terminal window (#8668) Ashok Gelal 2025-05-05 21:51:46 +05:45
  • 1703d1472e
    server: fix panic when runner.Options is nil (#10566) Jeffrey Morgan 2025-05-05 09:01:33 -07:00
  • 913905028b
    all: fix cgo compiler warnings on windows (#10563) Jeffrey Morgan 2025-05-05 08:02:39 -07:00
  • 0bd65195d3
    Create a generic device properties cache, and use this in scale.cu Alastair D'Silva 2025-05-05 18:42:09 +10:00
  • 9a9daae266
    server: fix panic when runner.Options is nil jmorganca 2025-05-04 20:56:02 -07:00
  • 7e5c8eee5c
    file close check and close. (#10554) 湛露先生 2025-05-05 06:37:59 +08:00
  • 65dabbf3a7
    all: fix cgo compiler warnings on windows jmorganca 2025-05-04 13:02:02 -07:00
  • 629718129e
    Merge 105e82a13b into 6a74bba7e7 Ruslan Semagin 2025-05-04 19:13:38 +00:00
  • 0c7cf16928
    Merge 2069f0a83e into 6a74bba7e7 frob 2025-05-04 09:28:18 +00:00
  • 30a612d5f4
    runner/llamarunner/runner.go, runner/ollamarunner/runner.go: Added RPC support for runners ecyht2 2025-05-04 15:38:01 +08:00
  • 8e0a4e9619
    CMakeLists.txt: Added RPC support for ggml ecyht2 2025-05-03 18:19:19 +08:00
  • 3311c5b5be
    reverted to use the first layer kv cache for extra buffer calculation Tej Kiran 2025-05-04 02:34:59 +00:00
  • 5040ea9c80
    Merge branch 'main' of github.com:itej89/ollama into main Tej Kiran 2025-05-04 01:43:20 +00:00
  • 8f2982940a
    Revert "Updated formatting" Tej Kiran 2025-05-04 01:43:12 +00:00
  • 8ed740adf8
    Merge branch 'ollama:main' into main tej 2025-05-03 20:37:10 -05:00
  • ce5f662bab
    Reverted the buffer size calculation to use the first layer. Optimized overflow computation Tej Kiran 2025-05-04 01:36:35 +00:00
  • a8dec647dd
    file close check and close. zhanluxianshen 2025-05-04 08:39:31 +08:00
  • 6a74bba7e7
    win: ensure ollama paths come first (#10549) v0.6.8-rc0 v0.6.8 Daniel Hiltgen 2025-05-03 13:11:48 -07:00
  • 76ea735aaf
    sched: logging improvements (#10550) Daniel Hiltgen 2025-05-03 12:01:56 -07:00
  • d2bc16805e
    sched: logging improvements Daniel Hiltgen 2025-05-03 11:50:10 -07:00
  • 49a4599678
    win: ensure ollama paths come first Daniel Hiltgen 2025-05-03 11:24:34 -07:00
  • 7669935af5
    Merge branch 'ollama:main' into chnxq/add-oneapi chnxq 2025-05-04 02:14:03 +08:00
  • f645eec1ea
    discover/gpu.go: Updated RPC communication to support new protocol ecyht2 2025-05-03 15:40:35 +08:00
  • ca5c5677be
    server/sched.go: Fixed missing legacy gpu module ecyht2 2025-05-03 15:39:28 +08:00
  • 4915ef9c2f
    add param OLLAMA_INTEL_IF_TYPE to switch discover for Intel GPU chnxq 2025-05-03 12:49:53 +08:00
  • 8068cd10cf
    Merge remote-tracking branch 'upstream/main' into feat/rpc ecyht2 2025-05-03 10:48:53 +08:00
  • b177dcf524
    Merge remote-tracking branch 'upstream/main' into feat/rpc ecyht2 2025-05-03 10:47:25 +08:00
  • dd1d4e99e7
    readme: add llama 4 models (#10530) aritra saha 2025-05-03 08:15:02 +05:30
  • b29fbdc8df
    ran gofmt on memory.go to pass the CI tests Tej Kiran 2025-05-03 00:05:05 +00:00
  • e51578bd21
    Merge branch 'ollama:main' into main tej 2025-05-02 19:01:05 -05:00
  • 611d3a17ed
    server: add python tool parsing logic ParthSareen 2025-04-28 17:10:40 -07:00
  • a6ef73f4f2
    ggml: Fix race that resulted in "context canceled" when loading Jesse Gross 2025-05-01 17:06:53 -07:00
  • 53081fc41e
    create blobs in parallel Michael Yang 2025-04-04 18:26:49 -07:00
  • 5cde18a5d7
    error on out of tree files Michael Yang 2025-05-01 15:19:56 -07:00
  • fcbb2d9348
    ggml: Fix race that resulted in "context canceled" when loading Jesse Gross 2025-05-01 17:06:53 -07:00
  • 4dfb00e785
    default max term height Michael Yang 2025-04-08 08:57:44 -07:00
  • c2f5d6662b
    ollamarunner: Re-enable worst case graph preallocation. Jesse Gross 2025-05-02 11:24:19 -07:00
  • 2ed1140856
    ollamarunner: Re-enable worst case graph preallocation. Jesse Gross 2025-05-02 11:24:19 -07:00
  • 1726f8fd36
    docs: added OpenLLMetry link Nir Gazit 2025-05-02 15:56:48 +02:00
  • 378c5a766c
    Merge branch 'main' into main Bastian Machek 2025-05-02 15:43:35 +02:00
  • e7582cd01f
    add llama 4 models into readme aritra saha 2025-05-02 13:22:02 +05:30
  • 68d5a73766
    Merge 937722a2af into 57fb759f3c Dionisie Stratulat 2025-05-02 14:36:13 +08:00
  • 17cb39f7ef
    Merge 37a8c3a0ce into 57fb759f3c Vkanhan 2025-05-02 14:36:12 +08:00
  • 442767f62a
    Merge 0d12c086fc into 57fb759f3c frob 2025-05-02 14:36:09 +08:00
  • 6e89120065
    Merge 494d477843 into 57fb759f3c frob 2025-05-02 14:36:09 +08:00
  • df9a7a9efb
    Merge c01dd86a80 into 57fb759f3c qwerty108109 2025-05-02 14:36:06 +08:00
  • 57fb759f3c
    readme: update link to langchain in community integrations (#10465) Harsh Nevse 2025-05-02 11:38:51 +05:30
  • 3ece1f49b7
    merge main of ollama chnxq 2025-05-02 12:43:13 +08:00
  • 8dd12c873d
    llama: update to commit e1e8e099 (#10513) Jeffrey Morgan 2025-05-01 18:24:09 -07:00
  • e6d2d04121
    image: add vision capability for projector-based models (#10509) frob 2025-05-02 01:50:20 +02:00
  • 074bac8447
    kvcache: Log batch size if we can't find a slot Jesse Gross 2025-05-01 13:45:32 -07:00
  • 8e8f2c6d67
    ollamarunner: Fix memory leak when processing images Jesse Gross 2025-05-01 11:34:02 -07:00
  • 301960cc95
    ollamarunner: Fix memory leak when processing images Jesse Gross 2025-05-01 11:34:02 -07:00
  • 48e3548251
    llama: update to e1e8e099 jmorganca 2025-04-30 22:14:33 -07:00
  • 938e8447e8
    readme: add Jirapt project to community integrations (#10522) AliAhmedNada 2025-05-02 00:49:47 +03:00
  • 444befa70f
    Update README.md Jeffrey Morgan 2025-05-01 14:49:13 -07:00
  • d5d5f0c445
    readme: change granite3.2 to granite3.3 (#10525) aritra saha 2025-05-02 03:16:09 +05:30
  • ec0ef4058d
    Merge branch 'ollama:main' into mmap frob 2025-05-01 22:57:55 +02:00
  • c16f1dfd6c
    kvcache: Log batch size if we can't find a slot Jesse Gross 2025-05-01 13:45:32 -07:00
  • 024c358f1d
    Remove temp. changes awaiting upstream merges Matt Rutkowski 2025-04-28 14:43:33 -05:00