Default branch

3fa78598a1 · cmd: strip single quotes from image page (#10636) · Updated 2025-05-10 03:05:43 +02:00

Branches

086d683f9c · ollamarunner: Multi-modal worst case graph · Updated 2025-05-10 02:23:18 +02:00

1
3

46c95b25dd · checkpoint · Updated 2025-05-10 02:05:16 +02:00

27
8

f1a561ea63 · remove unpad · Updated 2025-05-10 01:22:08 +02:00

1
4

2b2a0d2308 · feat: add threshold to dump options · Updated 2025-05-10 00:52:19 +02:00

1
1

5e6a5b2b36 · Delete 0017-add-ollama-vocab-for-grammar-support.patch · Updated 2025-05-10 00:11:17 +02:00

1
47

20c5fd39c8 · Merge branch 'main' into drifkin/array-head-count-simple · Updated 2025-05-08 20:46:52 +02:00

2
3

715952705e · model: framework for testing forward pass · Updated 2025-05-08 18:25:12 +02:00

4
1

23e8ac9428 · wip? · Updated 2025-05-08 04:00:44 +02:00

51
2

77f4594e80 · WIP thinking API support · Updated 2025-05-08 01:15:46 +02:00

36
1

1546bc4767 · feat: qwen3 dense · Updated 2025-05-07 23:03:27 +02:00

6
1

855de683ca · get eos_token_id from generation_config.json · Updated 2025-05-06 08:51:35 +02:00

16
1

a0a1fb463a · build: disable cuda compression · Updated 2025-05-05 20:20:57 +02:00

18
1

67335dede2 · lower default NUM_PARALLEL to 2 · Updated 2025-04-29 11:03:51 +02:00 · mirrors

47
1

d20cd8df80 · fix incorrect chat truncation · Updated 2025-04-29 01:11:36 +02:00 · mirrors

51
1

f4ab82f0b4 · llama: sync · Updated 2025-04-26 01:38:05 +02:00 · mirrors

68
1

34ae8077d1 · wip: write tensors in parallel · Updated 2025-04-25 22:39:12 +02:00 · mirrors

70
3

b4cd1118ab · checkpoint for vscode · Updated 2025-04-25 03:23:23 +02:00 · mirrors

102
4

7c94471d38 · ggml: more accurate estimates for head count array case · Updated 2025-04-11 01:28:34 +02:00 · mirrors

102
2

04950140ec · server: do not attempt to parse offset file as gguf · Updated 2025-04-09 18:41:46 +02:00 · mirrors

105
1

3bc9d42e2e · rebase + fix tests · Updated 2025-04-04 02:31:21 +02:00 · mirrors

119
2