Commit History
vulkan : reuse parent extra for views (llama/7806)
b9b60de
fix softmax r2r result wrong issue (llama/7811)
c3a7159
CUDA: refactor mmq, dmmv, mmvq (llama/7716)
849ff52
ggml : refactor rope norm/neox (llama/7634)
ded0c68
Allow number of nodes in CUDA graph to change (llama/7738)
6124287
agray3
committed on
ggml : remove OpenCL (llama/7735)
4ff3b72
ggml : prevent builds with -ffinite-math-only (llama/7726)
154f0f8
llama : offload to RPC in addition to other backends (llama/7640)
eab8082
ggml : use OpenMP as a thread pool (llama/7606)
7e5d850
Vulkan Mixture of Experts (MoE) support (llama/7628)
ad9ee26
kompute : implement op_getrows_f32 (llama/6403)
fa0872f
woachk
committed on
fix bug introduced in using calloc (llama/7701)
f22c7e4
Dave Airlie
committed on
Fix FlashAttention debug test, FP32 assert (llama/7684)
1bed92f
CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (llama/7681)
d4c0faf
CUDA: quantized KV support for FA vec (llama/7527)
315df8c
ggml : fix loongson compile warnings (llama/7537)
c1442f3
faster avx512 exp implementation (llama/7551)
6dbbbab
ggml : fix loongarch build (O2 issue) (llama/7636)
133ffbf
junchao-loongson
committed on
metal : remove invalid asserts (llama/7617)
562afce
metal : add missing asserts (llama/7617)
be552ab
ggml : fix YARN + add tests + add asserts (llama/7617)
15da5f7
cuda : non-cont concat support (llama/7610)
64d3007
llama-bench : add support for the RPC backend (llama/7435)
d460266
ggml : use atomic_flag for critical section (llama/7598)
68c6582
slaren
committed on
examples : adapt to new ggml_concat (ggml/0)
36af6c5
ggml : fix typo in ggml.c (llama/7603)
f06f1cb
Align GEMM dispatch (llama/7566)
2171dc6
sycl : fix assert (llama/7563)
b4fb287
vulkan: properly initialize vulkan devices for LLAMA_SPLIT_MODE_NONE (llama/7552)
da90a1e
rpc : resource management rework (llama/7562)
7571b13
fix ggml_sycl_mul_mat_id() to match the change of api (llama/7436)
f0ee71c
Neo Zhang
committed on
ggml : generalize GGML_OP_CONCAT (llama/7563)
8d359ad
update HIP_UMA #7399 (llama/7414)
7097123
Allow multiple copy function pointers for CUDA graph kernel param updates (llama/7565)
143f6df
agray3
committed on
Fix q_xxs using mul_mat_q (llama/7459)
0be4f48
AidanBeltonS
committed on
Add freq factors (llama/7495)
340b830
AidanBeltonS
committed on
metal : add GGML_OP_REPEAT kernels (llama/7557)
0534b5d
metal : disable FA kernel for HS=256 (llama/7556)
0c32e28
ggml : restore ggml_rope_xpos_inplace (ggml/0)
0641dee
ggml: aarch64: SVE kernels for q8_0_q8_0, q4_0_q8_0 vector dot (llama/7433)
51f504f
Masaya, Kato
committed on