Commit History
whisper : remove unnecessary GGML_UNUSED macro (#2960)
b2e42a5
sync : ggml
91362b9
metal : improve FA + improve MoE (llama/12612)
04a3389
vulkan: fix coopmat shader generation when cross-compiling (llama/12272)
7585c4a
Icenowy Zheng
bandoti
llamafile : ppc64le GEMV forwarding for FP32. (llama/12594)
1843f18
amritahs-ibm
rpc : send hash when tensor data is above some fixed threshold (llama/12496)
c39f9c4
opencl: add multi and vision rope, `gelu_quick` and `im2col` (llama/12600)
3261fcd
lhez
bindings.go : add DetectedLanguage to go bindings (#2947)
1830e27
Amanda Der Bedrosian
ruby : fix test failures in test_whisper (#2955)
2ccaffe
examples : support progress_callback API for addon.node (#2941)
3f6a806
Lin Xiaodong
linxiaodong
xcf : fix visionOS build
2220ea9
files : remove old wkv6 (#0)
ee92ae5
sync : ggml
9745a6d
ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0)
f695cbf
llamafile : ppc64le MMA implementation for Q4_0. (llama/12489)
d154905
amritahs-ibm
SYCL: implement memset ggml backend buffer interface (llama/12580)
3f95f2b
Akarshan Biswas
HIP: Add support for RDNA4 targets (llama/12372)
a73f01f
Slobodan Josic
metal : refactor mat-vec code (llama/12569)
71d72f9
ggml : fix MUL_MAT_ID repack with Q8_K (llama/12544)
a13f78c
ggml-cpu : update KleidiAI to v1.5.0 (llama/12568)
9b4460a
Dan Johansson
SYCL: disable Q4_0 reorder optimization (llama/12560)
33f8316
Akarshan Biswas
opencl: simplify kernel embedding logic in cmakefile (llama/12503)
5f131ac
lhez
Max Krasnyansky
CUDA: Fix clang warnings (llama/12540)
efa6dac
R0CKSTAR
vulkan: fix mul_mat_vec failure in backend tests (llama/12529)
09dd86a
ggml : fix quantized cpy op (llama/12310)
608b377
musa: refine compute capability (llama/12493)
5e508d2
R0CKSTAR
vulkan: Optimize mul_mat_vec p021 and nc shaders (llama/12505)
6868981
Vulkan: RTE rounding for cpy to quant (llama/12480)
8707beb
vulkan: workaround for AMD Windows driver 16 bit unpack8 bug (llama/12472)
417a5d6
Eve
Fix build on Windows when ccache enabled (ggml/9954) (llama/9976)
bbd0292
蕭澧邦
Romain Biessy
sycl: cleanup oneDNN related code (llama/12097)
959346b
Svetlozar Georgiev
ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (llama/12332)
0729506
Srihari-mcw
CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (llama/12183)
3a7ca19
vulkan: optimize iq1 coopmat2 dequant functions (llama/12427)
53dd8ad
Fix visionOS build and add CI (llama/12415)
ecb4322
vulkan: Submit once enough matmul work has been recorded (llama/12406)
ec77b2c
opencl: improve profiling (llama/12442)
4abe3ae
lhez
musa: override warp_size of musa device to 32 (llama/12445)
184c152
R0CKSTAR
SYCL: using graphs is configurable by environment variable and compile option (llama/12371)
c18969f
Łukasz Ślusarczyk
Romain Biessy
ggml : add SVE support for q6_K_q8_K (llama/12361)
607a196
fj-y-saito
Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentation and driver issues (llama/12434)
55088d3
fixed compilation warnings in ggml-sycl (llama/12424)
77ff985
Łukasz Ślusarczyk
llama: Add support for RWKV v7 architecture (llama/12412)
727de7e
cuda : enable CUDA Graph on CUDA Toolkit < 12.x (llama/12394)
1e69b8c
Gaurav Garg