Commit History
vulkan : add fp16 support for the conv_2d kernel (llama/14872)
48e92ad
Erik Scholz
vulkan: skip empty set_rows to avoid invalid API usage (llama/14860)
22fb24a
HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624)
5422b31
deepsek
CANN: Implement GLU ops (llama/14884)
851010b
musa: fix build warnings (unused variable) (llama/14869)
f38d409
ggml-cpu : disable GGML_NNPA by default due to instability (llama/14880)
cac085c
metal: SSM_SCAN performance (llama/14743)
5359e09
opencl: add fused `rms_norm_mul` (llama/14841)
5629961
lhez
ggml : remove invalid portPos specifiers from dot files (llama/14838)
a91e2f3
rpc : check for null buffers in get/set/copy tensor endpoints (llama/14868)
9a5c3ef
sched : fix multiple evaluations of the same graph with pipeline parallelism (llama/14855)
e9f5612
Diego Devesa
musa: upgrade musa sdk to rc4.2.0 (llama/14498)
a687ec3
cmake : Indent ggml-config.cmake (ggml/1310)
6bdff5c
Kai Pastor
sycl: fixed semantics of block offset calculation (llama/14814)
d3d52a4
Alberto Cabrera Pérez
metal : fix fusion across different encoders (llama/14849)
17d67da
sycl: fix undefined variable in work group size check (llama/14843)
bcbbf47
Donghyeon Jeong
CUDA: fix overflow in FA, tune performance (llama/14840)
10ac92f
CUDA: fix compilation with GGML_CUDA_F16 (llama/14837)
2746afd
CUDA: fix quantized KV cache + multiple sequences (llama/14822)
88864af
ggml: fix loongarch quantize_row_q8_1 error (llama/14827)
0bd2be3
lixing-star
CANN: weight format to NZ for Ascend310P3 (llama/14407)
0274100
chen fan
CUDA: add fused rms norm (llama/14800)
79bc58c
vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817)
0c16b60
cuda : implement bf16 cpy ops and enable bf16 cont (llama/14763)
b54b644
Sigbjørn Skjæret
opencl: remove unreachable `return` (llama/14806)
cfa3731
lhez
cuda: remove linking to cublasLt (llama/14790)
fafaa8b
opencl: fix `im2col` when `KW!=KH` (llama/14803)
2fdd2df
Sigbjørn Skjæret
opencl: add conv2d kernel (llama/14403)
d579f20
sycl: Fix im2col (llama/14797)
931edc1
Romain Biessy
kleidiai: add support for get_rows (llama/14676)
43ba97c
Charles Xu
vulkan/cuda: Fix im2col when KW!=KH (llama/14789)
0be0329
ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316)
5885084
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707)
0855a18
Peter0x44
Vulkan: Fix fprintf format-security warning (llama/14770)
77a1c11
cmake : fix usage issues (ggml/1257)
c38df55
Kai Pastor
ggml-cpu : remove stdlib include from repack.cpp (ggml/1276)
91c01e9
Support static xcframework packaging in build-xcframework.sh (#3322)
78de49d
examples : add note about WHISPER_WASM_SINGLE_FILE [no ci] (#3332)
4a1f367
ci : add paths to build.yml (#3333)
6437539
musa: upgrade musa sdk to rc4.2.0 (#3324)
50c5b9e
R0CKSTAR
server : hide language probabilities option behind flag (#3328)
606bf70
go: fix Mac OS X builds (#3310)
2fd8067
BVK Chaitanya
Chaitanya Bayapuneni
sync : ggml
ebe9052
metal : fuse add, mul + add tests (llama/14596)
66ae493
cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (llama/14741)
bb523fb
Oliver Simons
CUDA: set_rows + cpy.cu refactor (llama/14712)
536128f
use max work group size for device to replace the magic number (llama/14732)
e5e9b79
Neo Zhang Jianyu
ggml: Add initial WebGPU backend (llama/14521)
0dd208f
Reese Levine