Commit History
vulkan : add fp16 support for the conv_2d kernel (llama/14872) 48e92ad
vulkan: skip empty set_rows to avoid invalid API usage (llama/14860) 22fb24a
HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624) 5422b31
deepsek committed on
CANN: Implement GLU ops (llama/14884) 851010b
musa: fix build warnings (unused variable) (llama/14869) f38d409
ggml-cpu : disable GGML_NNPA by default due to instability (llama/14880) cac085c
metal: SSM_SCAN performance (llama/14743) 5359e09
opencl: add fused `rms_norm_mul` (llama/14841) 5629961
lhez committed on
ggml : remove invalid portPos specifiers from dot files (llama/14838) a91e2f3
rpc : check for null buffers in get/set/copy tensor endpoints (llama/14868) 9a5c3ef
sched : fix multiple evaluations of the same graph with pipeline parallelism (llama/14855) e9f5612
Diego Devesa committed on
musa: upgrade musa sdk to rc4.2.0 (llama/14498) a687ec3
cmake : Indent ggml-config.cmake (ggml/1310) 6bdff5c
Kai Pastor committed on
sycl: fixed semantics of block offset calculation (llama/14814) d3d52a4
Alberto Cabrera Pérez committed on
metal : fix fusion across different encoders (llama/14849) 17d67da
sycl: fix undefined variable in work group size check (llama/14843) bcbbf47
Donghyeon Jeong committed on
CUDA: fix overflow in FA, tune performance (llama/14840) 10ac92f
CUDA: fix compilation with GGML_CUDA_F16 (llama/14837) 2746afd
CUDA: fix quantized KV cache + multiple sequences (llama/14822) 88864af
ggml: fix loongarch quantize_row_q8_1 error (llama/14827) 0bd2be3
lixing-star committed on
CANN: weight format to NZ for Ascend310P3 (llama/14407) 0274100
chen fan committed on
CUDA: add fused rms norm (llama/14800) 79bc58c
vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817) 0c16b60
cuda : implement bf16 cpy ops and enable bf16 cont (llama/14763) b54b644
Sigbjørn Skjæret committed on
opencl: remove unreachable `return` (llama/14806) cfa3731
lhez committed on
cuda: remove linking to cublasLt (llama/14790) fafaa8b
opencl: fix `im2col` when `KW!=KH` (llama/14803) 2fdd2df
Sigbjørn Skjæret committed on
opencl: add conv2d kernel (llama/14403) d579f20
sycl: Fix im2col (llama/14797) 931edc1
Romain Biessy committed on
kleidiai: add support for get_rows (llama/14676) 43ba97c
Charles Xu committed on
vulkan/cuda: Fix im2col when KW!=KH (llama/14789) 0be0329
ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316) 5885084
vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707) 0855a18
Peter0x44 committed on
Vulkan: Fix fprintf format-security warning (llama/14770) 77a1c11
cmake : fix usage issues (ggml/1257) c38df55
Kai Pastor committed on
ggml-cpu : remove stdlib include from repack.cpp (ggml/1276) 91c01e9
Support static xcframework packaging in build-xcframework.sh (#3322) 78de49d unverified
examples : add note about WHISPER_WASM_SINGLE_FILE [no ci] (#3332) 4a1f367 unverified
ci : add paths to build.yml (#3333) 6437539 unverified
musa: upgrade musa sdk to rc4.2.0 (#3324) 50c5b9e unverified
R0CKSTAR committed on
server : hide language probabilities option behind flag (#3328) 606bf70 unverified
go: fix Mac OS X builds (#3310) 2fd8067 unverified
BVK Chaitanya Chaitanya Bayapuneni committed on
sync : ggml ebe9052
metal : fuse add, mul + add tests (llama/14596) 66ae493
cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (llama/14741) bb523fb
Oliver Simons committed on
CUDA: set_rows + cpy.cu refactor (llama/14712) 536128f
use max work group size for device to replace the magic number (llama/14732) e5e9b79
Neo Zhang Jianyu committed on
ggml: Add initial WebGPU backend (llama/14521) 0dd208f
Reese Levine committed on