whisper.cpp / ggml

Commit History

ggml: Add initial WebGPU backend (llama/14521)
4b3da1d

Reese Levine committed on

ggml : initial zDNN backend (llama/14975)
6dd510c

taronaeo committed on

ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (llama/15379)
a575f57

compilade committed on

vulkan: disable spirv-opt for bfloat16 shaders (llama/15352)
cf24af7

jeffbolznv committed on

vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355)
054584a

jeffbolznv, OccamRazor committed on

vulkan: support sqrt (llama/15370)
e5406c0

Dong Won Kim committed on

vulkan: Optimize argsort (llama/15354)
80a188c

jeffbolznv committed on

vulkan: fuse adds (llama/15252)
ad199b1

jeffbolznv committed on

vulkan: Support mul_mat_id with f32 accumulators (llama/15337)
41a76e6

jeffbolznv committed on

vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (llama/15334)
a6fa78e

jeffbolznv committed on

OpenCL: add initial FA support (llama/14987)
8ece1ee

mrfatso committed on

opencl: add initial mxfp4 support via mv (llama/15270)
1a0281c

lhez, shawngu-quic committed on

vulkan : fix out-of-bounds access in argmax kernel (llama/15342)
78a1865

ggerganov committed on

vulkan : fix compile warnings on macos (llama/15340)
e3107ff

ggerganov committed on

ggml: initial IBM zDNN backend (llama/14975)
449e1a4

taronaeo committed on

CUDA: fix negative KV_max values in FA (llama/15321)
6e3a7b6

JohannesGaessler committed on

vulkan: perf_logger improvements (llama/15246)
d48d508

jeffbolznv committed on

ggml: fix ggml_conv_1d_dw bug (ggml/1323)
4496862

jasonni2 committed on

cuda : fix GGML_CUDA_GRAPHS=OFF (llama/15300)
59c694d

Sigbjørn Skjæret committed on

HIP: bump requirement to rocm 6.1 (llama/15296)
58a3802

uvos committed on

ggml : update `ggml_rope_multi` (llama/12665)
b4896dc

Judd, ggerganov committed on

ggml : repack block_iq4_nlx8 (llama/14904)
db4407f

ggerganov committed on

CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf improvement on kernel-level and 10% perf increase for Gemma3n (llama/15132)
c768824

ORippler committed on

ggml-rpc: chunk send()/recv() to avoid EINVAL for very large tensors over RPC (macOS & others) (llama/15188)
c8284f2

aixsatoshi, Shinnosuke Takagi committed on

HIP: disable sync warp shuffel operators from clr amd_warp_sync_functions.h (llama/15273)
8fca6dd

uvos committed on

sycl: Fix and disable more configurations of mul_mat (llama/15151)
7b868ed

Romain Biessy committed on

opencl: allow mixed f16/f32 `add` (llama/15140)
345810b

mrfatso committed on

CUDA cmake: add `-lineinfo` for easier debug (llama/15260)
008e169

am17an committed on

CANN: GGML_OP_CPY optimization (llama/15070)
73e90ff

Chenguang Li committed on

musa: fix failures in test-backend-ops for mul_mat_id op (llama/15236)
4168dda

yeahdongcn committed on

CANN: Add broadcast for softmax and FA (llama/15208)
db87c9d

hipudding committed on

kleidiai: fix unsigned overflow bug (llama/15150)
9d5f58c

Charles Xu committed on

cuda: refactored ssm_scan and use CUB (llama/13291)
7a187d1

David Zhao committed on

CUDA: add attention sinks for tile and wmma (llama/15178)
46e7c87

am17an committed on

gguf-py : add Numpy MXFP4 de/quantization support (llama/15111)
324f3bd

compilade committed on

ggml : fix field name when new ggml_backend (llama/14944)
685748d

AN Long committed on

CUDA: attention sinks for mma FlashAttention (llama/15157)
0ab9aba

JohannesGaessler committed on

opencl: support sink in `soft_max` (attn sinks) (llama/15152)
d8664e4

lhez committed on

vulkan: support fattn sinks (llama/15126)
d7e9115

jeffbolznv committed on

vulkan: Add env var to disable host visible vidmem (llama/15109)
5ec4382

jeffbolznv committed on

HIP: add cmake option to enable compiler output of kernel resource usage metrics (llama/15103)
577f7e4

uvos committed on

ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (llama/15094)
f84562e

Christian Kastner committed on

CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16 (llama/15131)
1d24833

JohannesGaessler committed on

fix profiling crash (llama/15072)
67ec576

mrfatso committed on

opencl: add `swiglu_oai` and `add_id` (llama/15121)
1c97db6

lhez committed on

ggml : fix fallback to CPU for ununsupported ops (llama/15118)
2b7ae5e

Diego Devesa committed on

CANN: add support for ACL Graph (llama/15065)
137a0dc

Chenguang Li committed on

llama : add gpt-oss (llama/15091)
bf225d6

ggerganov, ngxson, slaren committed on