Commits · Xenobd/whisper.cpp

sync : ggml

7d38d31

ggerganov committed on Jul 28, 2025

vulkan : add fp16 support for the conv_2d kernel (llama/14872)

48e92ad

Erik Scholz committed on Jul 27, 2025

vulkan: skip empty set_rows to avoid invalid API usage (llama/14860)

22fb24a

jeffbolznv committed on Jul 27, 2025

HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624)

5422b31

deepsek committed on Jul 26, 2025

CANN: Implement GLU ops (llama/14884)

851010b

hipudding committed on Jul 26, 2025

musa: fix build warnings (unused variable) (llama/14869)

f38d409

yeahdongcn committed on Jul 26, 2025

ggml-cpu : disable GGML_NNPA by default due to instability (llama/14880)

cac085c

taronaeo committed on Jul 25, 2025

metal: SSM_SCAN performance (llama/14743)

5359e09

gabegoodhart, ggerganov committed on Jul 25, 2025

opencl: add fused `rms_norm_mul` (llama/14841)

5629961

lhez committed on Jul 25, 2025

ggml : remove invalid portPos specifiers from dot files (llama/14838)

a91e2f3

ORippler committed on Jul 25, 2025

rpc : check for null buffers in get/set/copy tensor endpoints (llama/14868)

9a5c3ef

ChrisRohlf committed on Jul 25, 2025

sched : fix multiple evaluations of the same graph with pipeline parallelism (llama/14855)

e9f5612

Diego Devesa committed on Jul 25, 2025

musa: upgrade musa sdk to rc4.2.0 (llama/14498)

a687ec3

yeahdongcn committed on Jul 24, 2025

cmake : Indent ggml-config.cmake (ggml/1310)

6bdff5c

Kai Pastor committed on Jul 24, 2025

sycl: fixed semantics of block offset calculation (llama/14814)

d3d52a4

Alberto Cabrera Pérez committed on Jul 24, 2025

metal : fix fusion across different encoders (llama/14849)

17d67da

ggerganov committed on Jul 24, 2025

sycl: fix undefined variable in work group size check (llama/14843)

bcbbf47

Donghyeon Jeong committed on Jul 24, 2025

CUDA: fix overflow in FA, tune performance (llama/14840)

10ac92f

JohannesGaessler committed on Jul 23, 2025

CUDA: fix compilation with GGML_CUDA_F16 (llama/14837)

2746afd

JohannesGaessler committed on Jul 23, 2025

CUDA: fix quantized KV cache + multiple sequences (llama/14822)

88864af

JohannesGaessler, ggerganov committed on Jul 23, 2025

ggml: fix loongarch quantize_row_q8_1 error (llama/14827)

0bd2be3

lixing-star committed on Jul 23, 2025

CANN: weight format to NZ for Ascend310P3 (llama/14407)

0274100

chen fan committed on Jul 23, 2025

CUDA: add fused rms norm (llama/14800)

79bc58c

am17an committed on Jul 23, 2025

vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817)

0c16b60

jeffbolznv committed on Jul 22, 2025

cuda : implement bf16 cpy ops and enable bf16 cont (llama/14763)

b54b644

Sigbjørn Skjæret committed on Jul 22, 2025

opencl: remove unreachable `return` (llama/14806)

cfa3731

lhez committed on Jul 22, 2025

cuda: remove linking to cublasLt (llama/14790)

fafaa8b

yeahdongcn committed on Jul 21, 2025

opencl: fix `im2col` when `KW!=KH` (llama/14803)

2fdd2df

Sigbjørn Skjæret committed on Jul 21, 2025

opencl: add conv2d kernel (llama/14403)

d579f20

mrfatso committed on Jul 21, 2025

sycl: Fix im2col (llama/14797)

931edc1

Romain Biessy committed on Jul 21, 2025

kleidiai: add support for get_rows (llama/14676)

43ba97c

Charles Xu committed on Jul 21, 2025

vulkan/cuda: Fix im2col when KW!=KH (llama/14789)

0be0329

jeffbolznv committed on Jul 21, 2025

ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14316)

5885084

etasnadi committed on Jul 19, 2025

vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#13274) (llama/14707)

0855a18

Peter0x44 committed on Jul 19, 2025

Vulkan: Fix fprintf format-security warning (llama/14770)

77a1c11

OccamRazor committed on Jul 19, 2025

cmake : fix usage issues (ggml/1257)

c38df55

Kai Pastor committed on Jul 22, 2025

ggml-cpu : remove stdlib include from repack.cpp (ggml/1276)

91c01e9

danbev committed on Jul 21, 2025

Support static xcframework packaging in build-xcframework.sh (#3322)

78de49d
unverified

Rich Waters, danbev committed on Jul 26, 2025

examples : add note about WHISPER_WASM_SINGLE_FILE [no ci] (#3332)

4a1f367
unverified

danbev committed on Jul 24, 2025

ci : add paths to build.yml (#3333)

6437539
unverified

danbev committed on Jul 24, 2025

musa: upgrade musa sdk to rc4.2.0 (#3324)

50c5b9e
unverified

R0CKSTAR committed on Jul 24, 2025

server : hide language probabilities option behind flag (#3328)

606bf70
unverified

sachaarbonel committed on Jul 21, 2025

go: fix Mac OS X builds (#3310)

2fd8067
unverified

BVK Chaitanya Chaitanya Bayapuneni committed on Jul 21, 2025

sync : ggml

ebe9052

ggerganov committed on Jul 19, 2025

metal : fuse add, mul + add tests (llama/14596)

66ae493

ggerganov committed on Jul 18, 2025

cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (llama/14741)

bb523fb

Oliver Simons committed on Jul 18, 2025

CUDA: set_rows + cpy.cu refactor (llama/14712)

536128f

am17an committed on Jul 18, 2025

use max work group size for device to replace the magic number (llama/14732)

e5e9b79

Neo Zhang Jianyu committed on Jul 18, 2025

ggml: Add initial WebGPU backend (llama/14521)

0dd208f

Reese Levine committed on Jul 16, 2025

llama : add high-throughput mode (llama/14363)

b2d73a2

ggerganov, JohannesGaessler committed on Jul 16, 2025
