Commits · Xenobd/whisper.cpp

sync : ggml

116dcaa

ggerganov committed on Jul 12, 2025

sync : resolve conflicts (ggml/0)

497add0

ggerganov committed on Jul 12, 2025

vulkan: support SET_ROWS (llama/14587)

9821f43

jeffbolznv committed on Jul 12, 2025

vulkan: optimizations for deepseek prompt processing (llama/14555)

04b631e

jeffbolznv committed on Jul 12, 2025

model : support LiquidAI LFM2 hybrid family (llama/14620)

07ff90a

Tarek Dakhran committed on Jul 11, 2025

HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (llama/14634)

4354560

Slobodan Josic committed on Jul 11, 2025

opencl: add tiled mul_mat_f16_f32 (llama/14535)

398dc49

mrfatso committed on Jul 10, 2025

opencl: add `set_rows` for `f16` and `f32` (llama/14547)

5e203ec

lhez committed on Jul 10, 2025

SYCL: Initial set_rows kernel implementation (llama/14562)

e62ef85

Akarshan Biswas committed on Jul 10, 2025

cuda : support Falcon-H1 state size for SSM_SCAN (llama/14602)

92b2d32

compilade committed on Jul 10, 2025

ggml : add ggml_scale_bias (llama/14417)

573d50a

ngxson HF Staff committed on Jul 9, 2025

ggml : prevent integer overflow in gguf tensor size calculation (llama/14595)

31f34e7

yuuoniy committed on Jul 9, 2025

vulkan: optimize flash attention split_k_reduce (llama/14554)

45fbb42

jeffbolznv committed on Jul 8, 2025

vulkan : fix rope with partial rotation and non-cont src (llama/14582)

367fa85

jeffbolznv committed on Jul 8, 2025

cuda : fix rope with partial rotation and non-cont src (llama/14580)

aaf2d96

ggerganov committed on Jul 8, 2025

CUDA: add bilinear interpolation for upscale (llama/14563)

68ded09

am17an committed on Jul 8, 2025

musa: fix build warnings (unused variable) (llama/14561)

891b1d1

yeahdongcn committed on Jul 7, 2025

CUDA: add bf16 and i32 to getrows (llama/14529)

014494c

am17an committed on Jul 7, 2025

vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (llama/14485)

effd61f

Eve Rémy Oudompheng committed on Jul 6, 2025

vulkan: fix rms_norm+mul fusion (llama/14545)

0791e65

jeffbolznv committed on Jul 6, 2025

vulkan: Handle updated FA dim2/3 definition (llama/14518)

d1e619e

jeffbolznv committed on Jul 5, 2025

opencl: add GELU_ERF (llama/14476)

b19d736

Sigbjørn Skjæret committed on Jul 5, 2025

metal : disable fast math in all quantize kernels (llama/14528)

df9d510

ggerganov committed on Jul 4, 2025

CANN: Replace aclrtMemsetSync with aclnnInplaceZero operator (llama/14002)

b9b5859

luyhcsu luyuhong committed on Jul 4, 2025

ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)

f798922

Sigbjørn Skjæret committed on Jul 3, 2025

opencl : broadcast for soft_max (llama/14510)

4434043

lhez committed on Jul 3, 2025

vulkan: support mixed/deepseekR1 FA head sizes (llama/14509)

90cefa0

jeffbolznv committed on Jul 3, 2025

ggml: backward pass for split swiglu (llama/14483)

45c8df6

JohannesGaessler committed on Jul 3, 2025

Fix conditional enabling following arch checks for ggml-sycl (llama/14504)

1f15602

Nicolò Scipione committed on Jul 3, 2025

kv-cache : use ggml_set_rows (llama/14285)

7d6d9e8

ggerganov committed on Jul 3, 2025

ggml : fix FA mask dim 2 and 3 (llama/14505)

a89dc81

ggerganov committed on Jul 3, 2025

CUDA: add dynamic shared mem to softmax, refactor general usage (llama/14497)

8e1f56c

am17an committed on Jul 2, 2025

llama : initial Mamba-2 support (llama/9126)

1b4087e

compilade committed on Jul 2, 2025

CUDA: add softmax broadcast (llama/14475)

05351ac

am17an committed on Jul 2, 2025

CUDA: broadcasting for FlashAttention mask (llama/14500)

47e02a8

JohannesGaessler committed on Jul 2, 2025

vulkan: support softmax/FA batch and broadcast (llama/14449)

f6b0b76

jeffbolznv committed on Jul 1, 2025

ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)

ebacb3e

ggerganov committed on Jul 12, 2025

opencl : fix possible buffer overflow in dump_tensor (llama/14490)

deb934d

jeffzhou2000 committed on Jul 2, 2025

opencl : skip empty nodes on cgraph compute (llama/14491)

5c36e7c

Eric Zhang committed on Jul 2, 2025

opencl : update upscale to support align corners (llama/14488)

2b95b05

lhez committed on Jul 2, 2025

ggml : Callback before abort (llama/14481)

ccee17d

Bytealyzer Diego Devesa committed on Jul 2, 2025

ci : disable fast-math for Metal GHA CI (llama/14478)

ec4b1b3

ggerganov committed on Jul 1, 2025

CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (llama/14411)

d8d5b0b

Chenguang Li committed on Jul 1, 2025

vulkan: Split large mul_mat_id to fit in shared memory (llama/14451)

bf678f0

jeffbolznv committed on Jul 1, 2025

add GELU_ERF (llama/14455)

235ebf7

Sigbjørn Skjæret committed on Jul 1, 2025

vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291)

666e65b

Acly committed on Jul 3, 2025

vulkan : implement ggml_roll (ggml/1290)

968f9e8

Acly committed on Jul 3, 2025

ggml : add version function to get lib version (ggml/1286)

880f633

danbev

ggerganov committed on Jul 2, 2025

server : add dtw.params for v3-large-turbo (#3307)

1250fd1
unverified

accessiblepixel committed on Jul 7, 2025

feat: support vad for addon.node (#3301)

f795870
unverified

Lin Xiaodong linxiaodong committed on Jul 2, 2025

