whisper.cpp / ggml / src / ggml-cpu

Commit History

metal : improve FA + improve MoE (llama/12612)
04a3389

ggerganov committed on

llamafile : ppc64le GEMV forwarding for FP32. (llama/12594)
1843f18

amritahs-ibm committed on

ggml : sync/merge cmake,riscv,powerpc, add common.cmake (ggml/0)
f695cbf

ggerganov committed on

llamafile : ppc64le MMA implementation for Q4_0. (llama/12489)
d154905

amritahs-ibm committed on

ggml : fix MUL_MAT_ID repack with Q8_K (llama/12544)
a13f78c

ggerganov committed on

ggml-cpu : update KleidiAI to v1.5.0 (llama/12568)
9b4460a

Dan Johansson committed on

ggml : fix quantized cpy op (llama/12310)
608b377

ggerganov committed on

ggml : block interleaving support for Q4_K quantization for x86 AVX2 architecture (llama/12332)
0729506

Srihari-mcw committed on

ggml : add SVE support for q6_K_q8_K (llama/12361)
607a196

fj-y-saito committed on

llama: Add support for RWKV v7 architecture (llama/12412)
727de7e

mollysama committed on

cmake: Enable specifying exact PowerPC CPU architecture (ggml/1138)
aac4d16

Christian Kastner committed on

examples : command.wasm updates (#2904)
0db3249

danbev committed on

ggml-cpu: faster AVX2 variant for IQ1_M (llama/12216)
591cbfb

Rémy O committed on

ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154)
05466a9

Rémy O committed on

ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)
c9a49f9

vmobilis committed on

ggml : portability fixes for VS 2017 (llama/12150)
49e3343

mgroeber9110 Marcus Groeber committed on

ggml : fix kleidiai build (llama/12159)
dbc0180

ag2s20150909 committed on

ggml : upgrade init_tensor API to return a ggml_status (llama/11854)
d6b6852

William Tambellini slaren committed on

ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (llama/12064)
459beb1

Prashant Vithule vithulep committed on

ggml-cpu: Fix build with sve (llama/12059)
4be146e

mollysama committed on

cuda/cpu: Increase support for fp16 unary operations (ggml/1125)
67e8c32

cmdr2 committed on

Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)
2b94a24

cmdr2 committed on

ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)
4aa54ec

Aaron Teo Jinyang He junchao-zhao committed on

ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390)
9de6d81

Charles Xu committed on

ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (llama/11917)
1a1acd2

Prashant Vithule vithulep ggerganov committed on

repo : update links to new url (llama/11886)
9705bb5

ggerganov committed on

ggml: optimize some vec dot functions for LoongArch ASX (llama/11842)
e3acbfc

Jinyang He committed on

llamafile: use member variable instead of constant for iq4nlt (llama/11780)
0cb2d04

jmorganca committed on

ggml-cpu : add chunking support to mul_mat_id (llama/11666)
e59d9a7

Diego Devesa committed on

ggml : x2 speed for WASM by optimizing SIMD (llama/11453)
464a186

Xuan-Son Nguyen camel-cdr committed on

ggml : fix multi-threaded clamp_f32 (llama/11824)
1b1d6a8

Richard committed on

ggml-cpu: Fix duplicate MATMUL_INT8 (llama/11817)
05b9e78

ownia committed on

Fix #11802: Compile bug - RegQueryValueExA changed to RegQueryValueEx (llama/11803)
86969ac

Sheldon Robinson committed on

ggml: Fix data race in ggml threadpool (llama/11736)
5554d5f

Karol Kontny committed on

ggml : optimize and build warning fix for LoongArch (llama/11709)
b82d241

Jinyang He committed on

ggml : fix LoongArch compile error with 128-bit SIMD (llama/11701)
f7296aa

junchao-zhao committed on

cmake : fix compile assumptions for power9/etc (#2777)
4683df3

midnight committed on

CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
855a9fe

JohannesGaessler committed on

vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)
3bb9e77

jeffbolznv committed on

CUDA: backwards pass for misc. ops, add tests (llama/11257)
2fbcec1

JohannesGaessler committed on

ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (llama/11227)
bf3dc93

fj-y-saito ggerganov committed on

RoPE: fix back, CUDA support for back + noncont. (llama/11240)
131a21e

JohannesGaessler committed on

ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. (ggml/1065)
8e57313

issixx issi committed on

llamafile : ppc64le MMA INT8 implementation (llama/10912)
6f18eed

amritahs-ibm committed on

ggml-backend : only offload from host buffers (fix) (llama/11124)
9ac3c7e

Diego Devesa committed on

ggml : fixes for AVXVNNI instruction set with MSVC and Clang (llama/11027)
d13ac16

Srihari-mcw slaren committed on

ggml : more perfo with llamafile tinyblas on x86_64 (llama/10714)
b284406

Djip007 committed on

ggml : use wstring for backend search paths (llama/10960)
656e8b1

Diego Devesa committed on

ggml : fix arm enabled features check (llama/10961)
06cddad

Diego Devesa committed on