Commit History

metal : improve FA + improve MoE (llama/12612)
04a3389

ggerganov committed on

ggml : fix quantized cpy op (llama/12310)
608b377

ggerganov committed on

llama: Add support for RWKV v7 architecture (llama/12412)
727de7e

mollysama committed on

examples : command.wasm updates (#2904)
0db3249

danbev committed on

ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154)
05466a9

Rémy O committed on

ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)
c9a49f9

vmobilis committed on

ggml : portability fixes for VS 2017 (llama/12150)
49e3343

mgroeber9110 (Marcus Groeber) committed on

cuda/cpu: Increase support for fp16 unary operations (ggml/1125)
67e8c32

cmdr2 committed on

Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)
2b94a24

cmdr2 committed on

ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)
4aa54ec

Aaron Teo, Jinyang He, junchao-zhao committed on

ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390)
9de6d81

Charles Xu committed on

repo : update links to new url (llama/11886)
9705bb5

ggerganov committed on

ggml-cpu : add chunking support to mul_mat_id (llama/11666)
e59d9a7

Diego Devesa committed on

ggml : fix multi-threaded clamp_f32 (llama/11824)
1b1d6a8

Richard committed on

ggml: Fix data race in ggml threadpool (llama/11736)
5554d5f

Karol Kontny committed on

ggml : optimize and build warning fix for LoongArch (llama/11709)
b82d241

Jinyang He committed on

CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
855a9fe

JohannesGaessler committed on

vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)
3bb9e77

jeffbolznv committed on

CUDA: backwards pass for misc. ops, add tests (llama/11257)
2fbcec1

JohannesGaessler committed on

RoPE: fix back, CUDA support for back + noncont. (llama/11240)
131a21e

JohannesGaessler committed on

ggml-cpu : fix ggml_graph_compute_thread not terminating on abort (ggml/1065)
8e57313

issixx (issi) committed on

ggml : more performance with llamafile tinyblas on x86_64 (llama/10714)
b284406

Djip007 committed on

ggml : fix const usage in SSE path (llama/10962)
38e6172

Diego Devesa committed on

llama : add Qwen2VL support + multimodal RoPE (llama/10361)
219d12b

RzZ, ggerganov committed on

ggml : Fix compilation issues on ARM platform when building without fp16 (llama/10811)
f76ba41

Karol Kontny committed on

remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (llama/10797)
b38cecf

Diego Devesa committed on

ggml : refactor online repacking (llama/10446)
163128e

Djip007, ggerganov committed on

ggml : add predefined list of CPU backend variants to build (llama/10626)
1794b43

Diego Devesa committed on

ggml-cpu : fix HWCAP2_I8MM value (llama/10646)
b3e6ea8

Diego Devesa committed on

ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037)
dd775d5

PABannier committed on

ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
154bbc0

PABannier committed on

ggml : move AMX to the CPU backend (llama/10570)
3732429

Diego Devesa committed on

ggml : fix I8MM Q4_1 scaling factor conversion (llama/10562)
664be9a

ggerganov committed on

ggml : fix row condition for i8mm kernels (llama/10561)
01c713f

ggerganov committed on

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
bf73242

shupeif committed on

ggml : add support for dynamic loading of backends (llama/10469)
b73266f

Diego Devesa, ggerganov committed on

ggml : do not use ARM features not included in the build (llama/10457)
0001327

Diego Devesa committed on

ggml : fix undefined reference to 'getcpu' (llama/10354)
2f9b147

FirstTimeEZ committed on

ggml: new optimization interface (ggml/988)
dd33ace

JohannesGaessler committed on

AVX BF16 and single scale quant optimizations (llama/10212)
e6ffed3

Eve committed on

backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921)
3541ee8

Charles Xu, Diego Devesa committed on

ggml : build backends as libraries (llama/10256)
3dc93f3

Diego Devesa, ggerganov, R0CKSTAR committed on