Commit History
608b377 · ggml : fix quantized cpy op (llama/12310)
727de7e · llama: Add support for RWKV v7 architecture (llama/12412)
0db3249 · examples : command.wasm updates (#2904)
05466a9 · ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154) · Rémy O
c9a49f9 · ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118) · vmobilis
49e3343 · ggml : portability fixes for VS 2017 (llama/12150) · Marcus Groeber (mgroeber9110)
67e8c32 · cuda/cpu: Increase support for fp16 unary operations (ggml/1125) · cmdr2
2b94a24 · Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121) · cmdr2
4aa54ec · ggml-cpu: Support s390x SIMD Instruction Set (llama/12019) · Aaron Teo, Jinyang He, junchao-zhao
9de6d81 · ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390) · Charles Xu
9705bb5 · repo : update links to new url (llama/11886)
e59d9a7 · ggml-cpu : add chunking support to mul_mat_id (llama/11666) · Diego Devesa
1b1d6a8 · ggml : fix multi-threaded clamp_f32 (llama/11824) · Richard
5554d5f · ggml: Fix data race in ggml threadpool (llama/11736) · Karol Kontny
b82d241 · ggml : optimize and build warning fix for LoongArch (llama/11709) · Jinyang He
855a9fe · CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
3bb9e77 · vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)
2fbcec1 · CUDA: backwards pass for misc. ops, add tests (llama/11257)
131a21e · RoPE: fix back, CUDA support for back + noncont. (llama/11240)
8e57313 · ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. (ggml/1065) · issi (issixx)
b284406 · ggml : more perfo with llamafile tinyblas on x86_64 (llama/10714) · Djip007
38e6172 · ggml : fix const usage in SSE path (llama/10962) · Diego Devesa
f76ba41 · ggml : Fix compilation issues on ARM platform when building without fp16 (llama/10811) · Karol Kontny
b38cecf · remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (llama/10797) · Diego Devesa
163128e · ggml : refactor online repacking (llama/10446)
1794b43 · ggml : add predefined list of CPU backend variants to build (llama/10626) · Diego Devesa
b3e6ea8 · ggml-cpu : fix HWCAP2_I8MM value (llama/10646) · Diego Devesa
dd775d5 · ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037)
154bbc0 · ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
3732429 · ggml : move AMX to the CPU backend (llama/10570) · Diego Devesa
664be9a · ggml : fix I8MM Q4_1 scaling factor conversion (llama/10562)
01c713f · ggml : fix row condition for i8mm kernels (llama/10561)
bf73242 · ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
b73266f · ggml : add support for dynamic loading of backends (llama/10469)
0001327 · ggml : do not use ARM features not included in the build (llama/10457) · Diego Devesa
2f9b147 · ggml : fix undefined reference to 'getcpu' (llama/10354) · FirstTimeEZ
dd33ace · ggml: new optimization interface (ggml/988)
e6ffed3 · AVX BF16 and single scale quant optimizations (llama/10212) · Eve
3541ee8 · backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921) · Charles Xu, Diego Devesa