Commit History
scripts : sync-ggml-am.sh add option to skip commits
c34dd82
unverified
talk-llama : sync llama.cpp
b9d2bd9
unverified
sync : ggml
18bfc83
unverified
examples : adapt to metal API
b65decb
unverified
ggml: cache sin/cos for RoPE (llama/4908)
c315fbf
unverified
metal : remove old API (llama/4919)
d6abb6a
unverified
metal : disable log for loaded kernels (llama/4794)
2305485
unverified
gguf : fix potential infinite for-loop (llama/4600)
0e93179
unverified
texmex76
Bernhard Gstrein
committed on
metal : refactor kernel loading code (llama/4794)
53e6bf8
unverified
CUDA: faster q8_0 -> f16 dequantization (llama/4895)
0a1a178
unverified
talk-llama : add optional CLI arg to set the bot name (#1764)
63c8089
unverified
RhinoDevel
committed on
examples : add python example for transcription (#1744)
d600e4c
unverified
whisper : load the model into multiple buffers of max size 1GB (#1763)
0e9101f
unverified
talk-llama : sync llama.cpp
75c5f9c
unverified
sync : ggml
2ed0a44
unverified
backend_sched : fix assignments
cb91db5
unverified
slaren
committed on
llama : ggml-backend integration (llama/4766)
362430b
unverified
CUDA: fix softmax compile for old CUDA versions (llama/4862)
5eda533
unverified
models : make all scripts to be POSIX Compliant (#1725)
f7aef3e
unverified
ggml : fix 32-bit ARM compat for IQ2_XS (#1758)
d5836c9
unverified
go : add SetInitialPrompt method to bindings (#1753)
5fd6678
unverified
server : add more parameters to server api (#1754)
cb0cf7b
unverified
George Hindle
committed on
whisper : fix segment length with params.no_timestamps == true
720d738
unverified
params : don't compute timestamps when not printing them (#1755)
251825e
unverified
George Hindle
committed on
talk-llama : sync llama.cpp
f33490f
unverified
swift : remove local ggml.h reference
98b68e8
unverified
swift : track ggml release branch
ece2b9d
unverified
sync : ggml
9af4c11
unverified
sync : llama.cpp
569565f
unverified
ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856)
5e827d5
unverified
metal : put encoder debug group behind a define (llama/4873)
6e822b8
unverified
Paul Tsochantaris
committed on
metal : improve dequantize precision to match CPU (llama/4836)
f2da2a4
unverified
ggml : fix vld1q_s8_x4 32-bit compat (llama/4828)
efed5ba
unverified
CUDA: faster softmax via shared memory + fp16 math (llama/4742)
52c45b9
unverified
metal : fix deprecation warning (ggml/690)
b1e29bc
unverified
ggml : remove ggml_cpy_inplace and ggml_cont_inplace (ggml/693)
6469bfe
unverified
Timothy Cronin
committed on
metal : wrap each operation in debug group (ggml/690)
b5e360f
unverified
Jack Mousseau
committed on
ggml : change GGML_MAX_NAME at compile time (ggml/682)
ded2b1a
unverified
Fix execlp call (ggml/689)
abda16e
unverified
Halalaluyafail3
committed on
SOTA 2-bit quants (llama/4773)
75de5bf
unverified
CUDA: fixed redundant value dequantization (llama/4809)
70c8d60
unverified
ggml : use __builtin_amdgcn_sudot4 in __dp4a for gfx11 (llama/4787)
f391d7a
unverified
Konstantin Zhuravlyov
committed on
ggml : do not sched_yield when calling BLAS (llama/4761)
5d1dffc
unverified
ggml : include stdlib.h before intrin.h (llama/4736)
743cace
unverified
swift : checkout ggml commit instead of branch (#1750)
6ab88cc
unverified
Alexandru Mariuti
committed on
talk-llama : add optional Piper TTS support (#1749)
fb92e62
unverified
RhinoDevel
committed on