Commit History
ci: Update build.yml to suppress warnings about node.js versions (#2166)
e9954d9
unverified
Tamotsu Takahashi
committed on
release : v1.6.0
d823237
unverified
whisper : use flash attention (#2152)
27c0a97
unverified
talk-llama : reject runs without required arguments (#2153)
b445508
unverified
sync : ggml
aac57a1
unverified
metal : support FA without mask + add asserts (llama/7278)
98ce302
unverified
ggml : add RPC backend (llama/6829)
5838a14
unverified
rm wait() (llama/7233)
328702a
unverified
Neo Zhang
committed on
CUDA: add FP32 FlashAttention vector kernel (llama/7188)
03d4b22
unverified
scripts : sync ggml-rpc
7b58c58
unverified
whisper : fix model path encoding in windows (#2086)
49f8792
unverified
thewh1teagle
committed on
server : return utf-8 (#2138)
2719aa0
unverified
cmake : fix HIP/ROCm build (#2102)
a90ae59
unverified
aldorof
committed on
node : add additional params (#2000)
933eb40
unverified
valVk
committed on
js : remove un-needed request header from fetchRemote (#2119)
6c54394
unverified
Mark Karpelès
committed on
cmake : fix metal embed sources path (#2110)
087b1a8
unverified
main : dont print timings with --no-prints (#2108)
685d1c1
unverified
Daniel Ziegenberg
committed on
main : add options for temperature control (#2088)
9a3f777
unverified
Daniel Ziegenberg
committed on
whisper : switch back to F32 mask (#0)
3b7b90c
unverified
whisper.android : update example, add field to print timestamp (#2072)
03fb680
unverified
cmake : fix json INTERFACE library (#2069)
0a1cadb
unverified
main : fix double quote escaping in csv output (#2090)
9952a85
unverified
mashizora
committed on
metal : tune soft_max number of threads (#0)
99d668a
whisper : remove old flash attn code (#0)
fd57e47
ggml : try fix ppc64 (#0)
df78c25
ggml : remove oboslete alibi code (skipme) (#0)
d25c1e3
talk-llama : sync llama.cpp
f5f68d6
sync : ggml
3ea4549
ggml : optimize for ppc64le using VSX intrinsics (ggml/784)
05d3824
metal : fix indent (ggml/0)
d4f82d5
ggml : restore sigmoid decl order (ggml/0)
67c5387
ggml : resolve merge (ggml/0)
d692b06
ggml : full ALiBi support (llama/7192)
192bda4
metal : fix flash attention kernel requirements (llama/7169)
6cb3028
Minor arithmetic improvement to mmvq wrapper kernel (llama/7172)
ae75124
Ouadie EL FAROUKI
committed on
Vulkan Bugfixes and Improvements (llama/7084)
8dade62
CUDA: generalize FP16 fattn vec kernel (llama/7061)
ca79691
opencl : alignment size converted from bits to bytes (llama/7090)
2692ce5
Introduction of CUDA Graphs to LLama.cpp (llama/6766)
08fc76d
agray3
slaren
committed on
metal : use `vm_allocate` instead of `posix_memalign` on macOS (llama/7078)
eb910b1
Gilad S
commited on
ggml : introduce bfloat16 support (llama/6412)
81ec961
Justine Tunney
committed on
metal : fix unused warning
24e883a
Add an option to build without CUDA VMM (llama/7067)
38b1143
gguf-split: add --no-tensor-first-split (llama/7072)
b9bc04d
Xuan Son Nguyen
committed on