Commit History

node : add flash_attn param (#2170)
b4d05df

pprobst committed on

ci: Update build.yml to suppress warnings about node.js versions (#2166)
e9954d9

Tamotsu Takahashi committed on

release : v1.6.0
d823237

ggerganov committed on

whisper : use flash attention (#2152)
27c0a97

ggerganov committed on

talk-llama : reject runs without required arguments (#2153)
b445508

petterreinholdtsen, ggerganov committed on

sync : ggml
aac57a1

ggerganov committed on

metal : support FA without mask + add asserts (llama/7278)
98ce302

ggerganov committed on

ggml : add RPC backend (llama/6829)
5838a14

rgerganov committed on

rm wait() (llama/7233)
328702a

Neo Zhang committed on

CUDA: add FP32 FlashAttention vector kernel (llama/7188)
03d4b22

JohannesGaessler committed on

scripts : sync ggml-rpc
7b58c58

ggerganov committed on

whisper : fix model path encoding in windows (#2086)
49f8792

thewh1teagle committed on

server : return utf-8 (#2138)
2719aa0

ggerganov committed on

node : add audio_ctx and audio buffer params (#2123)
9b4d9d5

pprobst, ggerganov committed on

cmake : fix HIP/ROCm build (#2102)
a90ae59

aldorof committed on

node : add additional params (#2000)
933eb40

valVk committed on

js : remove un-needed request header from fetchRemote (#2119)
6c54394

Mark Karpelès committed on

cmake : fix metal embed sources path (#2110)
087b1a8

ggerganov committed on

main : dont print timings with --no-prints (#2108)
685d1c1

Daniel Ziegenberg committed on

main : add options for temperature control (#2088)
9a3f777

Daniel Ziegenberg committed on

whisper : switch back to F32 mask (#0)
3b7b90c

ggerganov committed on

whisper.android : update example, add field to print timestamp (#2072)
03fb680

codezjx committed on

cmake : fix json INTERFACE library (#2069)
0a1cadb

xcsong committed on

main : fix double quote escaping in csv output (#2090)
9952a85

mashizora committed on

metal : tune soft_max number of threads (#0)
99d668a

ggerganov committed on

whisper : remove old flash attn code (#0)
fd57e47

ggerganov committed on

ggml : try fix ppc64 (#0)
df78c25

ggerganov committed on

ggml : remove oboslete alibi code (skipme) (#0)
d25c1e3

ggerganov committed on

talk-llama : sync llama.cpp
f5f68d6

ggerganov committed on

sync : ggml
3ea4549

ggerganov committed on

ggml : optimize for ppc64le using VSX intrinsics (ggml/784)
05d3824

Hong Bo PENG, ggerganov committed on

metal : fix indent (ggml/0)
d4f82d5

ggerganov committed on

ggml : restore sigmoid decl order (ggml/0)
67c5387

ggerganov committed on

ggml : resolve merge (ggml/0)
d692b06

ggerganov committed on

ggml : full ALiBi support (llama/7192)
192bda4

ggerganov committed on

metal : fix flash attention kernel requirements (llama/7169)
6cb3028

ggerganov committed on

Minor arithmetic improvement to mmvq wrapper kernel (llama/7172)
ae75124

Ouadie EL FAROUKI committed on

Vulkan Bugfixes and Improvements (llama/7084)
8dade62

OccamRazor committed on

CUDA: generalize FP16 fattn vec kernel (llama/7061)
ca79691

JohannesGaessler committed on

opencl : alignment size converted from bits to bytes (llama/7090)
2692ce5

albertjin, Cebtenzzre committed on

Introduction of CUDA Graphs to LLama.cpp (llama/6766)
08fc76d

agray3, slaren committed on

metal : use `vm_allocate` instead of `posix_memalign` on macOS (llama/7078)
eb910b1

Gilad S committed on

ggml : introduce bfloat16 support (llama/6412)
81ec961

Justine Tunney committed on

metal : fix unused warning
24e883a

ggerganov committed on

Add an option to build without CUDA VMM (llama/7067)
38b1143

wtambellini committed on

gguf-split: add --no-tensor-first-split (llama/7072)
b9bc04d

Xuan Son Nguyen committed on

CUDA: CUDART < 11.7 workaround for __hmax, __hmax2 (llama/7019)
4cf786d

JohannesGaessler committed on

switch to using localizedDescription (llama/7010)
fd25ba6

bakkot committed on

metal : remove deprecated error code (llama/7008)
42a84fb

ggerganov committed on

metal : log more info on error (llama/6987)
d4dcef9

bakkot committed on