go : add beamsize/entropythold/maxcontext to context interface (#2350) 7efcda7 hsinhoyeh committed on Aug 28, 2024
ggml : do not crash when quantizing q4_x_x with an imatrix (llama/9192) d64f932 slaren committed on Aug 26, 2024
metal : separate scale and mask from QKT in FA kernel (llama/9189) 90cc3cd ggerganov committed on Aug 26, 2024
CPU/CUDA: Gemma 2 FlashAttention support (llama/8542) fb8ae8b JohannesGaessler committed on Aug 24, 2024
llama : simplify Mamba with advanced batch splits (llama/8526) f1abcb4 compilade ggerganov committed on Aug 21, 2024
Fix SYCL `im2col` and `convert` Overflow with Large Dims (llama/9052) 5f43886 zhentaoyu committed on Aug 20, 2024
rpc : print error message when failed to connect endpoint (llama/9042) d54b156 rgerganov committed on Aug 19, 2024
ggml : dynamic ggml_sched_max_splits based on graph_size (llama/9047) e0dc1ad nicoboss committed on Aug 16, 2024
Optimize Vulkan backend for better CPU performance and less GPU synchronization overhead. (llama/8943) 11bc9e6 Markus Tavenrath OccamRazor committed on Aug 11, 2024
feat: ref. cross entropy, add CUDA, fix grad test (ggml/929) e1e87a3 JohannesGaessler committed on Aug 27, 2024
models : add support for wget2 for fedora (#2387) 0653499 Brad Murray committed on Aug 28, 2024
readme : fix broken links in implementation details section (#2382) 4863dee stormofice committed on Aug 28, 2024
whisper : fix compile warning for unused params 0e05e03 ggerganov committed on Aug 28, 2024
examples : use colorblind friendly TTY color scheme (#2360) 09303a2 Justine Tunney committed on Aug 20, 2024
ggml : support forward pass broadcasting in ggml_sub (ggml/914) 0af2d37 smeso committed on Aug 11, 2024
metal : fix uninitialized abort_callback (llama/8968) f971b60 slaren committed on Aug 10, 2024
rpc : sanitize tensor data + warnings (llama/0) 87d58fe ggerganov slaren committed on Aug 9, 2024