Commit History
talk-llama : sync llama.cpp 542accf
make : fix CUBLAS link with WSL (#1878) b3c9e81 (LBlue)
sync : ggml cb5b2be
ggml : resolve merge conflicts (ggml/0) 7ee6ffa
common : add IQ1_S (ggml/0) 39c054e
ci : enable -Werror for CUDA builds (llama/5579) df03a10
cuda, metal : fix nans in soft_max (llama/5574) 44164ac
ggml : android and old glibc NUMA incompatibility bugfixes (llama/5557) 0206c2d
ggml : restore vec dot stride arg names (llama/5453) de4041f
ci : fix wikitext url + compile warnings (llama/5569) 49f0106
metal : fix unused warnings (llama/0) d12cda5
ggml, common, examples, tests : fixed type arguments in printf (llama/5528) 2f3a004
1.5 bit quantization (llama/5453) 9c3aa6a
ggml : add ALiBi support for ggml_soft_max_ext (llama/5488) 26c019a
cmake : fix VULKAN and ROCm builds (llama/5525) ae570e4
ggml : add numa options (llama/5377) 7c952d2
cuda : print message when initialization fails (llama/5512) 1f047ca (slaren)
vulkan: Find optimal memory type but with fallback (llama/5381) 24e2319
Early return for zero size calls to get_tensor. (llama/5482) f1f5c00
ggml-quants : fix compiler warnings (shadow variable) (llama/5472) e538f25
ggml-sycl: Replace 3d ops with macro (llama/5458) 12970f1 (Abhilash Majumder)
build : update CBLAS flags + fix unused var warning (#0) 496c0f1
main : check if input files exist before proceeding (#1872) d625238
examples : clean up common code (#1871) da3cdf4
models : fix openvino setup info (#1874) 7d4b654 (Jumper775)
models : add update py requirements a60f965
swift : package no longer use ggml dependency (#1861) df6227e
whisper : fix external encoder (#1860) 3538ca9
sync : ggml f0a0087
ggml-alloc : allocate all leafs as if they were inputs (ggml/731) a512417 (slaren)
talk-llama : sync llama.cpp aa42df9
sync : ggml be7d266
ggml-backend : sync remnant 3f5165f
CUDA: mul_mat_vec_q tiling, refactor mul mat logic (llama/5434) c0cfa9b
vulkan: only use M-sized matmul on Apple GPUs (llama/5412) 350284e (Sergio López)
ggml : fix compile warnings (unused vars) (llama/4966) 97fa2e3
ggml : add mmla kernels for quantized GEMM (llama/4966) 0d50a29 (snadampal)
metal : use autoreleasepool to avoid memory leaks (llama/5437) c276f12
ggml-alloc : v3 (ggml/727) 5cffd6f (slaren)
examples : added audio_ctx argument to main and server (#1857) 469988b
metal : option to embed MSL source into compiled binary (#1842) a46b62a (Didzis Gosko)
examples : initialize context params properly (#1852) 3443ee7
talk-llama : sync llama.cpp e6d6e1d
sync : ggml 94800c5
src : relocate new backend sources 44cd2d4
ggml : fix `error C2078: too many initializers` for MSVC ARM64 (llama/5404) 8ebb36c (Michael Podvitskiy)