Commit History
opencl : alignment size converted from bits to bytes (llama/7090) 2692ce5
Introduction of CUDA Graphs to LLama.cpp (llama/6766) 08fc76d
agray3 slaren committed on
metal : use `vm_allocate` instead of `posix_memalign` on macOS (llama/7078) eb910b1
Gilad S committed on
ggml : introduce bfloat16 support (llama/6412) 81ec961
Justine Tunney committed on
metal : fix unused warning 24e883a
Add an option to build without CUDA VMM (llama/7067) 38b1143
gguf-split: add --no-tensor-first-split (llama/7072) b9bc04d
Xuan Son Nguyen committed on
CUDA: CUDART < 11.7 workaround for __hmax, __hmax2 (llama/7019) 4cf786d
switch to using localizedDescription (llama/7010) fd25ba6
metal : remove deprecated error code (llama/7008) 42a84fb
metal : log more info on error (llama/6987) d4dcef9
ggml : add Flash Attention (llama/5021) 34d3b03
ggml : fix __MSC_VER -> _MSC_VER (llama/6977) a83f2ae
Fix more int overflow during quant (PPL/CUDA). (llama/6563) 531387f
gguf : enforce that tensor names are unique (llama/6905) 22e446d
Xuan Son Nguyen slaren committed on
add device version in device list (llama/6959) c022e9a
Neo Zhang arthw committed on
Reset schedule earlier to allow overlap with ggml graph computation on device (llama/6933) 3a8eea8
agray3 committed on
add basic tensor data validation function (llama/6884) 71e001c
slaren committed on
gguf : fix mismatch between alloc and free functions (llama/6929) d8fb433
slaren committed on
Merge pull request from GHSA-p5mv-gjc5-mwqv 72b368d
ggml : fix redefinition of vaddvq_f32 for 32-bit ARM (llama/6906) f900de6
ggml : fix MIN / MAX macros (llama/6904) a1c0e2a
ggml : move 32-bit arm compat in ggml-impl.h (llama/6865) 7343760
llamafile : improve sgemm.cpp (llama/6796) bfe2a5f
Justine Tunney committed on
ggml : fix calloc argument ordering. (llama/6820) 12af87c
Dave Airlie committed on
ggml : fix ggml_backend_cpu_supports_op() for CPY (llama/0) d645791
ggml : group all experts in a single ggml_mul_mat_id (llama/6505) f0b5c67
ggml : fix llamafile sgemm wdata offsets (llama/6710) 5e756db
ggml : add llamafile sgemm (llama/6414) 093eec4
Justine Tunney committed on
llama : add qwen2moe (llama/6074) daae175
fix mul_mat_id() for new input, make the ut pass (llama/6682) 6d1ba81
Neo Zhang Jianyu committed on
Added support for GGML_OP_CLAMP in Metal (llama/6662) a06cbc7
Dave dave-fl committed on
fix memcpy() crash, add missed cmd in guide, fix softmax (llama/6622) 6901743
Neo Zhang Jianyu committed on
CUDA: fix matrix multiplication logic for tests (llama/6667) 6ccb5a5
metal : unify mul_mv_id kernels (llama/6556) e9910b5
slaren committed on
llama : add gguf_remove_key + remove split meta during quantize (llama/6591) 1706870
jiez z5269887 committed on
feat: implemented sigmoid function (ggml/806) cd0c122
Justina Cho committed on
build: fix and ignore msvc warnings (ggml/805) c40b574
ggml : expose SSE3 and SSSE3 for MSVC when AVX is available (#2128) 340b9ae unverified
Przemysław Pawełczyk committed on
build : improve disabling AVX-512 (#2129) dd6f1ab unverified
Przemysław Pawełczyk committed on
minor: add CMakeSettings.json to gitignore (#2094) a361a80 unverified
examples : fix node compilation (#2115) 071e466 unverified
make : change GNU make default CXX from g++ to c++ (#2100) 610f480 unverified
Przemysław Pawełczyk committed on
Remove unnecessary memory reallocation in fft (#2080) 3198674 unverified
goldwaving committed on
models : disable old script (#2079) dfabe35 unverified
whisper : more prominent log message for sub-1s audio (#2065) 5ddb20b unverified
main : pass nullptr when regex is empty (#2070) 8677fc4 unverified
readme : add up-to-date repository for Python bindings (#2063) f573a31 unverified
AIWintermuteAI committed on