Commit History
sync : resolve conflicts (ggml/0)
497add0
vulkan: support SET_ROWS (llama/14587)
9821f43
vulkan: optimizations for deepseek prompt processing (llama/14555)
04b631e
model : support LiquidAI LFM2 hybrid family (llama/14620)
07ff90a
Tarek Dakhran
HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (llama/14634)
4354560
Slobodan Josic
opencl: add tiled mul_mat_f16_f32 (llama/14535)
398dc49
opencl: add `set_rows` for `f16` and `f32` (llama/14547)
5e203ec
lhez
SYCL: Initial set_rows kernel implementation (llama/14562)
e62ef85
Akarshan Biswas
cuda : support Falcon-H1 state size for SSM_SCAN (llama/14602)
92b2d32
ggml : add ggml_scale_bias (llama/14417)
573d50a
ggml : prevent integer overflow in gguf tensor size calculation (llama/14595)
31f34e7
vulkan: optimize flash attention split_k_reduce (llama/14554)
45fbb42
vulkan : fix rope with partial rotation and non-cont src (llama/14582)
367fa85
cuda : fix rope with partial rotation and non-cont src (llama/14580)
aaf2d96
CUDA: add bilinear interpolation for upscale (llama/14563)
68ded09
musa: fix build warnings (unused variable) (llama/14561)
891b1d1
CUDA: add bf16 and i32 to getrows (llama/14529)
014494c
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (llama/14485)
effd61f
Eve
Rémy Oudompheng
vulkan: fix rms_norm+mul fusion (llama/14545)
0791e65
vulkan: Handle updated FA dim2/3 definition (llama/14518)
d1e619e
opencl: add GELU_ERF (llama/14476)
b19d736
Sigbjørn Skjæret
metal : disable fast math in all quantize kernels (llama/14528)
df9d510
CANN: Replace aclrtMemsetSync with aclnnInplaceZero operator (llama/14002)
b9b5859
luyhcsu
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922
Sigbjørn Skjæret
opencl : broadcast for soft_max (llama/14510)
4434043
lhez
vulkan: support mixed/deepseekR1 FA head sizes (llama/14509)
90cefa0
ggml: backward pass for split swiglu (llama/14483)
45c8df6
Fix conditional enabling following arch checks for ggml-sycl (llama/14504)
1f15602
Nicolò Scipione
kv-cache : use ggml_set_rows (llama/14285)
7d6d9e8
ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81
CUDA: add dynamic shared mem to softmax, refactor general usage (llama/14497)
8e1f56c
llama : initial Mamba-2 support (llama/9126)
1b4087e
CUDA: add softmax broadcast (llama/14475)
05351ac
CUDA: broadcasting for FlashAttention mask (llama/14500)
47e02a8
vulkan: support softmax/FA batch and broadcast (llama/14449)
f6b0b76
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e
opencl : fix possible buffer overflow in dump_tensor (llama/14490)
deb934d
opencl : skip empty nodes on cgraph compute (llama/14491)
5c36e7c
Eric Zhang
opencl : update upscale to support align corners (llama/14488)
2b95b05
lhez
ggml : Callback before abort (llama/14481)
ccee17d
ci : disable fast-math for Metal GHA CI (llama/14478)
ec4b1b3
CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (llama/14411)
d8d5b0b
Chenguang Li
vulkan: Split large mul_mat_id to fit in shared memory (llama/14451)
bf678f0
add GELU_ERF (llama/14455)
235ebf7
Sigbjørn Skjæret
vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291)
666e65b
vulkan : implement ggml_roll (ggml/1290)
968f9e8
server : add dtw.params for v3-large-turbo (#3307)
1250fd1
accessiblepixel
feat: support vad for addon.node (#3301)
f795870
Lin Xiaodong