Commits · Xenobd/whisper.cpp

ggml : add AMX backend (llama/8998)

1152a79

mingfeima committed on Oct 26, 2024

metal : support permuted matrix multiplications (llama/10033)

efb86a3

ggerganov committed on Oct 25, 2024

CUDA: fix insufficient buffer clearing for MMQ (llama/10032)

a41f94c

JohannesGaessler committed on Oct 24, 2024

CUDA: fix MMQ for non-contiguous src0, add tests (llama/10021)

bcbaad3

JohannesGaessler committed on Oct 24, 2024

increase cuda_cpy block size (ggml/996)

60f512e

bssrdf bssrdf committed on Oct 23, 2024

metal : add POOL2D and fix IM2COL (llama/9943)

b553b89

newfrisbie committed on Oct 23, 2024

Adapt to dynamically loadable backends mechanism (llama/9970)

f8d4728

leo-pony committed on Oct 22, 2024

ggml : add asserts for type conversion in fattn kernels (llama/9971)

9542e42

ggerganov committed on Oct 21, 2024

rpc : pack only RPC structs (llama/9959)

6bdbd69

rgerganov committed on Oct 21, 2024

fix mul_mat_vec_q and *_vec_q error (llama/9939)

691e6ac

Neo Zhang Jianyu arthw committed on Oct 21, 2024

rpc : backend refactoring (llama/9912)

b6c412f

rgerganov committed on Oct 18, 2024

Add SYCL Backend registry, device and Event Interfaces (llama/9705)

f35cae5

Ouadie EL FAROUKI committed on Oct 18, 2024

add amx kernel for gemm (llama/8998)

db52137

mingfeima committed on Oct 18, 2024

vulkan : add backend registry / device interfaces (llama/9721)

df2cb6e

Diego Devesa committed on Oct 17, 2024

fix: allocating CPU buffer with size `0` (llama/9917)

ae9a15f

Gilad S committed on Oct 16, 2024

fix: use `vm_allocate` to allocate CPU backend buffer on macOS (llama/9875)

cf75979

Gilad S committed on Oct 16, 2024

CUDA: fix 1D im2col, add tests (ggml/993)

c24f7b1

JohannesGaessler committed on Oct 18, 2024

Fix cann compilation error (llama/9891)

b480790

leo-pony committed on Oct 16, 2024

Vectorize load instructions in dmmv f16 CUDA kernel (llama/9816)

ddb0222

agray3

JohannesGaessler committed on Oct 14, 2024

ggml : move more prints to the ggml log system (llama/9839)

98d1a6a

Diego Devesa committed on Oct 11, 2024

rpc : add backend registry / device interfaces (llama/9812)

4ac768e

Diego Devesa committed on Oct 10, 2024

musa: add docker image support (llama/9685)

553b278

R0CKSTAR committed on Oct 10, 2024

ggml : fix BLAS with unsupported types (llama/9775)

0a93e1b

Diego Devesa committed on Oct 8, 2024

ggml : add backend registry / device interfaces to BLAS backend (llama/9752)

7f269bb

Diego Devesa committed on Oct 7, 2024

Update building for Android (llama/9672)

27e2fca

Andrew Minh Nguyen committed on Oct 7, 2024

ggml : add metal backend registry / device (llama/9713)

b6adf19

ggerganov slaren committed on Oct 7, 2024

metal : single allocation of encode_async block (llama/9747)

6e1b44c

Paul Tsochantaris

ggerganov committed on Oct 7, 2024

ggml-alloc : remove buffer_id from leaf_alloc (ggml/987)

1a776cc

danbev committed on Oct 9, 2024

ggml : alloc ggml_contexts on the heap (#2525)

3ccf40a

ggerganov committed on Oct 31, 2024

vulkan : retry allocation with fallback flags (#2451)

9e91cbc

SRHMorris

fdsffdsafds committed on Oct 6, 2024

metal : zero-init buffer contexts (#0)

d651546

ggerganov committed on Oct 5, 2024

whisper : adapt to latest ggml (skip) (#0)

ad9dd7b

ggerganov committed on Oct 5, 2024

ggml : fix typo in example usage ggml_gallocr_new (ggml/984)

30a097b

danbev committed on Oct 4, 2024

ggml : fixes after sync (ggml/983)

237c05a

Diego Devesa committed on Oct 4, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

9d74d85

Diego Devesa committed on Oct 3, 2024

Fixed dequant precision issues in Q4_1 and Q5_1 (llama/9711)

5239c28

Ouadie EL FAROUKI committed on Oct 3, 2024

ggml-backend : add device and backend reg interfaces (llama/9707)

1bdb50a

Diego Devesa

JohannesGaessler committed on Oct 2, 2024

Initial cmake support of SYCL for AMD GPUs (llama/9658)

7d7ac98

Alberto Cabrera Pérez committed on Oct 2, 2024

vulkan : do not use tensor->extra (llama/9407)

7d66a68

rgerganov

OccamRazor committed on Oct 2, 2024

ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)

52069b8

JohannesGaessler committed on Oct 3, 2024

ggml: refactor cross entropy loss CPU impl. (ggml/976)

2a0805f

JohannesGaessler committed on Oct 2, 2024

metal : reduce command encoding overhead (llama/9698)

43d5a06

ggerganov committed on Oct 2, 2024

test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)

76aa810

JohannesGaessler committed on Sep 30, 2024

vulkan : mul_mat: fix UB with small warps (ggml/952)

d1a29c6

smeso committed on Sep 30, 2024

ggml : fix ggml_cast (ggml/973)

c44d575

stanimirovb committed on Sep 30, 2024

ggml: fix gradient allocation logic (ggml/966)

ad3f29d

JohannesGaessler committed on Sep 29, 2024

ggml : define missing HWCAP flags (llama/9684)

1d52105

ggerganov Willy Tarreau committed on Sep 29, 2024

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

12c0e23

Dan Johansson committed on Sep 28, 2024

Enable use of the ReBAR feature to upload buffers to the device. (llama/9251)

760f8c2

Markus Tavenrath committed on Sep 28, 2024

mtgpu: enable VMM (llama/9597)

e84b4f5

R0CKSTAR committed on Sep 26, 2024
