Commits · Xenobd/whisper.cpp

test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)

76aa810

JohannesGaessler committed on Sep 30, 2024

vulkan : mul_mat: fix UB with small warps (ggml/952)

d1a29c6

smeso committed on Sep 30, 2024

ggml : fix ggml_cast (ggml/973)

c44d575

stanimirovb committed on Sep 30, 2024

ggml: fix gradient allocation logic (ggml/966)

ad3f29d

JohannesGaessler committed on Sep 29, 2024

ggml : define missing HWCAP flags (llama/9684)

1d52105

ggerganov and Willy Tarreau committed on Sep 29, 2024

ggml : add run-time detection of neon, i8mm and sve (llama/9331)

12c0e23

Dan Johansson committed on Sep 28, 2024

Enable use to the rebar feature to upload buffers to the device. (llama/9251)

760f8c2

Markus Tavenrath committed on Sep 28, 2024

mtgpu: enable VMM (llama/9597)

e84b4f5

R0CKSTAR committed on Sep 26, 2024

ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels (llama/9217)

50395aa

Charles Xu committed on Sep 25, 2024

cann: fix crash when llama-bench is running on multiple cann devices (llama/9627)

068c697

dou112 committed on Sep 25, 2024

CUDA: remove bad assert (ggml/972)

91954a7

JohannesGaessler committed on Sep 29, 2024

vulkan : multithread pipeline creation (ggml/963)

ba60f98

jeffbolznv committed on Sep 29, 2024

vulkan : fix build for GGML_VULKAN_RUN_TESTS, add TFLOPS to log (ggml/961)

85e2387

jeffbolznv committed on Sep 27, 2024

vulkan : argsort barriers must be under uniform control flow (ggml/951)

b2602d7

smeso committed on Sep 26, 2024

ggml : fix GGML_MAX_N_THREADS + improve formatting (ggml/969)

ad34655

ggerganov committed on Sep 24, 2024

server : ffmpeg overwrite leftover temp file (#2431)

2dafb8e
unverified

dynafire committed on Oct 2, 2024

whisper : add large-v3-turbo (#2440)

f3283ba
unverified

ggerganov committed on Oct 1, 2024

tests : remove test-backend-ops (#2434)

050ba38
unverified

ggerganov committed on Sep 27, 2024

ci : disable failing CUDA and Java builds

ecef312
unverified

ggerganov committed on Sep 25, 2024

readme : fix references to download-ggml-model.sh (#2427)

3d92452
unverified

Hugo committed on Sep 24, 2024

make : remove "talk" target until updated

5fb8fce

ggerganov committed on Sep 24, 2024

ggml : add ggml-cpu-impl.h (skip) (#0)

958f2d3

ggerganov committed on Sep 24, 2024

sync : ggml

e22e2f8

ggerganov committed on Sep 24, 2024

talk-llama : sync llama.cpp

f91f98d

ggerganov committed on Sep 24, 2024

ggml : add AVX512DQ requirement for AVX512 builds (llama/9622)

14b5848

Eric Zhang committed on Sep 24, 2024

log : add CONT level for continuing previous log entry (llama/9610)

a29a4c5

ggerganov committed on Sep 24, 2024

threads: fix msvc build without openmp (llama/9615)

97b3eb5

Max Krasnyansky committed on Sep 24, 2024

cuda: add q8_0->f32 cpy operation (llama/9571)

6201c74

Nekotekina committed on Sep 24, 2024

threads: improve ggml_barrier scaling with large number of threads (llama/9598)

aca04d5

Max Krasnyansky committed on Sep 23, 2024

ggml : AVX512 gemm for Q4_0_8_8 (llama/9532)

7349efc

Srihari-mcw and ggerganov committed on Sep 23, 2024

metal : use F32 prec for K*Q in vec FA (llama/9595)

99c4239

ggerganov committed on Sep 23, 2024

Revert "[SYCL] fallback mmvq (ggml/9088)" (llama/9579)

5aceb3d

Akarshan Biswas committed on Sep 23, 2024

musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (llama/9526)

8ec75c3

R0CKSTAR committed on Sep 22, 2024

Fix merge error in #9454 (llama/9589)

3142fa9

mollysama committed on Sep 22, 2024

CUDA: enable Gemma FA for HIP/Pascal (llama/9581)

97cb7ce

JohannesGaessler committed on Sep 22, 2024

RWKV v6: RWKV_WKV op CUDA implementation (llama/9454)

8d3e707

mollysama committed on Sep 22, 2024

ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG (llama/9573)

673df39

slaren committed on Sep 21, 2024

Update CUDA graph on scale change plus clear nodes/params (llama/9550)

6b63eb1

agray3 committed on Sep 21, 2024

examples : adapt to ggml.h changes (ggml/0)

91c7734

ggerganov committed on Sep 20, 2024

ggml : refactoring (llama/#0)

1b62c96

ggerganov committed on Sep 20, 2024

ggml : fix builds (llama/0)

524a01b

ggerganov committed on Sep 20, 2024

ggml : fix trailing whitespace (llama/0)

214f95e

ggerganov committed on Sep 20, 2024

CUDA: fix sum.cu compilation for CUDA < 11.7 (llama/9562)

b305ecf

JohannesGaessler committed on Sep 20, 2024

ggml : fix n_threads_cur initialization with one thread (llama/9538)

af82b69

slaren and Max Krasnyansky committed on Sep 18, 2024

threadpool : skip polling for unused threads (llama/9461)

9d11a7a

Max Krasnyansky committed on Sep 17, 2024

ggml : link MATH_LIBRARY not by its full path (llama/9339)

07d57ec

Michael Podvitskiy committed on Sep 16, 2024

cmake : do not hide GGML options + rename option (llama/9465)

8c32d36

ggerganov committed on Sep 16, 2024

ggml : IQ4_NL sgemm + Q4_0 AVX optimization (llama/9422)

f2986f6

Eve committed on Sep 16, 2024

metal : handle zero-sized allocs (llama/9466)

868283e

ggerganov committed on Sep 16, 2024

common : reimplement logging (llama/9418)

e893c97

ggerganov committed on Sep 15, 2024
