Commits · Xenobd/whisper.cpp

bindings/go : add linker flags to make metal work (#1944)

3dee0de
unverified

josharian committed on Mar 9, 2024

whisper : make beam candidate sort more stable (#1943)

1316242
unverified

josharian committed on Mar 9, 2024

ggml : try fix 32-bit arm compat (#1938)

6ea3354
unverified

ggerganov HF Staff committed on Mar 8, 2024

talk-llama : use llama_decode instead of llama_eval

301b000
unverified

ggerganov HF Staff committed on Mar 8, 2024

talk-llama : sync llama.cpp

fe602cb
unverified

ggerganov HF Staff committed on Mar 8, 2024

talk-llama : sync llama.cpp

802496b
unverified

ggerganov HF Staff committed on Mar 8, 2024

sync : ggml

bad33a1
unverified

ggerganov HF Staff committed on Mar 8, 2024

Revert "[SYCL] fix error when set main gpu to non-zero (llama/5901)" (llama/5918)

d7e8525
unverified

Neo Zhang Jianyu committed on Mar 7, 2024

fix error when set main gpu to non-zero (llama/5901)

829c347
unverified

Neo Zhang Jianyu committed on Mar 7, 2024

ggml : use SYS_get_cpu if SYS_getcpu is not defined (llama/5906)

909dbdc
unverified

Cebtenzzre committed on Mar 6, 2024

ggml : use `uint8x16_t` return type for `ggml_vqtbl1q_u8` (llama/5894)

7f5bc53
unverified

bobqianic committed on Mar 6, 2024

add wait() to make code stable (llama/5895)

41c3c12
unverified

Neo Zhang Jianyu committed on Mar 6, 2024

quants : use MM256_SET_M128I consistently to fix gcc 7 build (llama/5889)

cdd783a
unverified

Cebtenzzre committed on Mar 5, 2024

Vulkan Improvements (llama/5835)

ea2da45
unverified

OccamRazor committed on Mar 5, 2024

fix mul_mat fault in CI/unit-test (llama/5862)

91bb65e
unverified

Neo Zhang Jianyu

jinliangtao compilade

Cebtenzzre Xuan Son Nguyen

ggerganov HF Staff Kawrakow

ikawrakow

Cebtenzzre Michael Podvitskiy

phymbert github-actions[bot] Nindaleth Black_Fox

iamlemec slaren

dranger003

leejet Minsoo Cheong Dane Madsen hutli

emozilla committed on Mar 5, 2024

ggml : fix unknown status (llama/0)

394e5d8
unverified

ggerganov HF Staff committed on Mar 4, 2024

whisper : fix compute helper return (ggml/750)

b60b7f7
unverified

ggerganov HF Staff committed on Mar 5, 2024

ggml : introduce ggml_status (ggml/750)

151c676
unverified

Michael Podvitskiy slaren

ggerganov HF Staff committed on Mar 4, 2024

cuda : fix data race in soft max (llama/5853)

d1b60e4
unverified

slaren committed on Mar 3, 2024

ggml : fix IQ3_S AVX implementation (llama/5834)

98e5c63
unverified

ggerganov HF Staff committed on Mar 2, 2024

ggml : IQ3_S improvements (llama/5829)

06a8e30
unverified

Kawrakow

ikawrakow committed on Mar 2, 2024

Support multiple GPUs (split mode) on SYCL backend (llama/5806)

b1865d2
unverified

Neo Zhang Jianyu committed on Mar 2, 2024

ggml-vulkan: fix VULKAN_CHECK_RESULTS flag, which was previously broken (llama/5813)

472195f
unverified

ddpasa committed on Mar 1, 2024

Use batched mul_mat pathway (llama/5591)

4a30367
unverified

AidanBeltonS Abhilash Majumder committed on Mar 1, 2024

make portability_enumeration_ext apple only (llama/5757)

c164918
unverified

Eve committed on Feb 28, 2024

add some new ops, fix some operators and add batch operations to certain operators. (ggml/747)

dd8e3f9
unverified

leejet

ggerganov HF Staff slaren committed on Mar 3, 2024

examples : Auto lowercase language parameter in main.cpp (#1928)

98b861a
unverified

F1L1P bobqianic committed on Mar 6, 2024

examples : fix typo in bench.cpp (#1933)

8efe1fd
unverified

zhouwg committed on Mar 6, 2024

whisper : fix typo (#1925)

a0acef0
unverified

zhouwg committed on Mar 5, 2024

whisper.android.java : fix returns in JNI (#1929)

c1b258d
unverified

zhouwg committed on Mar 5, 2024

cmake : add library versioning (#1352)

3ab7ee7
unverified

kennethge

ggerganov HF Staff committed on Mar 4, 2024

readme : recommend MacOS Sonoma for Core ML (#1917)

c4e849b
unverified

Gavin Cai committed on Mar 4, 2024

talk-llama : sync llama.cpp

06c222c
unverified

ggerganov HF Staff committed on Feb 28, 2024

sync : ggml

b85f30e
unverified

ggerganov HF Staff committed on Feb 28, 2024

sync : llama.cpp (ggml/0)

8ea3a45
unverified

ggerganov HF Staff committed on Feb 28, 2024

ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (llama/5760)

9a07f42
unverified

Kawrakow

ikawrakow committed on Feb 28, 2024

Attempt to fix android build (llama/5752)

e720b3b
unverified

Kawrakow

ikawrakow committed on Feb 27, 2024

IQ4_XS: a 4.25 bpw quantization (llama/5747)

0ee1bfb
unverified

Kawrakow

ikawrakow committed on Feb 27, 2024

cuda : replace remaining shfl_xor with calls to warp_reduce functions (llama/5744)

753b30d
unverified

Engininja2 committed on Feb 27, 2024

ggml-quants : fix avx2 iq1_s vec_dot when compiled with gcc (llama/5742)

72e8610
unverified

Engininja2 committed on Feb 27, 2024

Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721)

2b9bb9e
unverified

Kawrakow

ikawrakow

ggerganov HF Staff committed on Feb 26, 2024

CUDA: fix DEBUG_CUDA_MALLOC (llama/5729)

f18f386
unverified

JohannesGaessler committed on Feb 26, 2024

Add support for soft_max ALiBi (llama/5639)

86d6a5e
unverified

AidanBeltonS Abhilash Majumder committed on Feb 26, 2024

ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility (llama/5711)

430efc6
unverified

Crad committed on Feb 25, 2024

add google magika inference example (ggml/748)

10ac4bb
unverified

slaren committed on Feb 25, 2024

stream.wasm : fix invalid memory access when no segments (#1902)

3273767
unverified

Andrew S committed on Feb 26, 2024

talk-llama : sync llama.cpp

b92d757
unverified

ggerganov HF Staff committed on Feb 25, 2024

sync : ggml

3eb6cbf
unverified

ggerganov HF Staff committed on Feb 25, 2024

sync : llama.cpp (ggml/0)

6c535a8
unverified

ggerganov HF Staff committed on Feb 25, 2024

code : normalize enum names (llama/5697)

93e0830
unverified

ggerganov HF Staff committed on Feb 25, 2024
