Commits · Xenobd/whisper.cpp

cmake : make libwhisper.so position independent (#1792)

1cf1553
unverified

trixirt committed on Jan 22, 2024

cmake : temporary remove VLA check (#1795)

1a32e6f
unverified

ggerganov committed on Jan 22, 2024

whisper.android : return output from benchmarks (#1785)

5cff61b
unverified

lcfrs committed on Jan 19, 2024

server : implement "verbose_json" format with token details (#1781)

d6e13b6
unverified

rmmh committed on Jan 18, 2024

ggml : sync ggml-metal.m

b4085c3
unverified

ggerganov committed on Jan 18, 2024

sync : llama.cpp

5de718a
unverified

ggerganov committed on Jan 17, 2024

sync : ggml

34bdd70
unverified

ggerganov committed on Jan 17, 2024

ggml : add IQ2 to test-backend-ops + refactoring (llama/4990)

227f2ae
unverified

ggerganov committed on Jan 17, 2024

imatrix : offload to GPU support (llama/4957)

6490f98
unverified

ggerganov committed on Jan 17, 2024

backend : add eval callback (llama/4935)

3cc64d6
unverified

ggerganov committed on Jan 17, 2024

metal : create autorelease pool during library build (llama/4970)

9027276
unverified

ggerganov committed on Jan 17, 2024

ggml : importance matrix support for legacy quants (llama/4969)

d8bb9d8
unverified

Kawrakow

ikawrakow committed on Jan 16, 2024

metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936)

e2cc0e5
unverified

azarovalex

ggerganov committed on Jan 16, 2024

ggml : introduce GGML_CALL function annotation (llama/4850)

7815f68
unverified

jartine committed on Jan 16, 2024

cuda : fix dequantize kernel names (llama/4938)

95f6502
unverified

ggerganov committed on Jan 15, 2024

CUDA: faster dequantize kernels for Q4_0 and Q4_1 (llama/4938)

73c6598
unverified

Kawrakow

ikawrakow committed on Jan 15, 2024

Add ability to use importance matrix for all k-quants (llama/4930)

7032309
unverified

Kawrakow

ikawrakow committed on Jan 14, 2024

talk-llama : optional wake-up command and audio confirmation (#1765)

542e8da
unverified

rakksor committed on Jan 16, 2024

server : fix building and simplify lib deps on Windows (#1772)

f928f33
unverified

Przemysław Pawełczyk committed on Jan 15, 2024

talk-llama : sync llama.cpp

62ad8e0
unverified

ggerganov committed on Jan 14, 2024

talk-llama : llama.cpp

d128cb3
unverified

ggerganov committed on Jan 14, 2024

sync : ggml

6a472b5
unverified

ggerganov committed on Jan 14, 2024

metal : correctly set SIMD support flags on iOS (llama/4923)

1cf2fa9
unverified

azarovalex committed on Jan 14, 2024

2-bit quantizations (llama/4897)

8a399ab
unverified

Kawrakow

ikawrakow committed on Jan 14, 2024

scripts : sync-ggml-am.sh add option to skip commits

c34dd82
unverified

ggerganov committed on Jan 14, 2024

talk-llama : sync llama.cpp

b9d2bd9
unverified

ggerganov committed on Jan 13, 2024

sync : ggml

18bfc83
unverified

ggerganov committed on Jan 13, 2024

examples : adapt to metal API

b65decb
unverified

ggerganov committed on Jan 13, 2024

ggml: cache sin/cos for RoPE (llama/4908)

c315fbf
unverified

JohannesGaessler committed on Jan 13, 2024

metal : remove old API (llama/4919)

d6abb6a
unverified

ggerganov committed on Jan 13, 2024

metal : disable log for loaded kernels (llama/4794)

2305485
unverified

ggerganov committed on Jan 13, 2024

gguf : fix potential infinite for-loop (llama/4600)

0e93179
unverified

texmex76 Bernhard Gstrein committed on Jan 13, 2024

metal : refactor kernel loading code (llama/4794)

53e6bf8
unverified

ggerganov committed on Jan 13, 2024

CUDA: faster q8_0 -> f16 dequantization (llama/4895)

0a1a178
unverified

JohannesGaessler committed on Jan 12, 2024

talk-llama : add optional CLI arg to set the bot name (#1764)

63c8089
unverified

RhinoDevel committed on Jan 13, 2024

examples : add python example for transcription (#1744)

d600e4c
unverified

contractorwolf committed on Jan 13, 2024

whisper : load the model into multiple buffers of max size 1GB (#1763)

0e9101f
unverified

ggerganov committed on Jan 13, 2024

talk-llama : sync llama.cpp

75c5f9c
unverified

ggerganov committed on Jan 12, 2024

sync : ggml

2ed0a44
unverified

ggerganov committed on Jan 12, 2024

backend_sched : fix assignments

cb91db5
unverified

slaren committed on Jan 12, 2024

llama : ggml-backend integration (llama/4766)

362430b
unverified

slaren

ggerganov

JohannesGaessler committed on Jan 12, 2024

CUDA: fix softmax compile for old CUDA versions (llama/4862)

5eda533
unverified

JohannesGaessler committed on Jan 12, 2024

Importance Matrix calculation (llama/4861)

c0b17f1
unverified

Kawrakow

ikawrakow

ggerganov committed on Jan 12, 2024

models : make all scripts to be POSIX Compliant (#1725)

f7aef3e
unverified

sonphantrung committed on Jan 12, 2024

ggml : fix 32-bit ARM compat for IQ2_XS (#1758)

d5836c9
unverified

ggerganov committed on Jan 12, 2024

go : add SetInitialPrompt method to bindings (#1753)

5fd6678
unverified

blib321 committed on Jan 12, 2024

server : add more parameters to server api (#1754)

cb0cf7b
unverified

George Hindle committed on Jan 12, 2024

whisper : fix segment length with params.no_timestamps == true

720d738
unverified

ggerganov committed on Jan 12, 2024

params : don't compute timestamps when not printing them (#1755)

251825e
unverified

George Hindle committed on Jan 12, 2024

talk-llama : sync llama.cpp

f33490f
unverified

ggerganov committed on Jan 11, 2024
