Commit History

cmake : make libwhisper.so position independent (#1792)
1cf1553
unverified

trixirt committed on

cmake : temporary remove VLA check (#1795)
1a32e6f
unverified

ggerganov committed on

whisper.android : return output from benchmarks (#1785)
5cff61b
unverified

lcfrs committed on

server : implement "verbose_json" format with token details (#1781)
d6e13b6
unverified

rmmh committed on

ggml : sync ggml-metal.m
b4085c3
unverified

ggerganov committed on

sync : llama.cpp
5de718a
unverified

ggerganov committed on

sync : ggml
34bdd70
unverified

ggerganov committed on

ggml : add IQ2 to test-backend-ops + refactoring (llama/4990)
227f2ae
unverified

ggerganov committed on

imatrix : offload to GPU support (llama/4957)
6490f98
unverified

ggerganov committed on

backend : add eval callback (llama/4935)
3cc64d6
unverified

ggerganov committed on

metal : create autorelease pool during library build (llama/4970)
9027276
unverified

ggerganov committed on

ggml : importance matrix support for legacy quants (llama/4969)
d8bb9d8
unverified

Kawrakow ikawrakow committed on

metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936)
e2cc0e5
unverified

azarovalex ggerganov committed on

ggml : introduce GGML_CALL function annotation (llama/4850)
7815f68
unverified

jartine committed on

cuda : fix dequantize kernel names (llama/4938)
95f6502
unverified

ggerganov committed on

CUDA: faster dequantize kernels for Q4_0 and Q4_1 (llama/4938)
73c6598
unverified

Kawrakow ikawrakow committed on

Add ability to use importance matrix for all k-quants (llama/4930)
7032309
unverified

Kawrakow ikawrakow committed on

talk-llama : optional wake-up command and audio confirmation (#1765)
542e8da
unverified

rakksor committed on

server : fix building and simplify lib deps on Windows (#1772)
f928f33
unverified

Przemysław Pawełczyk committed on

talk-llama : sync llama.cpp
62ad8e0
unverified

ggerganov committed on

talk-llama : llama.cpp
d128cb3
unverified

ggerganov committed on

sync : ggml
6a472b5
unverified

ggerganov committed on

metal : correctly set SIMD support flags on iOS (llama/4923)
1cf2fa9
unverified

azarovalex committed on

2-bit quantizations (llama/4897)
8a399ab
unverified

Kawrakow ikawrakow committed on

scripts : sync-ggml-am.sh add option to skip commits
c34dd82
unverified

ggerganov committed on

talk-llama : sync llama.cpp
b9d2bd9
unverified

ggerganov committed on

sync : ggml
18bfc83
unverified

ggerganov committed on

examples : adapt to metal API
b65decb
unverified

ggerganov committed on

ggml: cache sin/cos for RoPE (llama/4908)
c315fbf
unverified

JohannesGaessler committed on

metal : remove old API (llama/4919)
d6abb6a
unverified

ggerganov committed on

metal : disable log for loaded kernels (llama/4794)
2305485
unverified

ggerganov committed on

gguf : fix potential infinite for-loop (llama/4600)
0e93179
unverified

texmex76 Bernhard Gstrein committed on

metal : refactor kernel loading code (llama/4794)
53e6bf8
unverified

ggerganov committed on

CUDA: faster q8_0 -> f16 dequantization (llama/4895)
0a1a178
unverified

JohannesGaessler committed on

talk-llama : add optional CLI arg to set the bot name (#1764)
63c8089
unverified

RhinoDevel committed on

examples : add python example for transcription (#1744)
d600e4c
unverified

contractorwolf committed on

whisper : load the model into multiple buffers of max size 1GB (#1763)
0e9101f
unverified

ggerganov committed on

talk-llama : sync llama.cpp
75c5f9c
unverified

ggerganov committed on

sync : ggml
2ed0a44
unverified

ggerganov committed on

backend_sched : fix assignments
cb91db5
unverified

slaren committed on

CUDA: fix softmax compile for old CUDA versions (llama/4862)
5eda533
unverified

JohannesGaessler committed on

Importance Matrix calculation (llama/4861)
c0b17f1
unverified

Kawrakow ikawrakow ggerganov committed on

models : make all scripts to be POSIX Compliant (#1725)
f7aef3e
unverified

sonphantrung committed on

ggml : fix 32-bit ARM compat for IQ2_XS (#1758)
d5836c9
unverified

ggerganov committed on

go : add SetInitialPrompt method to bindings (#1753)
5fd6678
unverified

blib321 committed on

server : add more parameters to server api (#1754)
cb0cf7b
unverified

George Hindle committed on

whisper : fix segment length with params.no_timestamps == true
720d738
unverified

ggerganov committed on

params : don't compute timestamps when not printing them (#1755)
251825e
unverified

George Hindle committed on

talk-llama : sync llama.cpp
f33490f
unverified

ggerganov committed on