cmake : make libwhisper.so position independent (#1792) 1cf1553 unverified trixirt committed on Jan 22, 2024
whisper.android : return output from benchmarks (#1785) 5cff61b unverified lcfrs committed on Jan 19, 2024
server : implement "verbose_json" format with token details (#1781) d6e13b6 unverified rmmh committed on Jan 18, 2024
ggml : add IQ2 to test-backend-ops + refactoring (llama/4990) 227f2ae unverified ggerganov committed on Jan 17, 2024
metal : create autorelease pool during library build (llama/4970) 9027276 unverified ggerganov committed on Jan 17, 2024
ggml : importance matrix support for legacy quants (llama/4969) d8bb9d8 unverified Kawrakow ikawrakow committed on Jan 16, 2024
metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936) e2cc0e5 unverified azarovalex ggerganov committed on Jan 16, 2024
ggml : introduce GGML_CALL function annotation (llama/4850) 7815f68 unverified jartine committed on Jan 16, 2024
cuda : fix dequantize kernel names (llama/4938) 95f6502 unverified ggerganov committed on Jan 15, 2024
CUDA: faster dequantize kernels for Q4_0 and Q4_1 (llama/4938) 73c6598 unverified Kawrakow ikawrakow committed on Jan 15, 2024
Add ability to use importance matrix for all k-quants (llama/4930) 7032309 unverified Kawrakow ikawrakow committed on Jan 14, 2024
talk-llama : optional wake-up command and audio confirmation (#1765) 542e8da unverified rakksor committed on Jan 16, 2024
server : fix building and simplify lib deps on Windows (#1772) f928f33 unverified Przemysław Pawełczyk committed on Jan 15, 2024
metal : correctly set SIMD support flags on iOS (llama/4923) 1cf2fa9 unverified azarovalex committed on Jan 14, 2024
scripts : sync-ggml-am.sh add option to skip commits c34dd82 unverified ggerganov committed on Jan 14, 2024
ggml: cache sin/cos for RoPE (llama/4908) c315fbf unverified JohannesGaessler committed on Jan 13, 2024
metal : disable log for loaded kernels (llama/4794) 2305485 unverified ggerganov committed on Jan 13, 2024
gguf : fix potential infinite for-loop (llama/4600) 0e93179 unverified texmex76 Bernhard Gstrein committed on Jan 13, 2024
metal : refactor kernel loading code (llama/4794) 53e6bf8 unverified ggerganov committed on Jan 13, 2024
CUDA: faster q8_0 -> f16 dequantization (llama/4895) 0a1a178 unverified JohannesGaessler committed on Jan 12, 2024
talk-llama : add optional CLI arg to set the bot name (#1764) 63c8089 unverified RhinoDevel committed on Jan 13, 2024
examples : add python example for transcription (#1744) d600e4c unverified contractorwolf committed on Jan 13, 2024
whisper : load the model into multiple buffers of max size 1GB (#1763) 0e9101f unverified ggerganov committed on Jan 13, 2024
llama : ggml-backend integration (llama/4766) 362430b unverified slaren ggerganov JohannesGaessler committed on Jan 12, 2024
CUDA: fix softmax compile for old CUDA versions (llama/4862) 5eda533 unverified JohannesGaessler committed on Jan 12, 2024
Importance Matrix calculation (llama/4861) c0b17f1 unverified Kawrakow ikawrakow ggerganov committed on Jan 12, 2024
models : make all scripts to be POSIX Compliant (#1725) f7aef3e unverified sonphantrung committed on Jan 12, 2024
ggml : fix 32-bit ARM compat for IQ2_XS (#1758) d5836c9 unverified ggerganov committed on Jan 12, 2024
go : add SetInitialPrompt method to bindings (#1753) 5fd6678 unverified blib321 committed on Jan 12, 2024
server : add more parameters to server api (#1754) cb0cf7b unverified George Hindle committed on Jan 12, 2024
whisper : fix segment length with params.no_timestamps == true 720d738 unverified ggerganov committed on Jan 12, 2024
params : don't compute timestamps when not printing them (#1755) 251825e unverified George Hindle committed on Jan 12, 2024