whisper.cpp / ggml-cuda / template-instances / fattn-vec-f16-instance-hs128-q5_1-f16.cu

Commit History

CUDA: refactor mmq, dmmv, mmvq (llama/7716)
849ff52

JohannesGaessler committed on

CUDA: quantized KV support for FA vec (llama/7527)
315df8c

JohannesGaessler committed on