whisper.cpp / ggml-cuda / fattn-common.cuh

Commit History

CUDA: deduplicate FlashAttention code (llama/7352)
65ab3e8

JohannesGaessler committed on

CUDA: add FP32 FlashAttention vector kernel (llama/7188)
03d4b22
unverified

JohannesGaessler committed on