Fix packed-QKV and broadcast-head bias strides in quantized GQA flash attention #28963

Open
tianleiwu wants to merge 4 commits into main from tlwu/fix_gqa_quantized_kv

Make q_batch_stride consistent with offset in per-batch quantized GQA…

712ef2b
Azure Pipelines / Linux Android Emulator QNN CI Pipeline succeeded Jun 19, 2026 in 13m 36s

Build #20260619.16 succeeded