Which component has the problem?
CUTLASS C++
Bug Report
Describe the bug
At cutlass/gemm/collective/sm100_blockscaled_mma_mixed_tma_cpasync_warpspecialized.hpp:L1029-1032, s2t copy is performed by both SMs when 2SM mode is enabled.
Steps/Code to reproduce bug
The current code appears to produce correct results, but issuing the s2t copy from both CTAs is a potential race condition.
Expected behavior
The s2t copy should be guarded by a branch that checks is_mma_leader_cta, i.e.:
if (is_mma_leader_cta && cute::elect_one_sync()) {
  copy(tiled_copy_s2t_SFA, thr_tCsSFA_s2t(_,_,_,_,read_stage_tma), thr_tCtSFA_s2t);
  copy(tiled_copy_s2t_SFB, thr_tCsSFB_s2t(_,_,_,_,read_stage_tma), thr_tCtSFB_s2t);
}
Environment details (please complete the following information):
- Environment location: B200