Stale comment in `examples/cute/tutorial/hopper/wgmma_sm90.cu`

The comments on the `copyA`/`copyB` tiled copies in `gemm_nt` describe the thread layout as `32x4`, but the code actually uses `16x8`.

```
  // Define the thread layouts (static)
  TiledCopy copyA = make_tiled_copy(Copy_Atom<SM80_CP_ASYNC_CACHEALWAYS<uint128_t>, TA>{},
                                    Layout<Shape<_16,_8>>{}, // Thr layout 32x4 m-major
                                    Layout<Shape< _8,_1>>{});// Val layout  8x1 m-major
  TiledCopy copyB = make_tiled_copy(Copy_Atom<SM80_CP_ASYNC_CACHEALWAYS<uint128_t>, TB>{},
                                    Layout<Shape<_16,_8>>{}, // Thr layout 32x4 n-major
                                    Layout<Shape< _8,_1>>{});// Val layout  8x1 n-major
```
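To show why the mismatch matters, here is a minimal sketch in plain C++ (no CuTe dependency). It assumes the tile covered by one tiled copy is the element-wise product of the thread-layout shape and the value-layout shape. `Shape2D`, `thread_count`, and `tile_extent` are hypothetical helpers written for this illustration; they are not CuTe APIs.

```cpp
#include <cstddef>

// Hypothetical helper type for this sketch; not part of CuTe.
struct Shape2D {
  std::size_t m;  // extent along the first (major) mode
  std::size_t k;  // extent along the second mode
};

// Number of threads described by a thread-layout shape.
constexpr std::size_t thread_count(Shape2D thr) { return thr.m * thr.k; }

// Extent of the tile covered by one tiled copy, ASSUMING it is the
// element-wise product of the thread shape and the per-thread value shape.
constexpr Shape2D tile_extent(Shape2D thr, Shape2D val) {
  return Shape2D{thr.m * val.m, thr.k * val.k};
}

constexpr Shape2D kThrInCode{16, 8};     // what the code passes
constexpr Shape2D kThrInComment{32, 4};  // what the comment claims
constexpr Shape2D kVal{8, 1};            // value layout from the snippet

// Both shapes describe 128 threads...
static_assert(thread_count(kThrInCode) == 128, "16x8 is 128 threads");
static_assert(thread_count(kThrInComment) == 128, "32x4 is 128 threads");

// ...but under the assumption above they would cover different tiles.
static_assert(tile_extent(kThrInCode, kVal).m == 128 &&
              tile_extent(kThrInCode, kVal).k == 8, "16x8 threads: 128x8 tile");
static_assert(tile_extent(kThrInComment, kVal).m == 256 &&
              tile_extent(kThrInComment, kVal).k == 4, "32x4 threads: 256x4 tile");
```

Both shapes use the same number of threads, so the stale comment does not change the launch configuration, but a reader who trusts it would infer a different copy tile than the one the code produces.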

Stale comment in `examples/cute/tutorial/hopper/wgmma_sm90.cu` #3340