Recover Conv/ConvTranspose rank from weight when input shape is unknown #29149
fanchenkong1 wants to merge 6 commits into
The layout transformer skips converting a node to NHWC when input[0] has no inferred shape. But the NCHW<->NHWC permutation depends only on rank. For Conv/ConvTranspose the data input and weight share the same rank, so when input[0]'s rank is unknown, recover it from the weight at input[1].
qjia7 left a comment
Correctness
The change is sound. Per the ONNX spec, Conv and ConvTranspose require W (input[1])
to have the same rank as X (input[0]): W is [M, C/group, k1..kn] for Conv and
[C, M/group, k1..kn] for ConvTranspose, while X is [N, C, d1..dn]. Falling back to
the weight's rank when the data input's rank is unknown is therefore safe.
Downstream (ChannelFirstToLastPerm / ChannelLastToFirstPerm) only needs the
rank, not the full shape. FusedConv is covered via the existing op_type
normalization to "Conv".
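To illustrate why only the rank matters, here is a minimal sketch of how the two permutations can be built from a rank alone. The function names with the `Sketch` suffix are hypothetical stand-ins written for this comment; they are not the onnxruntime implementations of ChannelFirstToLastPerm / ChannelLastToFirstPerm, only an assumption about the permutations those helpers produce.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in: channel-first -> channel-last permutation,
// i.e. [0, 2, 3, ..., rank-1, 1]. For rank 4 this is NCHW -> NHWC.
std::vector<int64_t> ChannelFirstToLastPermSketch(size_t rank) {
  assert(rank >= 2);  // need at least batch and channel dims
  std::vector<int64_t> perm(rank);
  perm[0] = 0;  // batch dim stays in place
  for (size_t i = 1; i + 1 < rank; ++i) {
    perm[i] = static_cast<int64_t>(i + 1);  // spatial dims shift left by one
  }
  perm[rank - 1] = 1;  // channel dim moves to the end
  return perm;
}

// Hypothetical stand-in: the inverse, channel-last -> channel-first,
// i.e. [0, rank-1, 1, 2, ..., rank-2]. For rank 4 this is NHWC -> NCHW.
std::vector<int64_t> ChannelLastToFirstPermSketch(size_t rank) {
  assert(rank >= 2);
  std::vector<int64_t> perm(rank);
  perm[0] = 0;                               // batch dim stays in place
  perm[1] = static_cast<int64_t>(rank - 1);  // channel dim moves to position 1
  for (size_t i = 2; i < rank; ++i) {
    perm[i] = static_cast<int64_t>(i - 1);   // spatial dims shift right by one
  }
  return perm;
}
```

Neither function looks at dimension values, so a rank taken from the weight is enough to build both transposes.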
Simplicity
The defensive guard node->Inputs().size() > 1 && !node->Inputs()[1].empty() is
redundant. Per ONNX spec, W is a mandatory input for both Conv and
ConvTranspose — a node missing it would already be malformed and rejected
upstream. The empty-string convention only applies to optional inputs (like
B).
Suggest simplifying to:
    if (!input_rank.has_value() && (op_type == "Conv" || op_type == "ConvTranspose")) {
      input_rank = api_graph->GetValueInfo(node->Inputs()[1])->ShapeRank();
    }

Using ShapeRank() over Shape()->size() is the right API choice.
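For readers who want to try the fallback logic outside onnxruntime, the following is a self-contained sketch of it. `NodeSketch` and `ResolveRankSketch` are hypothetical names invented for this illustration, and the node model (a vector of optional ranks, one per input) is an assumption, not the real graph API. The sketch keeps a bounds check on the input count, since this toy model has no upstream validation to rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified node. input_ranks[i] is std::nullopt when shape
// inference produced no rank for input i.
struct NodeSketch {
  std::string op_type;
  std::vector<std::optional<size_t>> input_ranks;
};

// Returns the rank to use for the layout permutation, or std::nullopt if it
// cannot be determined (in which case the node would be skipped).
std::optional<size_t> ResolveRankSketch(const NodeSketch& node) {
  if (node.input_ranks.empty()) {
    return std::nullopt;
  }
  std::optional<size_t> rank = node.input_ranks[0];
  const bool is_conv =
      node.op_type == "Conv" || node.op_type == "ConvTranspose";
  // Conv/ConvTranspose: data input and weight share the same rank, so the
  // weight's rank can stand in for an unknown data-input rank.
  if (!rank.has_value() && is_conv && node.input_ranks.size() > 1) {
    rank = node.input_ranks[1];
  }
  return rank;
}
```

Other op types get no fallback in this sketch, because the same-rank relationship between input[0] and input[1] is specific to Conv and ConvTranspose.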
Security
No new attack surface. Reads an existing graph value-info; no allocation, no
unchecked arithmetic.
Testing
A unit test that constructs a Conv with unknown input[0] rank but known weight
rank, runs the layout transformer, and asserts the transpose is inserted would
lock the behavior in.
Verdict
Approve. One optional simplification (drop the redundant guard) and an optional
test.
@qjia7, addressed your comments. PTAL, thanks!
Pull request overview
This PR improves the layout transformation pass so it can still convert Conv/ConvTranspose nodes to NHWC even when the data input shape (input[0]) has no inferred rank, by recovering the rank from the weight input (input[1]). This enables more nodes to be transformed and allows downstream transpose optimization to reduce overhead, particularly benefiting WebGPU.
Changes:
- Update layout transformation to use `ShapeRank()` and, for `Conv`/`ConvTranspose`, fall back to the weight’s rank when the data input rank is unknown.
- Add a unit test that constructs a `Conv` with cleared input shape and verifies layout transformation proceeds (via inserted `Transpose` nodes).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| onnxruntime/core/optimizer/layout_transformation/layout_transformation.cc | Recover Conv/ConvTranspose rank from weight when input rank is unknown, enabling NHWC conversion in more cases. |
| onnxruntime/test/optimizer/transpose_optimizer_test.cc | Adds coverage verifying rank recovery from weights allows layout transformation to insert transposes for a Conv with unknown input shape. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The Windows GPU CUDA CI Pipeline Test Job encountered a network error when trying to download a test model, which appears to be an infra-related failure.
Recover Conv/ConvTranspose rank from weight when input shape is unknown, enabling layout transformation to NHWC for more nodes.
Description
The layout transformer skips converting a node to NHWC when input[0] has no inferred shape.
For Conv and ConvTranspose operators, the data input (input[0]) and the weight (input[1]) always share the same rank. When the input rank is unknown, recover it from the weight.
Performance Impact
Measured on the Kokoro-82M-v1.0-ONNX text-to-speech model (onnx-community/Kokoro-82M-v1.0-ONNX) with the WebGPU EP, this change yields a 1.2–1.5× speedup.