Fix on TE to support Mcore Vision Encoder CUDA Graph#2657
Open
tomlifu wants to merge 2 commits into NVIDIA:main from
Signed-off-by: Lifu Zhang <lifuz@login-lyris02.lyris.clusters.nvidia.com>
for more information, see https://pre-commit.ci
Greptile Overview
Greptile Summary
This PR adds None-safety checks throughout the CUDA Graph capture code to support vision encoder modules. The changes prevent
The fix is minimal and surgical, adding safety checks without changing the underlying logic or control flow.
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant make_graphed_callables
participant _make_graphed_callables
participant Forward Graph
participant Backward Graph
participant Module
User->>make_graphed_callables: Call with modules & sample_args
make_graphed_callables->>_make_graphed_callables: Pass callables & args
Note over _make_graphed_callables: Warmup Phase
_make_graphed_callables->>Module: Run warmup iterations
Module-->>_make_graphed_callables: Return outputs (may contain None)
Note over _make_graphed_callables: Graph Capture Phase
_make_graphed_callables->>Forward Graph: Capture forward pass
Module-->>Forward Graph: Store static outputs
Note over _make_graphed_callables: Filter outputs with None check
_make_graphed_callables->>_make_graphed_callables: Check "o is not None and o.requires_grad"
_make_graphed_callables->>Backward Graph: Capture backward pass
_make_graphed_callables->>Backward Graph: Create grad tensors for valid outputs
Note over _make_graphed_callables: Graph Replay Phase
_make_graphed_callables->>User: Return graphed callables
User->>Forward Graph: Call graphed module
Forward Graph->>Forward Graph: Replay captured graph
Forward Graph-->>User: Return detached outputs (None-safe)
User->>Backward Graph: Trigger backward
Backward Graph->>Backward Graph: Replay captured backward
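
The two None-safe steps in the diagram (filtering outputs with `o is not None and o.requires_grad`, and returning detached outputs) can be sketched roughly as below. This is an illustration only, not the code from this PR: `StubTensor`, `outputs_needing_grad`, and `detach_outputs` are hypothetical names, and `StubTensor` is a stand-in for `torch.Tensor` so the snippet runs without PyTorch or a GPU.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class StubTensor:
    """Hypothetical stand-in for torch.Tensor with only the members used here."""

    name: str
    requires_grad: bool = False

    def detach(self) -> "StubTensor":
        # Mirrors torch.Tensor.detach(): the result never requires grad.
        return StubTensor(self.name, requires_grad=False)


Outputs = Sequence[Optional[StubTensor]]


def outputs_needing_grad(outputs: Outputs) -> Tuple[StubTensor, ...]:
    """Keep only outputs that exist and take part in the backward pass.

    Without the `o is not None` guard, `o.requires_grad` would raise
    AttributeError for a module that returns None in some output slot.
    """
    return tuple(o for o in outputs if o is not None and o.requires_grad)


def detach_outputs(outputs: Outputs) -> Tuple[Optional[StubTensor], ...]:
    """Detach every real output and pass None slots through unchanged."""
    return tuple(o.detach() if o is not None else None for o in outputs)


# Assumed example: a module whose second output slot is None.
static_outputs = (
    StubTensor("hidden", requires_grad=True),
    None,
    StubTensor("mask"),
)
grad_outputs = outputs_needing_grad(static_outputs)
detached = detach_outputs(static_outputs)
```

Here `grad_outputs` holds only the `hidden` stub, while `detached` keeps all three slots in their original positions, so a caller that unpacks the module's outputs by position sees the same layout as in the ungraphed module.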
Description
This PR is needed to support CUDA Graph capture of the Mcore vision encoder.
Related Megatron-LM (MLM) PRs: NVIDIA/Megatron-LM#3293, NVIDIA/Megatron-LM#3294
Fixes # (issue)
Type of change
Changes
Please list the changes introduced in this PR:
Checklist: