test(torch): skip high-memory cuda svd reference by zhangyue207 · Pull Request #628 · InfiniTensor/InfiniOps

zhangyue207 · 2026-05-29T08:21:05Z

Summary

Skip the CUDA svd generated torch-op case for the generic (4, 4, 5632) shape before calling the PyTorch reference.
Keep the rest of the generated torch-op coverage unchanged.

Motivation

NVIDIA CI repeatedly failed in unrelated upstream PRs because the PyTorch svd reference path for (4, 4, 5632) consumed about 72 GiB on 80 GiB runners, leaving too little memory for the remaining pytest workers and causing CUDA OOM. The InfiniOps wrapper is not being exercised meaningfully in that case because the reference path itself is the unstable part.

Closes N/A

Type of Change

feat — new feature / new operator / new platform
fix — bug fix
perf — performance improvement (no behavioral change)
refactor — code restructuring without behavior change
test — adding or fixing tests only
docs — documentation only
build / ci — build system or CI configuration
chore — tooling, formatting, or other non-code changes
Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

Test Results on Supported Platforms

| Platform | Built | `pytest` Result | Notes / Hardware |
| --- | --- | --- | --- |
| NVIDIA | N/A | Pending CI. | Fix targets repeated CUDA OOM in `tests/test_torch_ops.py::test_op[...,4x4x5632-svd]`. |
| CPU | N/A | `ruff format --check`, `ruff check`, and `git diff --check` passed. | Python test-only change. |

Full `pytest` output (optional)

python -m ruff format --check tests/test_torch_ops.py
# 1 file already formatted

python -m ruff check tests/test_torch_ops.py
# All checks passed!

git diff --check
# passed

Benchmark / Performance Impact

N/A

Notes for Reviewers

The skipped case is limited to cuda, svd, and shape (4, 4, 5632).
This avoids a repeated PyTorch-reference OOM before the generated InfiniOps wrapper comparison can complete.
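
The scope described above can be sketched as a small predicate that is evaluated before the reference is computed. This is only an illustration under stated assumptions, not the code in this PR: the names `HIGH_MEMORY_REFERENCE_CASES` and `should_skip_reference` are hypothetical, and the actual harness in `tests/test_torch_ops.py` may structure the check differently.

```python
# Hypothetical sketch: names below are illustrative and are not taken from the PR diff.

# Cases whose PyTorch reference is known to exhaust GPU memory on CI runners,
# keyed by (device type, op name, input shape).
HIGH_MEMORY_REFERENCE_CASES = {
    ("cuda", "svd", (4, 4, 5632)),
}


def should_skip_reference(device, op_name, shape):
    """Return True if the reference for this exact case should not be run."""
    # Treat "cuda:0", "cuda:1", ... the same as "cuda".
    device_type = str(device).split(":")[0]
    return (device_type, op_name, tuple(shape)) in HIGH_MEMORY_REFERENCE_CASES


# Only the exact (device, op, shape) combination matches; everything else still runs.
print(should_skip_reference("cuda", "svd", (4, 4, 5632)))  # True
print(should_skip_reference("cpu", "svd", (4, 4, 5632)))   # False
```

In a pytest test body, a predicate like this would typically guard a call to `pytest.skip(...)` placed before the reference computation, so the high-memory allocation never happens. Matching on the full triple keeps other `svd` shapes and other devices covered.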

Checklist

Title, Branch, and Commits

PR title follows Conventional Commits (e.g. feat(nvidia): …, fix(cuda/gemm): …).
Branch name follows <type>/xxx-yyyy-zzzz where <type> matches the PR title's Conventional Commits type and words are joined with hyphens (see CONTRIBUTING.md §Branches).
Each commit message follows Conventional Commits.
Each commit is meaningful, well-formed, and independently reviewable (see CONTRIBUTING.md §Pull Requests).
No stray merge commits from master — the branch is rebased cleanly on top of the current master.
No fixup! / squash! / wip commits remain.

Scope and Design

Changes are minimal — nothing unrelated to the stated motivation was added (CONTRIBUTING.md §Code/General).
No dead code, commented-out blocks, debug prints, printf/std::cout/print(...) left behind, or TODO without an owner and issue link.
No unrelated formatting churn that would obscure the diff.
N/A: Public API changes. This PR changes only test skip policy.

General Code Hygiene (applies to all languages)

The code is self-explanatory; comments were added only where the why is non-obvious (CONTRIBUTING.md §Code/General).
Every modified or added file ends with a single trailing newline (CONTRIBUTING.md §Code/General).
No trailing whitespace, tab/space mixing, or stray BOMs.
Identifiers in comments and error messages are wrapped in backticks where applicable.
All comments and error messages are in English.
Comments and error messages are complete sentences.

C++ Specific (if C++ files changed)

N/A: No C++ files changed.

Python Specific (if Python files changed)

Python files follow PEP 8 and project formatting.
ruff format --check passed for touched Python files.
ruff check passed for touched Python files.

Testing

Fresh platform test run completed. Pending CI.
For any platform that could not be tested, an explicit reason is given in the table.
New functionality has matching tests under tests/. This PR updates an existing generated test harness skip condition.
Pytest parameterization remains deterministic and scoped.
N/A: pytest.mark.auto_act_and_assert. No operator test was added.
N/A: Default dtype / device parameterization. No parameterization was changed.
Flaky test behavior is documented in the PR motivation.
Bug-fix regression test. The skipped case is the repeated OOM case observed in CI.

Build, CI, and Tooling

N/A: Fresh platform build. This PR changes only Python tests.
N/A: compile_commands.json. This PR does not change CMake configuration.
N/A: New backend or device auto-detection. No backend was added.
N/A: CUDA-like mutual exclusion. This PR does not change backend selection.
N/A: CI matrix generation. This PR does not change CI configuration.
N/A: Runtime dependencies. No runtime dependency was added.

Documentation

N/A: User-facing documentation. This PR changes only tests.
N/A: New operators, dispatch helpers, or public utilities. None were added.
N/A: Breaking change. This PR has no user-visible API impact.

Security and Safety

No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
N/A: Third-party code. No third-party code was added.
N/A: Unsafe pointer arithmetic, uninitialized reads, or missing bounds checks. No source code was changed.

test(torch): skip high-memory cuda svd reference

80899dd

zhangyue207 wants to merge 1 commit into InfiniTensor:master from zhangyue207:test/skip-high-memory-torch-svd
