Fix CUDA/cuDNN DLL preload paths for CUDA 13 consolidated wheel layout #29202

Merged: tianleiwu merged 6 commits into main from tlwu/preload_cuda_cudnn_paths on Jun 26, 2026

@tianleiwu tianleiwu commented Jun 21, 2026

Description

Fix #29198.

NVIDIA restructured the CUDA Python wheels starting with CUDA 13: the per-component CUDA Toolkit packages (cublas, cufft, cuda_runtime, cuda_nvrtc, curand, ...) were consolidated into a single nvidia/cu{major} package and the -cuNN suffix was dropped from those package names. This PR updates the DLL/shared-library preload logic and the wheel dependency metadata so onnxruntime-gpu (and onnxruntime-trt-rtx) keep working on both the legacy CUDA 12 layout and the new CUDA 13 consolidated layout.

Summary of Changes

Preload logic (onnxruntime/__init__.py)

| File | Change |
| --- | --- |
| onnxruntime/__init__.py | _get_nvidia_dll_paths now detects the CUDA 13+ consolidated layout and resolves CUDA libraries under nvidia/cu{major}: Windows uses an architecture sub-folder (bin/<arch>, e.g. bin/x86_64), Linux uses a flat lib. The legacy CUDA 12 per-component paths are preserved. |
| onnxruntime/__init__.py | Added build_cuda_version and arch parameters (for testability/arch override); cuDNN paths factored out since cuDNN keeps its own nvidia/cudnn package layout in both schemes. |
| onnxruntime/__init__.py | print_debug_info drops the -cuNN suffix from CUDA Toolkit package names for CUDA 13+ (cuDNN keeps its suffixed name). |
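
The layout switch described above can be sketched roughly as follows. This is an illustration only, not the code in onnxruntime/__init__.py: the function name sketch_cuda_lib_dirs, its signature, the component list, and the legacy per-component directory names (nvidia/<component>/bin on Windows, nvidia/<component>/lib on Linux) are assumptions made for the example. Only the nvidia/cu{major} root, the Windows bin/<arch> sub-folder, and the flat Linux lib come from the description.

```python
from pathlib import PurePosixPath

# Hypothetical component list, used only for the legacy (CUDA 12) layout.
LEGACY_COMPONENTS = ("cublas", "cufft", "cuda_runtime", "cuda_nvrtc", "curand")


def sketch_cuda_lib_dirs(cuda_major: int, is_windows: bool, arch: str = "x86_64") -> list[str]:
    """Return site-packages-relative directories to search for CUDA libraries."""
    if cuda_major >= 13:
        # Consolidated layout: one nvidia/cu{major} package for the whole toolkit.
        root = PurePosixPath("nvidia") / f"cu{cuda_major}"
        sub = PurePosixPath("bin") / arch if is_windows else PurePosixPath("lib")
        return [str(root / sub)]
    # Legacy layout (assumed shape): one directory per component package.
    leaf = "bin" if is_windows else "lib"
    return [str(PurePosixPath("nvidia") / name / leaf) for name in LEGACY_COMPONENTS]


print(sketch_cuda_lib_dirs(13, is_windows=True))   # ['nvidia/cu13/bin/x86_64']
print(sketch_cuda_lib_dirs(13, is_windows=False))  # ['nvidia/cu13/lib']
print(sketch_cuda_lib_dirs(12, is_windows=False)[0])  # nvidia/cublas/lib
```

The sketch returns directories rather than library file names, since the file names are not stated in this PR description. cuDNN is left out on purpose because its nvidia/cudnn layout is the same in both schemes.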

Wheel dependency metadata (setup.py)

| File | Change |
| --- | --- |
| setup.py | onnxruntime-gpu cuda extras drop the -cuNN suffix for CUDA 13+ (nvidia-cuda-nvrtc, nvidia-cuda-runtime, nvidia-cufft, nvidia-curand); cuDNN dependency keeps the suffixed name. |
| setup.py | onnxruntime-trt-rtx CUDA Runtime dependency drops the -cuNN suffix for CUDA 13+. |
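
The naming rule for the extras can be illustrated with a small sketch. The helper name sketch_cuda_extras and the returned dictionary shape are hypothetical, and version specifiers are deliberately omitted because the description does not state them; the actual setup.py may be organized quite differently.

```python
# Toolkit package base names listed in the PR description.
CUDA_TOOLKIT_PACKAGES = (
    "nvidia-cuda-nvrtc",
    "nvidia-cuda-runtime",
    "nvidia-cufft",
    "nvidia-curand",
)


def sketch_cuda_extras(cuda_major: int) -> dict[str, list[str]]:
    """Build extras_require-style lists (hypothetical helper, no version pins)."""
    suffix = f"-cu{cuda_major}"
    # CUDA 13+ toolkit packages are unsuffixed; cuDNN keeps its suffix in both schemes.
    toolkit_suffix = "" if cuda_major >= 13 else suffix
    return {
        "cuda": [f"{name}{toolkit_suffix}" for name in CUDA_TOOLKIT_PACKAGES],
        "cudnn": [f"nvidia-cudnn{suffix}"],
    }


print(sketch_cuda_extras(12)["cuda"][0])  # nvidia-cuda-nvrtc-cu12
print(sketch_cuda_extras(13)["cuda"][0])  # nvidia-cuda-nvrtc
print(sketch_cuda_extras(13)["cudnn"])    # ['nvidia-cudnn-cu13']
```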

Tests (onnxruntime/test/python/onnxruntime_test_python_preload_dlls.py)

  • New unit tests pin the expected relative paths for the CUDA 12 (legacy) and CUDA 13 (consolidated) layouts on both Windows and Linux, the Windows arch override, the Linux flat-lib layout, the unchanged cuDNN layout, and the cuda/cudnn toggles.
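
A path-pinning test of this kind might look like the sketch below. It does not reproduce the tests added in this PR: relative_cuda_dir is a hypothetical stand-in that models only the CUDA 13+ branch, and the arm64 value is an invented arch name used to show the override.

```python
import unittest


def relative_cuda_dir(cuda_major: int, is_windows: bool, arch: str = "x86_64") -> str:
    """Hypothetical stand-in for the CUDA 13+ branch of the path resolver."""
    if cuda_major < 13:
        raise ValueError("this stand-in models only the consolidated layout")
    if is_windows:
        return f"nvidia/cu{cuda_major}/bin/{arch}"
    return f"nvidia/cu{cuda_major}/lib"


class ConsolidatedLayoutTest(unittest.TestCase):
    def test_pinned_paths(self):
        # Each case pins one expected relative path so a layout change fails loudly.
        cases = [
            ((13, True, "x86_64"), "nvidia/cu13/bin/x86_64"),
            ((13, True, "arm64"), "nvidia/cu13/bin/arm64"),  # arch override (invented name)
            ((13, False, "x86_64"), "nvidia/cu13/lib"),      # Linux flat lib ignores arch
        ]
        for args, expected in cases:
            with self.subTest(args=args):
                self.assertEqual(relative_cuda_dir(*args), expected)

    def test_legacy_is_out_of_scope(self):
        with self.assertRaises(ValueError):
            relative_cuda_dir(12, False)
```

Saved as a module, this runs with python -m unittest. Passing the platform and arch as parameters instead of reading them from the host is what lets both the Windows and Linux layouts be checked on a single machine.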

Testing

  • Run the new tests: python -m pytest onnxruntime/test/python/onnxruntime_test_python_preload_dlls.py (or python -m unittest onnxruntime.test.python.onnxruntime_test_python_preload_dlls).
  • Backward compatibility: CUDA 12 paths and the cuDNN layout are unchanged; only CUDA 13+ takes the new consolidated paths and unsuffixed package names.
  • Manual check: build the wheel on Linux and Windows, run pip install onnxruntime-gpu*.whl[cuda,cudnn], then confirm that import onnxruntime; onnxruntime.preload_dlls() runs successfully in Python.
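
The manual check in the last step could be scripted along these lines. The wrapper preload_smoke_check and its status strings are hypothetical; it assumes only that an installed onnxruntime package exposes preload_dlls(), and it skips when that is not the case.

```python
import importlib.util


def preload_smoke_check() -> str:
    """Call onnxruntime.preload_dlls() if available and report what happened."""
    if importlib.util.find_spec("onnxruntime") is None:
        return "skipped: onnxruntime is not installed"
    import onnxruntime

    if not hasattr(onnxruntime, "preload_dlls"):
        return "skipped: this onnxruntime build has no preload_dlls()"
    # Default arguments are used here; the call is expected to return without raising.
    onnxruntime.preload_dlls()
    return "ok: preload_dlls() returned without raising"


print(preload_smoke_check())
```

An "ok" result only shows that the call did not raise. Calling onnxruntime.print_debug_info() afterwards is one way to inspect which packages and libraries were actually picked up.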

Checklist

  • Tests added/updated
  • No breaking changes (CUDA 12 behavior preserved)

Copilot AI left a comment

Pull request overview

Updates ONNX Runtime’s Python CUDA/cuDNN preload behavior and wheel dependency metadata to support NVIDIA’s CUDA 13+ consolidated Python wheel layout (nvidia/cu{major}) while preserving the legacy CUDA 12 per-component layout.

Changes:

  • Updated onnxruntime.__init__._get_nvidia_dll_paths() to resolve CUDA libraries from consolidated nvidia/cu{major} paths for CUDA 13+ (with Windows bin/<arch> handling) while keeping CUDA 12 behavior intact.
  • Adjusted setup.py extras/dependencies to use unsuffixed NVIDIA CUDA toolkit package names for CUDA 13+ (cuDNN remains suffixed).
  • Added Python unit tests that pin expected relative preload paths for both CUDA 12 and CUDA 13 layouts across Windows/Linux.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| onnxruntime/__init__.py | Detect CUDA 13+ consolidated wheel layout for preload paths; adjust debug package name reporting accordingly. |
| setup.py | Update CUDA-related dependency names/extras for CUDA 13 consolidated wheels; keep cuDNN naming scheme. |
| onnxruntime/test/python/onnxruntime_test_python_preload_dlls.py | Add unit tests covering legacy vs consolidated CUDA layouts and cudnn/cuda toggles. |
| cmake/onnxruntime_providers_cuda.cmake | Narrow/rename CCCL header patch workaround to CUDA 13.3-specific logic on UNIX. |

Comment thread setup.py
Comment thread onnxruntime/__init__.py Outdated
Comment thread onnxruntime/__init__.py Outdated

@github-actions github-actions Bot left a comment

You can commit the suggested changes from lintrunner.

Comment thread onnxruntime/__init__.py

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread onnxruntime/__init__.py
@tianleiwu tianleiwu merged commit 37cccc8 into main Jun 26, 2026
86 checks passed
@tianleiwu tianleiwu deleted the tlwu/preload_cuda_cudnn_paths branch June 26, 2026 22:52

Development

Successfully merging this pull request may close these issues.

preload_dlls() does not look at the correct nvidia site packages location.

4 participants