[BUILD] Modularize device runtime into per-backend DSOs #19593
Code Review
This pull request refactors the TVM runtime build system to split the monolithic runtime into a core library, per-backend dynamic shared objects (DSOs) for CUDA, Vulkan, OpenCL, Metal, ROCm, and Hexagon, and a separate libtvm_runtime_extra library for contrib and disco sources. Correspondingly, the Python loading logic was updated to dynamically discover and load these backend DSOs. Review feedback identified a missing compile definition for the cuDNN frontend and a package naming inconsistency in the library loading utility that could lead to resolution failures.
9c78db7 to 0a12a3c
/gemini review
Code Review
This pull request refactors the TVM runtime build system to split backend-specific code (CUDA, Vulkan, OpenCL, Metal, ROCm, Hexagon) and various contrib modules into separate dynamic shared libraries, specifically introducing libtvm_runtime_extra. This change allows for a more modular runtime and dynamic loading of backends in Python. Feedback includes concerns regarding regressions for monolithic Hexagon DSP builds where Disco sources might be missing, the need to gate the creation of the extra runtime library for Hexagon, and improving the robustness of the CMake logic when no extra sources are present.
include(cmake/modules/contrib/vllm.cmake)
include(cmake/modules/Git.cmake)

# ---- libtvm_runtime_extra assembly ----
The libtvm_runtime_extra assembly block should be gated to avoid creating a shared library when building for the Hexagon DSP (BUILD_FOR_HEXAGON). Hexagon DSP builds typically use static linking and a monolithic runtime, and the standard shared library mechanism used here may not be appropriate for the DSP environment.
# ---- libtvm_runtime_extra assembly ----
if(NOT BUILD_FOR_HEXAGON)
029ad2c to 31200ba
Modularize libtvm_runtime into per-backend shared libraries (libtvm_runtime_cuda, libtvm_runtime_vulkan, etc.) and libtvm_runtime_extra for contrib/disco modules. Each backend can be built independently. During Python import, available backend DSOs are discovered and loaded automatically, with missing backends silently skipped.
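The discover-and-skip behaviour described above can be illustrated with a minimal sketch. This is not the loader from this PR: the function name `load_backend_libs`, the `libtvm_runtime_` prefix matching, and the use of `ctypes` are assumptions made for illustration only.

```python
import ctypes
import glob
import os


def load_backend_libs(lib_dir, prefix="libtvm_runtime_", suffix=".so"):
    """Hypothetical helper: try to load every backend DSO found in lib_dir.

    Returns (loaded, skipped): a dict mapping backend name to its ctypes
    handle, and a list of backend names whose DSO could not be loaded.
    """
    loaded, skipped = {}, []
    # The trailing underscore in the prefix keeps the core libtvm_runtime.so
    # out of the match; libtvm_runtime_extra.so would be picked up as "extra".
    pattern = os.path.join(glob.escape(lib_dir), prefix + "*" + suffix)
    for path in sorted(glob.glob(pattern)):
        name = os.path.basename(path)[len(prefix):-len(suffix)]
        try:
            # RTLD_GLOBAL makes the DSO's symbols visible to later loads.
            loaded[name] = ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # Missing driver or broken file: skip silently instead of
            # failing the whole import.
            skipped.append(name)
    return loaded, skipped
```

A backend whose DSO is absent from `lib_dir` never appears in the glob result, so it is simply not loaded; a DSO that is present but cannot be opened (for example because the vendor driver library is missing) ends up in `skipped`.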
31200ba to 3e89882
Reopening from upstream branch to pick up Jenkins groovy changes.
Summary
Modularize libtvm_runtime into per-backend shared libraries (libtvm_runtime_cuda, libtvm_runtime_vulkan, etc.) and libtvm_runtime_extra for contrib/disco modules.
- Each backend can be built independently (USE_CUDA=ON produces only libtvm_runtime.so + libtvm_runtime_cuda.so)
- libtvm_runtime_extra is always produced if it has sources (disco, contrib modules)
- USE_CUDA, USE_VULKAN, etc. continue to work

Test plan
- USE_CUDA=ON: verify libtvm_runtime.so, libtvm_runtime_cuda.so, libtvm_runtime_extra.so produced
- python -c "import tvm; print(tvm.cuda(0).exist)" prints True
- With libtvm_runtime_cuda.so out of lib/: CUDA unavailable, no crash
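The first test-plan item could be automated roughly as follows. This is a sketch under stated assumptions: the helper names `expected_libs` and `missing_libs` are hypothetical, the flag-to-backend mapping lists only the two DSO names spelled out in the summary, and the `.so` extension assumes a Linux build.

```python
import os

# Only the two backends named explicitly in the summary; other backends are
# assumed to follow the same libtvm_runtime_<backend> pattern.
BACKEND_FLAGS = {
    "USE_CUDA": "cuda",
    "USE_VULKAN": "vulkan",
}


def expected_libs(flags, has_extra_sources=True, ext=".so"):
    """Hypothetical helper: list the DSOs a build with these flags should emit."""
    libs = ["libtvm_runtime" + ext]
    for flag, backend in BACKEND_FLAGS.items():
        if flags.get(flag, "OFF") == "ON":
            libs.append("libtvm_runtime_" + backend + ext)
    # libtvm_runtime_extra is produced only when it has sources.
    if has_extra_sources:
        libs.append("libtvm_runtime_extra" + ext)
    return libs


def missing_libs(lib_dir, flags, has_extra_sources=True):
    """Return the expected DSO names that are not present in lib_dir."""
    return [
        name
        for name in expected_libs(flags, has_extra_sources)
        if not os.path.isfile(os.path.join(lib_dir, name))
    ]
```

Calling `missing_libs("build", {"USE_CUDA": "ON"})` after a build would then return an empty list when all three libraries from the test plan are present; the `build` directory name is likewise an assumption.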