[release/2.9] add torch.version.rocm, distinct from torch.version.hip #2983
Draft
amd-sriram wants to merge 1 commit into release/2.9
…rch#168097)

Historically, HIP and ROCm versions were interchangeable, but going forward they are allowed to diverge. The ROCm version represents the full ROCm software stack, while HIP is one component of that stack.

Issue pytorch#166068 was fixed by [switching from using HIP_VERSION to ROCM_VERSION_DEV](pytorch#166336). However, this broke the build of ROCm apex because the HIP version reported by `hipcc --version` no longer matched `torch.version.hip`. This shows that both versions need to be exposed. Bitsandbytes has also been affected by the change in behavior of `torch.version.hip`: bitsandbytes-foundation/bitsandbytes#1799 (comment)

The solution is to fix `torch.version.hip` so that it uses the hipcc header values with the trailing hash code removed. In addition, a new `torch.version.rocm` variable stores the ROCm version.

The HIP_VERSION variable is computed in https://github.com/ROCm/hip/blob/develop/cmake/FindHIP.cmake. That script runs `hipcc --version` and extracts the value from the "HIP version" line, e.g.,

```
hipcc --version
HIP version: 7.1.25421-32f9fa6ca5
```

The HIP_VERSION variable may end with a hash code. This trailing hash is removed so that `torch.version.hip` can be parsed by the `packaging` version parser, e.g.,

```
import torch
from packaging import version
print(version.parse(torch.version.hip))
```

Code changes:
- Add the `rocm` variable to torch/version.py.tpl
- Add code to write the `rocm` variable in tools/generate_torch_version.py
- Write the ROCm version during the installation process in torch/CMakeLists.txt

Tested on a preview of ROCm 7.2. Successfully built pytorch and apex. Also tested the `torch.version.hip` parsing code above.

```
>>> import torch
>>> torch.version.hip
'7.1.25421'
>>> torch.version.rocm
'7.2.0'
```

Pull Request resolved: pytorch#168097
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
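For downstream projects, the hash stripping and the new attribute can be illustrated with a small Python sketch. This is not the code in this PR, which does the work in the build scripts. The helper names `strip_hip_hash`, `version_tuple`, and `rocm_version` are hypothetical, and the fallback from `rocm` to `hip` on older builds is only an assumption based on the two versions having been interchangeable historically. A `SimpleNamespace` stands in for `torch.version` so the snippet runs without a ROCm build of PyTorch.

```python
import re
from types import SimpleNamespace
from typing import Optional, Tuple


def strip_hip_hash(raw: str) -> str:
    """Drop a trailing '-<hash>' suffix from a hipcc-style version string."""
    # Keep only the leading dotted numeric part, e.g. '7.1.25421'.
    match = re.match(r"\d+(?:\.\d+)*", raw.strip())
    if match is None:
        raise ValueError(f"unrecognized HIP version string: {raw!r}")
    return match.group(0)


def version_tuple(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version into a tuple that compares numerically."""
    return tuple(int(part) for part in text.split("."))


def rocm_version(version_module) -> Optional[str]:
    """Return the ROCm stack version if the build exposes one.

    Assumption: builds without a 'rocm' attribute fall back to 'hip'.
    Returns None when neither attribute holds a value (e.g. non-ROCm builds).
    """
    rocm = getattr(version_module, "rocm", None)
    if rocm is not None:
        return rocm
    return getattr(version_module, "hip", None)


# Stand-in for torch.version, using the values reported in the PR's test run.
fake_version = SimpleNamespace(hip="7.1.25421", rocm="7.2.0")

print(strip_hip_hash("7.1.25421-32f9fa6ca5"))  # 7.1.25421
print(rocm_version(fake_version))              # 7.2.0
```

With a real ROCm build, passing `torch.version` to `rocm_version` would play the same role as `fake_version` here. Checks against the HIP compiler should keep using `torch.version.hip`, while checks against the overall stack should use `torch.version.rocm`.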
Jenkins build for commit 98626a13591ceee88653aaf450d74e82fd0485ad finished as FAILURE