[ROCM] Add rocjpeg support for GPU image decoding by xytpai · Pull Request #9342 · pytorch/vision

xytpai · 2026-01-16T10:12:28Z

Adds a rocJPEG-backed GPU JPEG decoding path for ROCm builds, gated by TORCHVISION_USE_ROCJPEG / ROCJPEG_FOUND, alongside the existing nvJPEG CUDA path.

This enables torchvision.io.decode_jpeg(..., device="cuda") on ROCm for supported JPEGs, including RGB, GRAY, and UNCHANGED modes. The implementation uses rocJPEG's hardware backend, handles rocJPEG errors through TORCH_CHECK, avoids process termination on decode failures, and returns contiguous output tensors for parity with nvJPEG.

If you meet any libva related compile errors:

sudo apt install vainfo libva2 libva-drm2 libdrm2 mesa-va-drivers libva-dev

Note

rocJPEG currently uses ROCJPEG_BACKEND_HARDWARE, which has stricter capability limits than the CPU/nvJPEG path: it rejects images smaller than 64x64 and unsupported chroma subsampling such as 4:1:1/unknown. If rocJPEG hardware doesn't support the input, fallback to CPU decode and copy result to CUDA.

cc @jeffdaily @jithunnair-amd

pytorch-bot · 2026-01-16T10:12:32Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9342

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

[ROCm] MI350 CI jobs will have longer queue times due to CI migration

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla · 2026-01-16T10:12:34Z

Hi @xytpai!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

xytpai · 2026-01-30T08:18:19Z

@NicolasHug

NicolasHug · 2026-02-09T13:30:20Z

Hi @xytpai , thanks for the PR.
I don't think this is something we'll have the bandwidth to prioritize at this moment, although this might eventually become in scope for TorchCodec, e.g. via the "third-party TC extensions" that we plan to support (where decoder extensions could be added in third-parties repos, out of TC's core).

May I ask, is there already a Python library that exposes the Rocm jpeg decoder?

xytpai · 2026-03-18T07:49:54Z

Hi, @NicolasHug thanks for the feedback!

I understand that this may not be a priority at the moment. I'd like to ask if we could consider merging this in its current form to reach a functionally ready state. This would allow the ROCm JPEG decoding path to be available for users and contributors to build upon. We're happy to address any remaining review comments and ensure the code meets the project's quality standards. Please let us know if there's anything we can do to move this forward. Thanks!

May I ask, is there already a Python library that exposes the Rocm jpeg decoder?

Yes, there is already a Python library that exposes the ROCm JPEG decoder. https://github.com/ROCm/rocPyDecode

xytpai · 2026-04-15T09:58:55Z

@NicolasHug

jeffdaily

Review — [ROCM] Add rocjpeg support for GPU image decoding (RGB output)

Summary

Adds a rocJPEG-backed decode_jpegs_cuda (RGB-only) alongside the existing nvJPEG path, gated by a new ROCJPEG_FOUND macro and TORCHVISION_USE_ROCJPEG build flag. The implementation is largely an AMD rocJPEG sample grafted onto the nvJPEG wrapper, and several pieces of the nvJPEG scaffolding were copied without being wired up. There are correctness and robustness issues that block approval, plus no test coverage for the reduced-capability backend.

Code Quality

CHECK_ROCJPEG / CHECK_HIP call exit(1) on failure (decode_jpegs_cuda.h, new macros). A library must never terminate the host process. Any malformed/unsupported JPEG, OOM, or transient HIP error would kill the user's entire Python interpreter with no traceback and no chance to try/except. The nvJPEG path uses TORCH_CHECK, which throws a catchable c10::Error. These macros must use TORCH_CHECK(status == ROCJPEG_STATUS_SUCCESS, ...) / TORCH_CHECK(hip_status == hipSuccess, ...) instead of std::cerr/std::cout + exit(1). This is the most serious issue.
Stray std::cout diagnostics remain. getChannelPitchAndSizes prints "Unknown chroma subsampling!" and "Unknown output format!" to stdout before returning EXIT_FAILURE. A "rm cout" commit is in the history but missed these. Replace with TORCH_CHECK(false, ...) so the message surfaces as an exception rather than polluting stdout.
Dead code / unused locals in decode_images:
- getChromaSubsamplingStr(...) fills chroma_sub_sampling, which is never read.
- num_components (from rocJpegGetImageInfo) is never read.
- std::vector<int> channels(num_images) and channels[j] = num_channels are never used afterward.
- prior_channel_sizes is declared and sized but never used.
- The commented-out // if (current_batch_size == 2) { should be removed, not left in.
Given only ROCJPEG_OUTPUT_RGB_PLANAR is reachable (the dispatcher rejects every other mode), the entire multi-format switch in getChannelPitchAndSizes and the channel_sizes machinery is dead for this PR. Either reduce it to the RGB-planar case or add a comment that it is staged for future formats — as written it is a large amount of untested, unreachable code.
Copy-pasted nvJPEG error string: "The input tensor must be on CPU when decoding with nvjpeg" should say rocJPEG. Same for the decode_images doc comment, which lists output_format ... ROCJPEG_OUTPUT_RGB and a device argument the function does not take.

Correctness / Thread Safety

The stream/event synchronization is copy-pasted from nvJPEG but never actually connected to rocJPEG. In the nvJPEG path, nvjpegDecodeBatched(..., stream) enqueues work on decoder->stream, so event.record(stream); event.block(current_stream) and the cudaStreamSynchronize(stream) calls are meaningful. Here, rocJpegDecodeBatched takes no stream argument and runs on rocJPEG's own internal stream; the decoder's stream member is never passed to any rocJPEG call. Consequently:
- Both cudaStreamSynchronize(stream) calls in decode_images synchronize an unrelated, idle pool stream — no-ops.
- event.record(rocJpegDecoder->stream) records on that idle stream, so event.block(current_stream) blocks on nothing.
This is only safe if rocJpegDecodeBatched is fully host-synchronous. If it is, all of this scaffolding is dead and misleading and should be removed (or replaced with a single hipStreamSynchronize on rocJPEG's actual stream). If it is not host-synchronous, then the decoded tensors are returned to the caller's current stream with no real dependency edge, which is a data race / silent corruption. Please confirm rocJPEG's completion semantics and either remove the ineffective scaffolding or add the correct synchronization.

Testing

No tests added, and the existing GPU test will fail on ROCm. test_decode_jpegs_cuda (test/test_image.py) is gated only by @needs_cuda, which is true under ROCm (torch.cuda.is_available()), and it parametrizes over mode in {UNCHANGED, GRAY, RGB}. This backend only supports RGB; GRAY/UNCHANGED hit TORCH_CHECK(false, "mode is not supported for ROCJPEG decoding on GPU") and the test errors. The test also decodes every .jpg asset and compares against CPU — any asset smaller than 64x64 or using 4:1:1 subsampling will trip the new TORCH_CHECKs ("not supported by VCN Hardware"). The PR must either restrict/parametrize these tests for the rocJPEG capability set or add a dedicated rocJPEG test path. Shipping a backend that breaks the existing CUDA test suite on ROCm is a blocking gap.

API Design / Backward Compatibility

Silent capability regression vs. the nvJPEG/CPU path. On CUDA, decode_jpeg(..., device='cuda') supports GRAY/UNCHANGED and arbitrary image sizes (falling back to a non-hardware backend when needed). The rocJPEG path uses ROCJPEG_BACKEND_HARDWARE only, with no software fallback, and hard-errors on images <64px in either dimension and on 4:1:1/unknown subsampling. The same user code that works on CUDA/CPU will raise on ROCm for valid JPEGs. At minimum document this; ideally fall back to a non-hardware backend or to CPU decode for unsupported inputs.
Returned tensors are non-contiguous strided views, unlike nvJPEG. The output tensor is allocated at 16-aligned height/width and then narrow-ed back to the true dimensions, so for widths not divisible by 16 the returned tensor has a row stride > width. The nvJPEG path returns exact-size contiguous tensors. Downstream code or tests that assume .is_contiguous() will behave differently across backends. Consider returning .contiguous() for parity.

setup.py

USE_ROCJPEG defaults to "1", so it is enabled on every build including NVIDIA/CPU. The nvjpeg_found and USE_NVJPEG branch is checked first so an NVIDIA box still builds nvJPEG, and rocjpeg_found requires ROCM_HOME + the header, so this is functionally fine — but the combined warning "Building torchvision without NVJPEG or ROCJPEG support" will now fire on plain CUDA builds where only nvJPEG was ever expected. Minor, but worth confirming the messaging is intentional.

Recommendation

Request Changes. Blocking: (1) exit(1) in the CHECK_* macros must become TORCH_CHECK; (2) the stream/event synchronization is ineffective as wired — confirm rocJPEG sync semantics and either remove it or synchronize correctly; (3) the existing CUDA test suite breaks on ROCm and no rocJPEG-appropriate tests were added. The dead code, stray std::cout, copy-pasted error strings, and the documented capability/contiguity differences should also be addressed before merge.

🤖 Generated with Claude Code

rrawther

@xytpai: I have added some review comments for this PR. Also, please address the comments from @jeffdaily too.

pytorch-bot · 2026-06-13T14:57:30Z

The following ciflow label(s) have been added but CI has not been triggered yet because the workflows are awaiting approval:

ciflow/rocm

Once a maintainer approves the workflows (scroll to the bottom of the PR page), the corresponding CI jobs will be triggered automatically. Please ping one of the reviewers if you do not have access to approve and run workflows.

jeffdaily

Correctness

decode_jpegs_cuda.cpp (~720-770): buffer sizing and decode format disagree for UNCHANGED + 4:2:2 color. In ROCJPEG_OUTPUT_NATIVE, num_channels/pitch are computed from the source chroma subsampling, but image_output_format is then overridden to Y/RGB_PLANAR based on num_components. For a color image (num_components == 3) with ROCJPEG_CSS_422, the switch sets num_channels = 1, pitch = align_up(width*2, 16), yet the decode is issued as RGB_PLANAR, which needs three full-resolution width-byte planes:
- Buffer is one plane; the for (c < num_channels) loop sets only channel[0], leaving channel[1]/channel[2] null, so rocJpegDecodeBatched writes RGB planes through null pointers (HIP fault / OOB).
- Even absent the fault, the returned tensor is [1, h, w] instead of [3, h, w].
The native-subsampling switch describes packed-YUV layouts that are never decoded (the code always remaps NATIVE to Y/RGB_PLANAR). Determine image_output_format first, then size from it: Y -> 1 channel, RGB_PLANAR -> 3 channels, both pitch = align_up(width, 16). Delete the inner subsampling switch (including the CSS_422 width*2 pitch). CI misses this because every tested asset is 4:2:0 or grayscale; 4:2:2 is a common encoder default.

Testing

No coverage for rocJPEG-specific paths. test_decode_jpegs_cuda only decodes 4:2:0/grayscale assets, so the 4:2:2 UNCHANGED bug, the 64x64 minimum, and the 4:1:1/unknown rejection are all unexercised on ROCm. Add a 4:2:2 color asset (or a dedicated rocJPEG test).

API Design / Backward Compatibility

Capability regression coupled to the shared test. The rocJPEG path is ROCJPEG_BACKEND_HARDWARE only and hard-errors on <64px images and 4:1:1/unknown subsampling, with no fallback; the same decode_jpeg(..., device="cuda") that works on CUDA/CPU raises on ROCm. test_decode_jpegs_cuda globs all non-damaged .jpg assets, so any future sub-64px or 4:1:1 asset breaks ROCm CI. Prefer a fallback or capability skip over a hard TORCH_CHECK; at minimum document it.

Code Quality

decode_jpegs_cuda.cpp:683,770 -- source_channels is write-only. Assigned source_channels[i] = num_components, never read. Remove it (and the num_components capture if unused after the fix).
TORCH_CHECK among STD_TORCH_CHECK. The two format-switch default: branches use TORCH_CHECK(false, ...) while the rest of the new code uses STD_TORCH_CHECK. Make them consistent.

Thread Safety (low confidence -- confirm)

Trailing .contiguous() loop assumes rocJpegDecodeBatched is host-synchronous. Output tensors are narrow-ed views copied via .contiguous() on the caller's current stream with no stream/event dependency on the decode. Safe only if rocJpegDecodeBatched blocks the host until device writes complete; please confirm, otherwise the copy can race the decode.

Recommendation

Request Changes. Blocking: the 4:2:2-color UNCHANGED buffer/format mismatch and the missing test coverage that lets it ship undetected. The dead source_channels, the TORCH_CHECK inconsistency, the capability-regression coupling, and the host-sync confirmation should also be addressed.

🤖 Generated with Claude Code

xytpai · 2026-06-18T17:18:18Z

rocJPEG currently uses ROCJPEG_BACKEND_HARDWARE, which has stricter capability limits than the CPU/nvJPEG path: it rejects images smaller than 64x64 and unsupported chroma subsampling such as 4:1:1/unknown. If rocJPEG hardware doesn't support the input, fallback to CPU decode and copy result to CUDA.

NicolasHug

Thank folks, I made a quick pass. Noting that we're in the process of migrating the torchvision codebase to torch stable ABI, which might require updates to this PR as well

NicolasHug · 2026-06-18T19:34:01Z

@@ -1,5 +1,5 @@
 #include "decode_jpegs_cuda.h"
-#if !NVJPEG_FOUND
+#if !NVJPEG_FOUND && !ROCJPEG_FOUND


in this Cpp file and in the header, do we need to have both the nvjpeg and rocm implementation together?

Unless they're sharing a lot of code (which they don't seem to, but I may have missed it), then let's keep them in separate files. If there are common stateless utilities they can go in a common file, otherwise let's not touch the nvjpeg implementation please.

I kept them together because CUDA and ROCm builds compile only one backend via NVJPEG_FOUND / ROCJPEG_FOUND, and the shared code is the entry-point validation, device guard, mutex, and decoder lifetime handling. The backend-specific implementations are still isolated by preprocessor guards. I can split rocJPEG out if you strongly prefer, but my preference is to avoid extra build routing and duplicated entry-point logic for now.

yes, I think I'd prefer having the ROCm code in a ROCm specific file. The common decode_jpegs_cuda() can be moved to a common file too.

NicolasHug · 2026-06-18T19:46:27Z

+  }
+}
+
+void RocJpegDecoder::ensure_stream_handles(std::size_t num_handles) {


can you explain what this does? Does this "ensure the stream handles" or does it create them?

This helper reuses the cached rocJPEG stream handles and only creates the missing ones when the requested batch size grows. I renamed it to ensure_stream_handle_count and added a comment to make that behavior clearer.

NicolasHug · 2026-06-18T19:53:58Z

+        {int64_t(num_channels),
+         int64_t(align_up(height, kRocJpegPitchAlignment)),
+         int64_t(pitch)},
+        torch::dtype(torch::kU8).device(target_device));


what's the memory ownership story for buffer and output_tensors[i]? How do they know to free the underlying rocm memory when they go out of scope? How do we ensure the the underlying memory is never freed as long as output_tensors[i] is alive?

buffer is a torch::Tensor, and output_tensors[i] is a view created from it via narrow(). PyTorch tensor views share the same underlying Storage, so the returned view keeps the storage alive even after the local buffer variable goes out of scope.

output_images[i] only holds raw pointers temporarily for the rocJpegDecodeBatched call. Since rocJPEG has completed the writes before the function returns, it does not own or retain those pointers afterwards. The actual memory lifetime is owned by the returned tensors.

NicolasHug · 2026-06-18T21:05:24Z

Also what's the testing situation here? We don't have Rocm test runners in torchvision I believe. Is AMD testing these out-of-core somehow?

xytpai · 2026-06-22T06:11:59Z

Also what's the testing situation here? We don't have Rocm test runners in torchvision I believe. Is AMD testing these out-of-core somehow?

Yes, torchvision CI does not currently have ROCm image decode runners, so this path is not covered by upstream CI yet.

On the AMD side, we are validating this out-of-core on ROCm hardware with the existing test_decode_jpegs_cuda coverage, including UNCHANGED, GRAY, and RGB modes, and comparing against the CPU decode outputs.

NicolasHug · 2026-06-22T12:53:36Z

@@ -1,5 +1,5 @@
 #include "decode_jpegs_cuda.h"
-#if !NVJPEG_FOUND
+#if !NVJPEG_FOUND && !ROCJPEG_FOUND


yes, I think I'd prefer having the ROCm code in a ROCm specific file. The common decode_jpegs_cuda() can be moved to a common file too.

NicolasHug · 2026-06-22T13:05:39Z

        else:
+            warnings.warn("Building torchvision without ROCJPEG support")
+    else:
+        if USE_NVJPEG and (torch.cuda.is_available() or FORCE_CUDA):


Can we just leave the previous if USE_NVJPEG and (torch.cuda.is_available() or FORCE_CUDA): block exactly like it was, and just have a separate (indepentent) ROCm-specific block below it? They should be mutually exclusive?

I don't think we can rely on USE_NVJPEG and USE_ROCJPEG being mutually exclusive because both default to true. On ROCm builds, the nvJPEG block would still run and warn unless it is guarded by not IS_ROCM. I think the mutual exclusion should be based on the backend (not IS_ROCM for nvJPEG, IS_ROCM for rocJPEG), not on the two USE flags.

* Update decode_jpegs_cuda.cpp * Update decode_jpegs_cuda.h * split code

NicolasHug · 2026-06-24T08:51:08Z

Thanks for the PR @xytpai , this LGTM so far. Just giving you a heads up, before moving forward I'd like to:

get a small ROCm unit-test CI job within the torchvision repo so that this can be tested here. I opened Add unittest rocm job #9527 for that, I'm currently waiting on getting permission.
port the CUDA jpeg decoder to the stable ABI. We started some effort to port the whole torchvision codebase to rely on the torch stable ABI (Stand up dual stable-ABI build harness and port nms as the first op. #9524) - I'd like to port the CUDA jpeg decoder next so that we can the update this ROCm decoder PR to directly rely on the stable ABI (rather than having to port it later) - CC @adabeyta

xytpai added 6 commits January 8, 2026 03:18

add rocjpeg support

f4fc92e

update rocjpeg utils

a371c3e

rm cout

e4c4fd0

refine code

3d9041c

rm unused file

1d29986

refine code 2

15d8f11

meta-cla Bot added the cla signed label Jan 26, 2026

xytpai and others added 3 commits January 27, 2026 12:14

Merge branch 'main' into xyt/rocjpeg_upstream

fb2f9fc

Merge branch 'main' into xyt/rocjpeg_upstream

bc8c702

Merge branch 'main' into xyt/rocjpeg_upstream

173c23d

xytpai changed the title ~~Add rocjpeg support for image decoding~~ [ROCM] Add rocjpeg support for GPU image decoding (RGB output) Jan 30, 2026

pytorch-bot Bot added the module: rocm label Jan 30, 2026

Merge branch 'main' into xyt/rocjpeg_upstream

09c589d

wuhuikx mentioned this pull request Feb 9, 2026

[Tracking][Performance][AMD] Qwen3 & Qwen3-VL Latency Optimization on AMD MI300 Series GPUs sgl-project/sglang#18466

Closed

jeffdaily suggested changes Jun 5, 2026

View reviewed changes

rrawther reviewed Jun 5, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp

rrawther reviewed Jun 5, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

rrawther reviewed Jun 5, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

rrawther reviewed Jun 5, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

rrawther suggested changes Jun 5, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

Merge branch 'main' into xyt/rocjpeg_upstream

8d4f6ff

pytorch-bot Bot added the ciflow/rocm label Jun 12, 2026

full format support

b68f0ef

jeffdaily suggested changes Jun 18, 2026

View reviewed changes

rrawther reviewed Jun 18, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

rrawther suggested changes Jun 18, 2026

View reviewed changes

resolve comments

248894c

xytpai requested review from jeffdaily and rrawther June 18, 2026 17:20

apply clang-format

802cac2

jeffdaily approved these changes Jun 18, 2026

View reviewed changes

NicolasHug reviewed Jun 18, 2026

View reviewed changes

xytpai added 3 commits June 19, 2026 03:12

Separate rocJPEG and nvJPEG setup blocks

d942228

add _ suffix for private class members

7581393

just return padded tensor in its original layout

a4073b0

AryanSalmanpour suggested changes Jun 19, 2026

View reviewed changes

Comment thread torchvision/csrc/io/image/cuda/decode_jpegs_cuda.cpp Outdated

xytpai added 3 commits June 22, 2026 05:30

rm unnecessary sync

a2572c8

refine code

b413e54

add rocjpeg doc link

0fe060a

xytpai requested review from AryanSalmanpour, NicolasHug and jeffdaily June 22, 2026 06:13

NicolasHug reviewed Jun 22, 2026

View reviewed changes

xytpai added 3 commits June 23, 2026 06:30

refine code

2e4047d

Split rocjpeg code (#3)

156b4ec

* Update decode_jpegs_cuda.cpp * Update decode_jpegs_cuda.h * split code

Merge branch 'main' into xyt/rocjpeg_upstream

be0b6e5

xytpai requested a review from NicolasHug June 23, 2026 07:31

This comment was marked as spam.

Sign in to view

Uh oh!

Conversation

xytpai commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Note

Uh oh!

pytorch-bot Bot commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9342

❗ 1 Active SEVs

Uh oh!

meta-cla Bot commented Jan 16, 2026

Action Required

Process

Uh oh!

xytpai commented Jan 30, 2026

Uh oh!

NicolasHug commented Feb 9, 2026

Uh oh!

xytpai commented Mar 18, 2026

Uh oh!

xytpai commented Apr 15, 2026

Uh oh!

jeffdaily left a comment

Choose a reason for hiding this comment

Review — [ROCM] Add rocjpeg support for GPU image decoding (RGB output)

Summary

Code Quality

Correctness / Thread Safety

Testing

API Design / Backward Compatibility

setup.py

Recommendation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rrawther left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

pytorch-bot Bot commented Jun 13, 2026

Uh oh!

jeffdaily left a comment

Choose a reason for hiding this comment

Correctness

Testing

API Design / Backward Compatibility

Code Quality

Thread Safety (low confidence -- confirm)

Recommendation

Uh oh!

Uh oh!

xytpai commented Jun 18, 2026

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

xytpai commented Jan 16, 2026 •

edited

Loading

pytorch-bot Bot commented Jan 16, 2026 •

edited

Loading