Problem
Custom CUDA written in `cupy` enables 3-10x faster computation compared to native `pytorch`.
For example, using CUDA with the widget, it takes roughly 1 s to load all diffraction patterns from disk to VRAM and visualize them within a Jupyter notebook:

Proposed solution
Support `cupy` as an optional dependency.
For existing functions and files, we can stick to `pytorch` for internal compute and `numpy` for user-facing APIs.
For the `hpc` and `widget` modules, having `cupy` can be beneficial: it saves human time and enables labs with NVIDIA GPUs (a.k.a. the mallard ophus group) to make full use of qunatem and their powerful hardware.
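A common pattern for an optional GPU dependency is a guarded import with a `numpy` fallback, so user-facing code works with or without `cupy` installed. A minimal sketch (the helper name `to_device` and the module layout are hypothetical, not part of the current codebase):

```python
# Guarded import: prefer cupy (GPU) when available, fall back to numpy (CPU).
try:
    import cupy as xp
    HAS_CUPY = True
except ImportError:
    import numpy as xp
    HAS_CUPY = False


def to_device(data):
    """Convert input to an array on the active backend.

    With cupy installed this places the data in VRAM; otherwise it
    returns a plain numpy array, so callers never need to branch.
    """
    return xp.asarray(data)
```

Because both libraries share most of the `numpy` API, internal functions can be written once against `xp` and run on either backend.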