Skip to content

Add GPU Functionality to the dev Branch#440

Draft
braxtoncuneo wants to merge 11 commits into
mcdc-project:devfrom
braxtoncuneo:gpu-patch
Draft

Add GPU Functionality to the dev Branch#440
braxtoncuneo wants to merge 11 commits into
mcdc-project:devfrom
braxtoncuneo:gpu-patch

Conversation

@braxtoncuneo

@braxtoncuneo braxtoncuneo commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Summary of changes

This PR adds GPU execution to the dev branch. This was accomplished by re-implementing some functionality in a manner compatible with execution through numba-hip and numba-cuda. This includes changes to how gpu state objects are managed, the introduction of helper intrinsics/functions to handle returning arrays, and replacement of builtin functions that are not supported on one or more gpu plaltforms (e.g. powers for complex numbers, numpy polynomial solvers).

Types of changes

New feature: Helper functions/intrinsics for returning arrays on GPU

As has been mentioned in previous issues/PRs, numba-hip and numba-cuda do not permit returning arrays from functions. In brief, this is because memory management on GPU is hard, and the simplest way to create an array as a local variable is to allocate it on the stack. Once the array that created a local array returns, that array no longer "exists" in a way that it can be safely used.

This is a problem, since the helper functions provided by numba_objects_generator.py return references to and slices of arrays in the mcdc global data structure. We'd like to keep those helper functions, considering how much convenience and quality of life they provide to developers. Also, since these global data structures have a lifetime which encompasses the lifetime of all transport logic (i.e. the structures are in memory and safe to use whenever any transport logic is run) it should be safe to ignore this array-returning restriction for these helper functions.

For this specific use case, when we know for certain that an array is safe to use after a function returns, this PR introduces the array_return decorator and the array_result function, which enables the return of array references/slices like so:

# The decorator must be supplied the type of element stored
# by the arrays that are returned by the decorated function.
# 
# An arbitrary number of parameters may be accepted by an
# array-returning function, as long as that number is fixed
# for a given array returning function.
@array_return(nb.types.float64)
def example_function(param_1,param_2,param_3):
    # ...array slicing logic...

    # Returned arrays must always be wrapped in the
    # array_result function.
    return array_result(example_result_array_here)

This works by smuggling a pointer to the array's data and the size of the array out as a tuple, then reconstructing the array based off of the returned tuple in the calling function.

In normal python execution, the array_result function simply returns its input. However, there exists an overload of array_result which will handle the conversion to tuple in compiled contexts.

The array_return decorator registers an intrinsic overload of the decorated function. This intrinsic calls the decorated function and creates a new array based on the returned tuple. Because intrinsics are inlined directly into their calling function, they aren't their own function with a separate stack frame and so are not rejected by the array-returning guards of numba-hip or numba-cuda.

New feature: GPU-compatible quartic solver and complex power calculation

  • Good news: The torus has been added as a new geometric primitive to MC/DC!
  • Bad news: Solving torus/line intersections analytically requires solving quartic functions.
  • Good news: numpy.roots solves polynomials and is implemented by numba in compiled contexts!
  • Bad news: numba-hip and numba-cuda don't implement numpy.roots - likely because the general case does not have analytic solutions and requires the use of arbitrarily large matrixes (determined by the degree of the polynomial).
  • Good news: Quartic functions do have an analytic solution!
  • Bad news: Quartic functions require the calculation of real-valued powers for complex numbers, and numba-hip doesn't implement the intrinsic for powers of complex numbers.
  • Good news: Integer powers of complex numbers can be calculated through iterative multiplication, complex square roots can be solved with square roots of real numbers, and fractional powers of complex numbers can be calculated using de Moivre's Formula!
@njit()
def modulus(x):
    return math.sqrt(x.real**2 + x.imag**2)


@njit()
def sqrt(x):
    return math.sqrt(x.real) * (x + x.real) / modulus(x + x.real)


@njit()
def power(x, n):
    result = 1
    for i in range(n):
        result = result * x
    return result


@njit()
def nth_root(x, n, index):
    # First, convert to polar form
    r = modulus(x)
    a = math.atan2(x.imag, x.real)

    # Apply de Moivre's Formula
    root_modulus = math.pow(r, 1.0 / n)
    root_argument = (a + 2 * math.pi * index) / n

    # ...then convert back to rectangular form
    real = root_modulus * math.cos(root_argument)
    imag = root_modulus * math.sin(root_argument)
    return complex(real, imag)


@njit()
def principal_nth_root(x, n):
    return nth_root(x, n, 0)

New feature: GPU execution

Developer Checklist

  • I have read the contributing guide.
  • My code follows the code style of this project.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests pass

Associated Issues and PRs

  • related to #

Associated Developers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant