Python: programmatically nesting numba.cuda function calls


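Numba and CUDA noob here. I'd like to have one numba.cuda function programmatically call another from the device, without having to pass any data back to the host. For example, given the setup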

from numba import cuda

@cuda.jit('int32(int32)', device=True)
def a(x):
    return x+1

@cuda.jit('int32(int32)', device=True)
def b(x):
    return 2*x

I'd like to be able to define a composition kernel such as

@cuda.jit('void(int32, __device__, int32)')
def b_comp(x, inner, result):
    y = inner(x)
    result = b(y)

and successfully get

b_comp(1, a, result)
assert result == 4

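Ideally, I'd like b_comp to accept different function arguments after compilation [e.g. after the call above, to still accept b_comp(1, b, result)], but a solution where the function arguments become fixed at compile time would still work for me.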
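One workaround sometimes used when function pointers are unavailable is to pass an integer selector and branch inside the kernel. The sketch below is an assumption, not something the question or the numba documentation prescribes. The names add_one, double, select_comp, and run are hypothetical, and the code falls back to plain Python when no CUDA device is found so that the logic can be checked anywhere.

```python
def add_one(x):
    return x + 1

def double(x):
    return 2 * x

try:
    import numpy as np
    from numba import cuda
    HAVE_GPU = cuda.is_available()
except Exception:  # numba/numpy missing or CUDA driver unusable
    HAVE_GPU = False

if HAVE_GPU:
    # Compile the candidates as device functions so the kernel can call them.
    add_one = cuda.jit(device=True)(add_one)
    double = cuda.jit(device=True)(double)

def select_comp(x, which, result):
    # `which` picks the inner function at run time; both branches are compiled.
    if which == 0:
        y = add_one(x)
    else:
        y = double(x)
    result[0] = double(y)

# Compile the kernel once when a GPU is present (hypothetical module-level name).
kernel = cuda.jit(select_comp) if HAVE_GPU else None

def run(x, which):
    """Host wrapper: launch on the GPU if possible, else run the plain Python body."""
    if HAVE_GPU:
        result = np.zeros(1, dtype=np.int64)
        kernel[1, 1](x, which, result)  # one block, one thread
        return int(result[0])
    result = [0]
    select_comp(x, which, result)
    return result[0]

print(run(3, 0))  # 8
print(run(3, 1))  # 12
```

The set of candidate functions is still fixed when the kernel compiles; only the choice among them is deferred to run time.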

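From what I've read, CUDA itself seems to support passing function pointers. An earlier post suggests that numba.cuda has no such support, but that post is not convincing and is already a year old. The numba.cuda documentation page on supported Python does not mention function pointer support. It does, however, link to the corresponding page for numba.jit(), which indicates that numba.jit() does support functions as arguments, although they are fixed at compile time. If numba.cuda.jit() behaves the same way, that would be enough for me. In that case, how should I declare the argument type when specifying the signature for comp? Or could I use numba.cuda.autojit()?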
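A sketch of the "fixed at compile time" variant, assuming numba resolves device functions captured as closure variables when the kernel is compiled: a factory builds one kernel per (inner, outer) pair, so no function type ever appears in a signature. The helper name compose_on_device is hypothetical, signatures are left to numba's lazy compilation, and the plain Python fallback is only there so the sketch can run without a GPU.

```python
def add_one(x):
    return x + 1

def double(x):
    return 2 * x

def compose_on_device(inner, outer):
    """Return a host callable computing outer(inner(x)), on the GPU when possible."""
    try:
        import numpy as np
        from numba import cuda
        use_gpu = cuda.is_available()
    except Exception:  # numba/numpy missing or CUDA driver unusable
        use_gpu = False

    if not use_gpu:
        # Fallback (an assumption for portability): plain Python composition.
        return lambda x: outer(inner(x))

    # Compile the two plain functions as device functions.
    inner_dev = cuda.jit(device=True)(inner)
    outer_dev = cuda.jit(device=True)(outer)

    @cuda.jit
    def comp(x, out):
        # inner_dev and outer_dev are closure variables, bound when comp compiles.
        out[0] = outer_dev(inner_dev(x))

    def run(x):
        out = np.zeros(1, dtype=np.int64)
        comp[1, 1](x, out)  # one block, one thread
        return int(out[0])

    return run

print(compose_on_device(add_one, double)(1))  # 4
```

Each call to the factory compiles a fresh kernel, so a different pair of functions costs a recompilation rather than a change to an existing kernel.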

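If numba does not support any such direct approach, is metaprogramming a reasonable option? For example, once I know the inner function, my script could create a new script containing a Python function that composes those specific functions, apply numba.cuda.jit() to it, and then import the result. It seems convoluted, but it is the only other numba-based option I could think of.

If numba cannot do this at all, or at least not without serious kludgery, I'd be happy with an answer that gives a few details plus a recommendation such as "switch to PyCuda".

Here is what worked for me:

1. Not decorating my functions with cuda.jit initially, so that they still have the __name__ attribute.
2. Getting the __name__ attribute.
3. Applying cuda.jit to my functions now, by calling the decorator directly.
4. Creating the Python source for the composition function in a string and passing it to exec.

Exact code:
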
from numba import cuda
import numpy as np


def a(x):
    return x+1

def b(x):
    return 2*x


# Here, pretend we've been passed the inner function and the outer function as arguments
inner_fun = a
outer_fun = b

# And pretend we have noooooo idea what functions these guys actually point to
inner_name = inner_fun.__name__
outer_name = outer_fun.__name__

# Now manually apply the decorator
a = cuda.jit('int32(int32)', device=True)(a)
b = cuda.jit('int32(int32)', device=True)(b)

# Now construct the definition string for the composition function, and exec it.
exec_string = '@cuda.jit(\'void(int32, int32[:])\')\n' \
              'def custom_comp(x, out_array):\n' \
              '    out_array[0]=' + outer_name + '(' + inner_name + '(x))\n'

exec(exec_string)

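# Match the int32[:] signature explicitly; the default integer dtype is platform dependent.
out_array = np.array([-1], dtype=np.int32)
# Launch with an explicit configuration: one block, one thread.
custom_comp[1, 1](1, out_array)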
print(out_array)

As expected, the output is
[4]
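The exec-based steps above can also be wrapped in a small helper that executes the generated source in its own namespace instead of the module globals. This is only a sketch: build_composition and its device_jit and kernel_jit parameters are hypothetical names, and with both left as None the helper returns a plain Python function, which makes the string generation easy to check without a GPU.

```python
def build_composition(inner, outer, device_jit=None, kernel_jit=None):
    """Generate and exec source for a function computing outer(inner(x))."""
    # Read the names before decorating, as in the steps above.
    inner_name, outer_name = inner.__name__, outer.__name__
    if device_jit is not None:
        inner, outer = device_jit(inner), device_jit(outer)

    source = (
        "def custom_comp(x, out_array):\n"
        f"    out_array[0] = {outer_name}({inner_name}(x))\n"
    )
    # The generated function looks up the two names in this private namespace.
    namespace = {inner_name: inner, outer_name: outer}
    exec(source, namespace)
    composed = namespace["custom_comp"]
    return kernel_jit(composed) if kernel_jit is not None else composed

def a(x):
    return x + 1

def b(x):
    return 2 * x

# Host-only check of the generated function.
host_comp = build_composition(a, b)
out = [-1]
host_comp(1, out)
print(out)  # [4]

# With numba and a CUDA device available, one might instead write (untested assumption):
#   kernel = build_composition(a, b, cuda.jit(device=True), cuda.jit)
#   kernel[1, 1](1, out_array)
```

Two different functions that share a __name__ (for example two lambdas) would collide in the namespace, so this sketch assumes distinct names.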