CUDA issue in Python


To work out the number of threads and blocks and pass them to the train_kernel function, I wrote the following code:

rows = df.shape[0]
thread_ct = (gpu.WARP_SIZE, gpu.WARP_SIZE)
block_ct = map(lambda x: int(math.ceil(float(x) / thread_ct[0])),[rows,ndims])
train_kernel[block_ct, thread_ct](Xg, yg, syn0g, syn1g, iterations)

But when I run it, I get the following error:

griddim must be a sequence of integers

Although you did not say so, you are evidently running this code in Python 3.

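The semantics of map changed between Python 2 and Python 3. In Python 2, map returns a list. In Python 3, it returns an iterator, which is not a sequence, so the kernel launch rejects it as griddim.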
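The difference can be checked without a GPU. The following is a minimal sketch for Python 3; the values of rows, ndims and threads_per_dim are made up for illustration and do not come from the question.

```python
import math
from collections.abc import Iterator, Sequence

# Hypothetical sizes, chosen only for illustration.
rows, ndims = 1000, 20
threads_per_dim = 32

block_ct = map(lambda x: int(math.ceil(float(x) / threads_per_dim)), [rows, ndims])

# In Python 3, map() returns a lazy iterator, not a list.
is_iterator = isinstance(block_ct, Iterator)
is_sequence = isinstance(block_ct, Sequence)
print(type(block_ct).__name__, is_iterator, is_sequence)

# Materializing the iterator gives an actual sequence of integers.
block_list = list(block_ct)
print(block_list)
```

Under these assumed sizes, the map object reports as an iterator and not as a sequence, while the list built from it is an ordinary sequence of integers.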

To fix this, wrap the map call in list:

block_ct = list(map(lambda x: int(math.ceil(float(x) / thread_ct[0])),[rows,ndims]))

Alternatively, you can use a list comprehension and drop both the lambda expression and the map call:

block_ct = [ int(math.ceil(float(x) / thread_ct[0])) for x in [rows,ndims] ]

Either approach produces a list containing the required elements, which should work in the CUDA kernel launch call.
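The same calculation can also be sketched as a small reusable helper. The function name blocks_per_grid is hypothetical, and the sizes below are invented for illustration, with 32 standing in for gpu.WARP_SIZE. Unlike the code in the question, this sketch divides each dimension by its own thread count, which gives the same result when the thread block is square, and it uses integer ceiling division rather than float and math.ceil.

```python
def blocks_per_grid(shape, threads_per_block):
    """Return the number of blocks needed per dimension (ceiling division)."""
    # -(-n // t) is integer ceiling division, so no floats are involved.
    return tuple(-(-n // t) for n, t in zip(shape, threads_per_block))


# Hypothetical sizes for illustration; 32 is an assumed warp size.
thread_ct = (32, 32)
block_ct = blocks_per_grid((1000, 20), thread_ct)
print(block_ct)
```

The returned tuple is a real sequence of integers, so it should be usable in a launch of the form train_kernel[block_ct, thread_ct](...) in the same way as the list versions above.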

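What made you believe that passing a map object would work?

I am new to Python. What alternative do you suggest?

Use a construct similar to the one you were using, adapted for Python 3. map produces an iterator, not a list.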