
CUDA out-of-resources error when running Python NumbaPro


I am trying to run a CUDA kernel with NumbaPro in Python, but I keep getting an out-of-resources error. I then tried executing the kernel in a loop and sending smaller arrays, but that still gives me the same error.

Here is my error message:

Traceback (most recent call last):
  File "./predict.py", line 418, in <module>
    predict[griddim, blockdim, stream](callResult_d, catCount, numWords, counts_d, indptr_d, indices_d, probtcArray_d, priorC_d)
  File "/home/mhagen/Developer/anaconda/lib/python2.7/site-packages/numba/cuda/compiler.py", line 228, in __call__
    sharedmem=self.sharedmem)
  File "/home/mhagen/Developer/anaconda/lib/python2.7/site-packages/numba/cuda/compiler.py", line 268, in _kernel_call
    cu_func(*args)
  File "/home/mhagen/Developer/anaconda/lib/python2.7/site-packages/numba/cuda/cudadrv/driver.py", line 1044, in __call__
    self.sharedmem, streamhandle, args)
  File "/home/mhagen/Developer/anaconda/lib/python2.7/site-packages/numba/cuda/cudadrv/driver.py", line 1088, in launch_kernel
    None)
  File "/home/mhagen/Developer/anaconda/lib/python2.7/site-packages/numba/cuda/cudadrv/driver.py", line 215, in safe_cuda_api_call
    self._check_error(fname, retcode)
  File "/home/mhagen/Developer/anaconda/lib/python2.7/site-packages/numba/cuda/cudadrv/driver.py", line 245, in _check_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: Call to cuLaunchKernel results in CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

I found what was happening in the CUDA program.

When predict is called, blockdim is set to 1024:

predict[griddim, blockdim, stream](callResult_d, catCount, numWords, counts_d, indptr_d, indices_d, probtcArray_d, priorC_d)

But the routine is called iteratively with a slice size of 1000 elements rather than 1024, so inside the kernel it ends up trying to write to 24 elements beyond the bounds of the result array.
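The arithmetic behind those 24 stray threads can be checked in plain Python (a sketch; the griddim formula is the usual ceiling-division launch pattern and is assumed, only blockdim=1024 and the 1000-element slice come from the question):

```python
blockdim = 1024   # threads per block, as set in the question
n_el = 1000       # elements in each slice actually passed to the kernel

# Ceiling division: number of blocks needed to cover the slice.
griddim = (n_el + blockdim - 1) // blockdim

total_threads = griddim * blockdim     # threads actually launched
out_of_bounds = total_threads - n_el   # threads with no valid element

print(griddim, total_threads, out_of_bounds)  # 1 1024 24
```

Those 24 threads have indices 1000..1023 and, without a guard, write past the end of the 1000-element result array.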

Passing a number-of-elements parameter (n_el) and adding a bounds check inside the CUDA kernel solves the problem:

@cuda.jit(argtypes=(double[:], int64, int64, int64, double[:], int64[:], int64[:], double[:,:], double[:]))
def predict(callResult, n_el, catCount, wordCount, counts, indptr, indices, probtcArray, priorC):
    i = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
    if i < n_el:
        ....
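A CPU-side sketch of what the guard does (predict_sim is a hypothetical stand-in for the real GPU kernel; the per-element work is elided in the answer, so a dummy write is used here):

```python
import numpy as np

def predict_sim(callResult, n_el, total_threads):
    # Simulate every launched thread's flattened index:
    # i = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
    for i in range(total_threads):
        if i < n_el:              # the bounds check from the answer
            callResult[i] = 1.0   # stand-in for the real per-element computation
        # threads with i >= n_el (here 1000..1023) do nothing
    return callResult

# 1000-element slice, but a full 1024-thread block launched:
out = predict_sim(np.zeros(1000), n_el=1000, total_threads=1024)
```

Every write stays inside the 1000-element array; without the `if i < n_el` guard, the loop body would raise an IndexError here, just as the GPU kernel faulted with out-of-bounds writes.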
Shouldn't this be reported as a bug to the vendor of that product?