CUDA error with a large number of global variables

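__device__ static char Tc0[] = {'0','\0'};
__device__ static char Tc1000[] = {'1','0','0','0','\0'};
__device__ static char Tc1000th[] = {'1','0','0','0','t','h','\0'};
__device__ static char Tc100[] = {'1','0','0','\0'};
__device__ static char Tc100th[] = {'1','0','0','t','h','\0'};

followed by more than 20,000 similar lines

__device__ static char Tczymolytic[] = {'z','y','m','o','l','y','t','i','c','\0'};
__device__ static char Tczymotic[] = {'z','y','m','o','t','i','c','\0'};

int main()
{
}

Compiled with:

nvcc ./test2.cu

Apart from a large number of unused-variable warnings, this produces the following error:

ptxas error   : File uses too much global constant data (0x29e58 bytes, 0x10000 max)

Why does CUDA use constant memory here? Is it possible to fix this?

Following @talonmies's suggestion, compiling with the following command works:

nvcc -w -std=c++11 -arch=sm_52 -cubin ./test2.cu

Which CUDA version are you using?

nvcc --version reports:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17

The key option here is -arch=sm_52. What you are doing is legal and should work.

However, on the now-deprecated Fermi architectures (sm_20 and sm_21), the assembler appears to try to place the initialization values of statically defined and initialized __device__ variables into constant memory, which has a 64 KB size limit. On newer, supported architectures this does not happen.

Because you are using CUDA 7.5, whose default compilation target is sm_20, compilation will fail once the size of these symbols exceeds 64 KB, unless you specify an architecture for which the assembler emits static __device__ declarations into global memory.

For example, in the session below compilation fails only for the compute capability 2.x target. For the higher compute capability target, the assembler happily emits 800 KB of static global memory symbols:
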
$ cat make_silly.py
for i in range(0,100000):
    print "__device__ static char tx%05d[] = {'0','1','2','3','5','6','7','8'};"%i

print ""
print "int main() { return 0; }"

$ python make_silly.py > make_silly.cu

$ tail -20 make_silly.cu
__device__ static char tx99982[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99983[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99984[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99985[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99986[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99987[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99988[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99989[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99990[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99991[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99992[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99993[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99994[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99995[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99996[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99997[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99998[] = {'0','1','2','3','5','6','7','8'};
__device__ static char tx99999[] = {'0','1','2','3','5','6','7','8'};

int main() { return 0; }

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17

$ nvcc -w -std=c++11 -arch=sm_30 -Xptxas="-v --disable-optimizer-constants" -cubin make_silly.cu 
ptxas info    : 800000 bytes gmem

$ nvcc -w -std=c++11 -arch=sm_20 -Xptxas="-v --disable-optimizer-constants" -cubin make_silly.cu 
ptxas error   : File uses too much global constant data (0xc3500 bytes, 0x10000 max)
ptxas info    : 800000 bytes gmem, 800000 bytes cmem[14]
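
As a quick sanity check on the numbers in the ptxas messages, the sizes are printed in hexadecimal and can be decoded and compared with the 64 KB limit. The following is only a sketch in Python 3; the helper name `describe` and the constant names are hypothetical and not part of any CUDA tool.

```python
# Byte counts copied from the ptxas messages quoted in this thread.
CONSTANT_LIMIT = 0x10000   # "0x10000 max": the 64 KB constant-data limit
QUESTION_DATA = 0x29e58    # size reported for test2.cu in the question
EXAMPLE_DATA = 0xc3500     # size reported for make_silly.cu with -arch=sm_20


def describe(label, size, limit=CONSTANT_LIMIT):
    """Summarize how a data size compares with the constant-data limit."""
    ratio = size / limit
    return f"{label}: {size} bytes, {ratio:.2f}x the {limit}-byte limit"


# make_silly.cu declares 100000 arrays of 8 chars each.
expected_example_size = 100000 * 8

print(describe("test2.cu", QUESTION_DATA))
print(describe("make_silly.cu", EXAMPLE_DATA))
print("matches 100000 * 8:", EXAMPLE_DATA == expected_example_size)
```

Decoded this way, 0xc3500 is exactly the 800000 bytes that ptxas reports as gmem, and both files are well over the 65536-byte limit that applies when the initializers are placed in constant memory.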