Is restrict(amp) more restrictive than CUDA kernel code?

In C++ AMP, kernel functions or lambdas are marked with `restrict(amp)`, which places strict limits on the allowed subset of C++. Does CUDA allow more freedom over the subset of C or C++ permitted in kernel functions?

As of Visual Studio 11 and CUDA 4.1, `restrict(amp)` functions are more restrictive than CUDA's analogous `__device__` functions. Most notably, AMP is more restrictive about how pointers may be used. This is a natural consequence of AMP's DirectX 11 compute substrate, which prohibits pointers in HLSL (graphics shader) code. In contrast, CUDA's low-level IR is PTX, which is more general than HLSL.
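As a point of reference, here is a minimal, illustrative sketch of how kernel code is marked in C++ AMP, assuming the `<amp.h>` header and `concurrency` namespace that ship with Visual Studio 11. The function names (`square`, `square_all`) are made up for the example; note that a helper called from amp-restricted code must itself carry `restrict(amp)`.

```cpp
#include <amp.h>
#include <vector>
using namespace concurrency;

// Any function called from an amp-restricted context must itself be
// marked restrict(amp).
float square(float x) restrict(amp) { return x * x; }

void square_all(std::vector<float>& data) {
    array_view<float, 1> av(static_cast<int>(data.size()), data);

    // The kernel is a lambda marked restrict(amp); it may capture
    // array_views by value, but not raw pointers or references.
    parallel_for_each(av.extent, [=](index<1> idx) restrict(amp) {
        av[idx] = square(av[idx]);
    });

    av.synchronize();   // copy results back to the host vector
}
```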


Here is a line-by-line comparison (a short tiled-kernel sketch follows the table to illustrate the tile_static row):

| VS 11 AMP `restrict(amp)` functions | CUDA 4.1 sm_2x `__device__` functions |
|---|---|
| Can only call functions that have the `restrict(amp)` clause | Can only call functions that have the `__device__` decoration |
| The function must be inlinable | Need not be inlined |
| The function can declare only POD variables | Class types are allowed |
| Lambda functions cannot capture by reference and cannot capture pointers | Lambdas are not supported, but user functors can hold pointers |
| References and single-indirection pointers are supported only as local variables and function parameters | References and multiple-indirection pointers are supported |
| No recursion | Recursion OK |
| No volatile variables | Volatile variables OK |
| No virtual functions | Virtual functions OK |
| No pointers to functions | Pointers to functions OK |
| No pointers to member functions | Pointers to member functions OK |
| No pointers in structures | Pointers in structures OK |
| No pointers to pointers | Pointers to pointers OK |
| No `goto` statements | `goto` statements OK |
| No labeled statements | Labeled statements OK |
| No `try`, `catch`, or `throw` statements | No `try`, `catch`, or `throw` statements |
| No global variables | Global `__device__` variables OK |
| Static variables through `tile_static` | Static variables through `__shared__` |
| No `dynamic_cast` | No `dynamic_cast` |
| No `typeid` operator | No `typeid` operator |
| No `asm` declarations | `asm` declarations (inline PTX) OK |
| No varargs | No varargs |
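To make the `tile_static` row above concrete, here is a hedged sketch of a tiled C++ AMP kernel: `tile_static` plays the role that `__shared__` plays in a CUDA kernel, and `tidx.barrier.wait()` corresponds to `__syncthreads()`. The tile size and names (`TS`, `cache`, `tiled_partial_sums`) are illustrative, and the sketch assumes the input extent is a multiple of the tile size.

```cpp
#include <amp.h>
using namespace concurrency;

static const int TS = 256;  // illustrative tile size

// Sums 'av' in tiles of TS elements, writing one partial sum per tile
// into 'partial'. Assumes av.extent[0] is a multiple of TS.
// tile_static memory is the AMP counterpart of CUDA's __shared__;
// tidx.barrier.wait() corresponds to __syncthreads().
void tiled_partial_sums(const array_view<const float, 1>& av,
                        const array_view<float, 1>& partial) {
    parallel_for_each(av.extent.tile<TS>(),
                      [=](tiled_index<TS> tidx) restrict(amp) {
        tile_static float cache[TS];          // per-tile scratch memory
        cache[tidx.local[0]] = av[tidx.global];
        tidx.barrier.wait();                  // all lanes in the tile sync here

        // Simple tree reduction within the tile.
        for (int stride = TS / 2; stride > 0; stride /= 2) {
            if (tidx.local[0] < stride)
                cache[tidx.local[0]] += cache[tidx.local[0] + stride];
            tidx.barrier.wait();
        }
        if (tidx.local[0] == 0)
            partial[tidx.tile[0]] = cache[0];
    });
}
```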

You can read more about the `restrict(amp)` restrictions in Microsoft's C++ AMP documentation. For CUDA, the C++ support available in `__device__` code is described in Appendix D of the CUDA C Programming Guide.

Comments:

"Good question, though I am afraid it is not really comparable (perhaps migrate to Programmers.SE?): nvcc does not support C++11 at all, so you obviously will not get very far where lambdas are concerned! On the other hand, AMP has a completely different set of restrictions, starting with being Microsoft-only; this (or, more precisely, the current lack of non-DirectX implementations) makes it entirely unusable for many applications, for example scientific ones. But I suppose you are only talking about language restrictions?"

"@leftaroundabout: yes, I am only talking about language restrictions, and I can stay within C++03. I mentioned lambdas only because, IIRC, they are the prescribed mechanism for launching kernel code in C++ AMP."

"What is discussed here are features that C++ AMP could have enabled but did not, sometimes as a deliberate choice to encourage good practice in parallel computing."