CUDA: How do I correctly implement an inline device function that returns a vector to another device function?

I want to correctly implement an inline device function that fills a dynamically sized vector and returns the filled vector, like this:

__device__  inline   thrust::device_vector<double> make_array(double zeta, int l)
{
  thrust::device_vector<double> ret;
  int N =(int)(5*l+zeta); //the size of the array  will depend on l and zeta, in a complex way...
  // Make sure of sufficient memory allocation
  ret.reserve(N);
  // Resize array
  ret.resize(N);
  //fill it:
  //for(int i=0;i<N;i++)
  // ...;
  return ret;
}

thrust::device_vector cannot be used in device code.
However, you can return a pointer to dynamically allocated memory, like this:
#include <assert.h>

template <typename T>
__device__  T* make_array(T zeta, int l)
{
  int N =(int)(5*l+zeta); //the size of the array  will depend on l and zeta, in a complex way...
  T *ret = (T *)malloc(N*sizeof(T));
  assert(ret != NULL);  // error checking

  //fill it:
  //for(int i=0;i<N;i++)
  // ret[i] = ...;
  return ret;
}
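
To show how another device function might consume the returned pointer, here is a minimal sketch. It is not part of the original answer: the extra `n_out` parameter, the placeholder fill rule `ret[i] = i`, and the names `sum_array`, `sum_kernel` and `sum_on_device` are all assumptions made for illustration.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Allocates and fills an array on the device heap; the caller owns the result.
// n_out (hypothetical) reports the length, since a raw pointer carries no size.
template <typename T>
__device__ T* make_array(T zeta, int l, int* n_out)
{
  int N = (int)(5 * l + zeta);
  T* ret = (T*)malloc(N * sizeof(T));
  assert(ret != NULL);  // in-kernel malloc returns NULL when the heap is exhausted
  for (int i = 0; i < N; i++)
    ret[i] = (T)i;      // placeholder fill rule, assumed for this sketch
  *n_out = N;
  return ret;
}

// Hypothetical consumer: builds the array, sums it, then releases it.
__device__ double sum_array(double zeta, int l)
{
  int n = 0;
  double* a = make_array<double>(zeta, l, &n);
  double s = 0.0;
  for (int i = 0; i < n; i++)
    s += a[i];
  free(a);              // device-side free, matching the device-side malloc
  return s;
}

__global__ void sum_kernel(double zeta, int l, double* out)
{
  *out = sum_array(zeta, l);
}

// Host wrapper: launches a single thread and copies the result back.
// Returns -1.0 if a CUDA call fails.
double sum_on_device(double zeta, int l)
{
  double* d_out = nullptr;
  double h_out = -1.0;
  if (cudaMalloc(&d_out, sizeof(double)) != cudaSuccess) return -1.0;
  sum_kernel<<<1, 1>>>(zeta, l, d_out);
  // cudaMemcpy on the default stream waits for the kernel to finish first.
  cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(double), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return err == cudaSuccess ? h_out : -1.0;
}
```

Calling `sum_on_device(2.5, 3)` from `main` allocates 17 elements (the integer part of 5*3 + 2.5) in the single device thread. Returning the length next to the pointer is one possible design; the caller could also recompute N with the same formula.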

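Once the computation is finished, is it possible to free the memory allocated for the array ret from another device function?
Yes, you can use free in kernel code, as long as you take appropriate care when using it. This is covered in the documentation.
Thank you very much for your suggestions.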
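
As a rough illustration of what "appropriate care" can involve, the sketch below frees the array in a separate device function and enlarges the device heap from the host. The names `release_array`, `alloc_use_free` and `count_failed_allocations` are hypothetical, and the heap size is an arbitrary choice. It relies on documented behavior: in-kernel malloc draws from a fixed-size device heap (8 MB by default), returns NULL on failure, and its memory must be released with in-kernel free, not with cudaFree.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: frees a device-heap array from a separate device
// function and clears the caller's pointer so it is not freed twice.
__device__ void release_array(double** p)
{
  free(*p);   // free(NULL) is ignored; freeing the same pointer twice is undefined
  *p = NULL;
}

// Every thread allocates its own array, fills it, and releases it.
// A failed allocation is counted instead of being dereferenced.
__global__ void alloc_use_free(int n, int* failures)
{
  double* a = (double*)malloc(n * sizeof(double));
  if (a == NULL) {
    atomicAdd(failures, 1);
    return;
  }
  for (int i = 0; i < n; i++)
    a[i] = (double)i;
  release_array(&a);
}

// Host wrapper: returns the number of failed allocations, or -1 on a CUDA error.
int count_failed_allocations(int blocks, int threads, int n, size_t heap_bytes)
{
  // The heap limit has to be set before the first launch of a kernel
  // that uses in-kernel malloc or free.
  if (cudaDeviceSetLimit(cudaLimitMallocHeapSize, heap_bytes) != cudaSuccess)
    return -1;
  int* d_fail = nullptr;
  int h_fail = -1;
  if (cudaMalloc(&d_fail, sizeof(int)) != cudaSuccess) return -1;
  cudaMemset(d_fail, 0, sizeof(int));
  alloc_use_free<<<blocks, threads>>>(n, d_fail);
  cudaError_t err = cudaMemcpy(&h_fail, d_fail, sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_fail);
  return err == cudaSuccess ? h_fail : -1;
}
```

For example, `count_failed_allocations(4, 64, 256, 16 * 1024 * 1024)` requests about 0.5 MB in total across 256 threads from a 16 MB heap. Counting failures instead of asserting is a design choice of this sketch, so that an exhausted heap can be reported back to the host.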