How to sort GPU data into separate lists owned by CPU objects using Thrust and CUDA?

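I'm new to this, but I'm hoping to find an approach for a parallel sorting scenario. I have a very large list of GPU objects (1 million+), and I'm trying to sort them into different CPU containers, each of which owns a device vector. My idea is to sort the GPU list into the various device vectors owned by the CPU containers.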

class GpuObject
{
    int someData;
    int otherValue;
};

class CpuContainer
{
    thrust::device_vector<GpuObject>* SortedGpuList;
};

for( int i = 0; i<100; i++ )
{
      Containers.push_back(new CpuContainer());
}

thrust::device_vector<GpuObject>* completeGpuList;

__device__ __host__
void sortIntoContainers( .... )
{
    // ... possible to sort completeGpuList into Containers[i].SortedGpuList based on GpuObject.someData ?
}

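How about setting up all the vectors in one larger matrix? The matrix stores the values being sorted, while the other fields stay in the objects. For example, a 50*1M float* matrix. Each vector i then sits at offset "50*i" in that matrix, i.e. at (matrix + 50*i). This is a common way to manage multiple vectors.

You can then use thrust::sort_by_key to sort the elements. Before each sort, use a simple kernel to reset the "keys" matrix to [0, 1, ..., 49, 0, 1, ..., 49]. Then sort the elements with the sort_columns_withIndices kernel below. After sorting, the keys are the indices of the objects.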
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/execution_policy.h>
#include <thrust/functional.h>

// One thread per column: thread i sorts the numRows floats of column i
// (stored contiguously at values + i * numRows) and applies the same
// permutation to that column's integer keys.
extern "C"
__global__ void sort_columns_withIndices(float* values, int* keys, int numRows, int numCols, int descending)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < numCols)
    {
        if (descending > 0) {
            // Largest value first.
            thrust::sort_by_key(thrust::device, values + i * numRows, values + (i + 1) * numRows, keys + i * numRows, thrust::greater<float>());
        } else {
            // Smallest value first.
            thrust::sort_by_key(thrust::device, values + i * numRows, values + (i + 1) * numRows, keys + i * numRows, thrust::less<float>());
        }
    }
}
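
The key reset step described in the answer is not shown in the original code. A minimal sketch of it might look like the following. The names `reset_keys` and `resetKeys` are hypothetical, and the layout is assumed to match the sort kernel, with column i occupying numRows consecutive entries starting at keys + i * numRows.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_vector.h>

// Hypothetical helper (not part of the original answer): writes
// 0, 1, ..., numRows-1 into every column of the key matrix.
// Assumes column i is stored contiguously at keys + i * numRows.
__global__ void reset_keys(int* keys, int numRows, int numCols)
{
    int idx = blockDim.x * blockIdx.x + threadIdx.x;
    if (idx < numRows * numCols)
    {
        // Position of this entry inside its own column.
        keys[idx] = idx % numRows;
    }
}

// Hypothetical host wrapper: launches one thread per key entry and waits
// for the kernel to finish.
void resetKeys(thrust::device_vector<int>& keys, int numRows, int numCols)
{
    int total = numRows * numCols;
    int threads = 256;
    int blocks = (total + threads - 1) / threads;
    reset_keys<<<blocks, threads>>>(thrust::raw_pointer_cast(keys.data()), numRows, numCols);
    cudaDeviceSynchronize();
}
```

With numRows = 50 this produces the [0, 1, ..., 49, 0, 1, ..., 49] pattern from the answer, so after the sort each key tells you which original slot a value came from.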

Yes, this is basically a compressed sparse row format. The CUSP library has GPU implementations of this built in.
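
To make the compressed-sparse-row idea concrete for the original question, here is a rough sketch that keeps a single device vector and computes per-container offsets, instead of giving every container its own device_vector. It assumes that someData already holds the target container index in the range 0..numContainers-1, which the question does not state. The names `GetSomeData` and `groupByContainer` are hypothetical.

```cuda
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/transform.h>
#include <thrust/binary_search.h>
#include <thrust/iterator/counting_iterator.h>

struct GpuObject
{
    int someData;    // assumption: container index, 0..numContainers-1
    int otherValue;
};

// Extracts the sort key from an object.
struct GetSomeData
{
    __host__ __device__
    int operator()(const GpuObject& o) const { return o.someData; }
};

// Hypothetical helper: reorders `objects` so that objects with equal
// someData are contiguous, and returns numContainers + 1 offsets.
// Container c then owns the range [offsets[c], offsets[c + 1]).
thrust::device_vector<int> groupByContainer(thrust::device_vector<GpuObject>& objects,
                                            int numContainers)
{
    // Copy the keys out so the sort works on plain ints.
    thrust::device_vector<int> keys(objects.size());
    thrust::transform(objects.begin(), objects.end(), keys.begin(), GetSomeData());

    // Sort the keys and move the objects along with them.
    thrust::sort_by_key(keys.begin(), keys.end(), objects.begin());

    // For each c in 0..numContainers, find the first position whose key
    // is >= c. These positions are the CSR-style row offsets.
    thrust::device_vector<int> offsets(numContainers + 1);
    thrust::lower_bound(keys.begin(), keys.end(),
                        thrust::counting_iterator<int>(0),
                        thrust::counting_iterator<int>(numContainers + 1),
                        offsets.begin());
    return offsets;
}
```

Each CpuContainer could then store just its two offsets into the shared vector rather than a separate allocation. If separate device vectors are really needed, they can be copied out of those ranges afterwards.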