Using thrust::reduce to compute the sum of a vector of 8-bit integers without overflow

c++ cuda

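I have a device vector of type uint8_t, and I would like to compute its sum with thrust::reduce if possible. The problem is that I get overflow, because the sum will be far greater than 255. I thought the code below would compute the sum by storing the result as a 32-bit integer, but that does not appear to be the case. Is there a good way to accomplish this?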

uint8_t * flags_d;
...
const int32_t N_CMP_BLOCKS = thrust::reduce( 
    thrust::device_pointer_cast( flags_d ), 
    thrust::device_pointer_cast( flags_d ) + N,
    (int32_t) 0,
    thrust::plus<int32_t>() );

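I think the only workable solution is to use thrust::transform_reduce to explicitly convert the 8-bit input data to a 32-bit quantity before the accumulation step of the reduction. So I would expect something like this: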

#include <iostream>
#include <thrust/transform_reduce.h>
#include <thrust/functional.h>
#include <thrust/execution_policy.h>

template<typename T1, typename T2>
struct char2int
{
  __host__ __device__ T2 operator()(const T1 &x) const
  {
    return static_cast<T2>(x);
  }
};

int main()
{
  unsigned char data[6] = {128, 100, 200, 102, 101, 123};
  int result = thrust::transform_reduce(thrust::host,
                                        data, data + 6,
                                        char2int<unsigned char,int>(),
                                        0,
                                        thrust::plus<int>());

  std::cout << "Result is " << result << std::endl;
 
  return 0;
}
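To make the wraparound concrete without needing a CUDA toolchain, here is a minimal host-only sketch that uses the C++17 standard library rather than Thrust. It is an analogue, not the Thrust implementation: the helper names sum_narrow and sum_widened are hypothetical, and note that std::transform_reduce takes its arguments in a different order from thrust::transform_reduce (initial value and reduction operator come before the transform).

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Accumulates in uint8_t: every partial sum is truncated back to 8 bits,
// so the result is the true sum modulo 256.
std::uint8_t sum_narrow(const std::vector<std::uint8_t>& v)
{
  return std::accumulate(v.begin(), v.end(), std::uint8_t{0});
}

// Widens each element to int32_t before it is added, mirroring the
// char2int functor passed to thrust::transform_reduce above.
std::int32_t sum_widened(const std::vector<std::uint8_t>& v)
{
  return std::transform_reduce(
      v.begin(), v.end(),
      std::int32_t{0},                  // 32-bit initial value
      std::plus<std::int32_t>(),        // 32-bit reduction
      [](std::uint8_t x) { return static_cast<std::int32_t>(x); });
}
```

For the six values used in the answer (128, 100, 200, 102, 101, 123), sum_widened returns 754, while sum_narrow returns 242, which is 754 modulo 256.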
