OpenCL vector addition program
I am completely new to OpenCL programming. I have a working installation of the OpenCL library and drivers, but the program I am trying to run does not produce the expected output (the output is all zeros). It is just a simple vector addition program. Thanks in advance for any suggestions.
int main(int argc, char** argv)
{
    cout << "Hello OpenCL" << endl;
    vector<Platform> all_platforms;
    int err = Platform::get(&all_platforms);
    cout << "Getting Platform ... Error code " << err << endl;
    if (all_platforms.size()==0)
        (cout << "No platforms" << endl, exit(0));
    cout << "Platform info : " << all_platforms[0].getInfo<CL_PLATFORM_NAME>() << endl;
    Platform default_platform = all_platforms[0];
    cout << "Defaulting to it ..." << endl;
    vector<Device> all_devices;
    err = default_platform.getDevices(CL_DEVICE_TYPE_GPU, &all_devices);
    cout << "Getting devices ... Error code : " << err << endl;
    if (all_devices.size()==0)
        (cout << "No devices" << endl, exit(0));
    Device default_device = all_devices[0];
    cout << all_devices.size() << " devices & " << "Device info : " << all_devices[0].getInfo<CL_DEVICE_NAME>() << endl;
    cout << "Defaulting to it ..." << endl;
    Context context(default_device);
    Program::Sources sources;
    std::string kernel_code=
        " void kernel simple_add(global const int* A, global const int* B, global int* C){"
        " unsigned int i = get_global_id(0); "
        " C[i]=A[i]+B[i]; "
        " } ";
    sources.push_back(make_pair(kernel_code.c_str(), kernel_code.length()+1));
    Program program(context, sources);
    if (program.build(all_devices)==CL_SUCCESS)
        cout << "Built Successfully" << endl;
    Buffer buffer_A(context,CL_MEM_READ_WRITE,sizeof(int)*10);
    Buffer buffer_B(context,CL_MEM_READ_WRITE,sizeof(int)*10);
    Buffer buffer_C(context,CL_MEM_READ_WRITE,sizeof(int)*10);
    int A[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    int B[] = {0, 1, 2, 0, 1, 2, 0, 1, 2, 0};
    CommandQueue queue(context,default_device);
    queue.enqueueWriteBuffer(buffer_A,CL_TRUE,0,sizeof(int)*10,A); // load data from host to device
    queue.enqueueWriteBuffer(buffer_B,CL_TRUE,0,sizeof(int)*10,B);
    Kernel kernel(program, "vector_add");
    kernel.setArg(0, buffer_A);
    kernel.setArg(1, buffer_B);
    kernel.setArg(2, buffer_C);
    queue.enqueueNDRangeKernel(kernel,cl::NullRange,cl::NDRange(10),cl::NullRange);
    queue.finish();
    int *C = new int[10];
    queue.enqueueReadBuffer(buffer_C, CL_TRUE, 0, 10 * sizeof(int), C);
    for (int i=0;i<10;i++)
        std::cout << A[i] << " + " << B[i] << " = " << C[i] << std::endl;
    return 0;
}
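As pointed out in the comments, you should always check the error codes when using OpenCL API functions. This can be done by enabling exception handling in the C++ wrapper: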
#define __CL_ENABLE_EXCEPTIONS // with cl.hpp
//#define CL_HPP_ENABLE_EXCEPTIONS // with cl2.hpp
#include <CL/cl.hpp>
#include <iostream>

int main(int argc, char *argv[])
{
    try
    {
        // OpenCL code here
    }
    catch (const cl::Error& err)
    {
        // what() typically names the failing API call, err() is the OpenCL status code
        std::cout << err.what() << " failed with error code " << err.err() << std::endl;
    }
}
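The catch block above prints the raw integer from err.err(), which is hard to read at a glance. As a minimal sketch, the helper below maps a handful of status codes to their symbolic names. The name cl_error_name is hypothetical (it is not part of the OpenCL API), the numeric values are the ones defined in the Khronos CL/cl.h header, and the list is deliberately incomplete.

```cpp
#include <string>

// Hypothetical helper, not part of the OpenCL API: translates a few
// OpenCL status codes into their symbolic names. The values are the
// ones defined in CL/cl.h; codes not listed fall through to the default.
std::string cl_error_name(int code)
{
    switch (code)
    {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -33: return "CL_INVALID_DEVICE";
        case -34: return "CL_INVALID_CONTEXT";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case -46: return "CL_INVALID_KERNEL_NAME";
        case -48: return "CL_INVALID_KERNEL";
        case -52: return "CL_INVALID_KERNEL_ARGS";
        case -54: return "CL_INVALID_WORK_GROUP_SIZE";
        default:  return "unknown OpenCL error (" + std::to_string(code) + ")";
    }
}
```

Inside the catch block this could be used as cl_error_name(err.err()). For instance, the question's kernel source defines simple_add while the Kernel object is created with the name "vector_add"; once the program builds, a mismatch like that would be expected to surface as CL_INVALID_KERNEL_NAME (-46) rather than as silent zeros.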
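The simplest fix is to remove the argument from the build function, since by default it builds the program for all devices in the context (which is almost always what you actually want):
if (program.build()==CL_SUCCESS)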
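Comments:
You should check for errors on every call, or enable C++ exceptions for OpenCL. Otherwise you may miss any function that returns an error.
cl::NDRange(10) must be invalid. You should use at least 32, 64, ..., 8192, or any multiple that suits: cl::NDRange(64)
Any global size should be valid, regardless of the internal execution plan. So you can use 1, 3, 5, 7, 13, 59 if you like. That is unlikely to be the problem.
Good answer, actually testing the code and providing all the feedback. Really, enabling exceptions is a key feature that every C++ OpenCL developer should turn on right away.
Thanks a lot.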
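On the global-size disagreement in the comments: the question passes cl::NullRange as the local size, so the runtime chooses the work-group size and a global size of 10 is accepted, as the reply says. Rounding only matters if an explicit local size is passed, because OpenCL 1.x then requires the global size to be a multiple of it (otherwise the enqueue fails with CL_INVALID_WORK_GROUP_SIZE). A minimal sketch of that rounding, with round_up_global_size as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rounds global_size up to the next multiple of
// local_size, for the case where an explicit local size is passed to
// enqueueNDRangeKernel. Not needed when the local size is cl::NullRange.
std::size_t round_up_global_size(std::size_t global_size, std::size_t local_size)
{
    assert(local_size > 0);
    const std::size_t remainder = global_size % local_size;
    return remainder == 0 ? global_size : global_size + (local_size - remainder);
}
```

With 10 elements and a local size of 64 this gives a global size of 64. The kernel in the question takes no length argument, so padding the range like this would also require adding one and guarding the body with a bounds check (an assumption about how the kernel would be changed, not something the original code does).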
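In the question's code, the context is created from a single device while the build call is passed the full device list:
Context context(default_device);
// ...
if (program.build(all_devices)==CL_SUCCESS)
    cout << "Built Successfully" << endl;
With the argument removed:
if (program.build()==CL_SUCCESS)
    cout << "Built Successfully" << endl;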
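Once the build succeeds, a quick host-side check helps tell a real result apart from a buffer that was never written (the all-zeros symptom in the question). The sketch below recomputes the sums on the CPU and reports the first index that disagrees; first_mismatch is a hypothetical helper name, and the device output is assumed to have been copied into a std::vector<int> after enqueueReadBuffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: compares the device result c against a + b
// computed on the host. Returns the index of the first element that
// differs, or -1 if every element matches.
std::ptrdiff_t first_mismatch(const std::vector<int>& a,
                              const std::vector<int>& b,
                              const std::vector<int>& c)
{
    assert(a.size() == b.size() && b.size() == c.size());
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (c[i] != a[i] + b[i])
            return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}
```

With the question's input arrays, an all-zero output buffer fails at index 1, because element 0 of both inputs is 0 and so happens to match.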