Work group size error when using OpenCL and OpenCV on the Google Coral Dev Board (Python)

python opencv opencl

I am trying to use OpenCL acceleration with OpenCV on the Coral Dev Board. Calling cv2.normalize() on a UMat object produces the following error:

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('minmaxloc', dims=1, globalsize=1024x1x1, localsize=1024x1x1) sync=true
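For context on the error above, here is a minimal sketch of the launch-size checks that an OpenCL runtime applies before running a kernel. The function name ndrange_problems and the per-kernel limit of 256 are hypothetical, chosen only for illustration. The device limits are the ones clinfo reports for this board. The launch passes every device-level check, so one possible explanation (an assumption, not something confirmed here) is that the driver allows the 'minmaxloc' kernel a smaller work group than the device-wide maximum.

```python
def ndrange_problems(global_size, local_size, device_max_wg,
                     max_item_sizes, kernel_max_wg=None):
    """Return a list of reasons an NDRange launch could be rejected."""
    problems = []
    if any(l > m for l, m in zip(local_size, max_item_sizes)):
        problems.append("a local dimension exceeds CL_DEVICE_MAX_WORK_ITEM_SIZES")
    if any(g % l for g, l in zip(global_size, local_size)):
        problems.append("global size is not a multiple of local size")
    work_group = 1
    for l in local_size:  # total work-items in one work group
        work_group *= l
    if work_group > device_max_wg:
        problems.append("work group exceeds CL_DEVICE_MAX_WORK_GROUP_SIZE")
    if kernel_max_wg is not None and work_group > kernel_max_wg:
        problems.append("work group exceeds the kernel's CL_KERNEL_WORK_GROUP_SIZE")
    return problems


# Sizes from the error message; device limits from the clinfo output.
launch = dict(global_size=(1024, 1, 1), local_size=(1024, 1, 1),
              device_max_wg=1024, max_item_sizes=(1024, 1024, 1024))

print(ndrange_problems(**launch))                     # device-level checks only
print(ndrange_problems(**launch, kernel_max_wg=256))  # 256 is a made-up kernel limit
```

The first call reports no problems, which is why the device-wide numbers alone do not explain the failure.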

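In addition, any task involving UMat runs very slowly, and the CPU seems to be working harder than it should, so I am not sure whether any GPU acceleration is actually taking effect.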
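One rough way to check whether the OpenCL path helps at all is to time the same UMat call with OpenCL switched on and off through cv2.ocl.setUseOpenCL(). The sketch below is only an illustration. The helper names time_normalize and report, the 512x512 test image, and the repeat count are arbitrary choices, and the timings are coarse.

```python
import time

try:
    import cv2
    import numpy as np
except ImportError:  # allows the file to load where OpenCV is not installed
    cv2 = None


def time_normalize(use_opencl, size=512, repeats=5):
    """Average seconds per cv2.normalize() call on a UMat, or None without OpenCV."""
    if cv2 is None:
        return None
    cv2.ocl.setUseOpenCL(use_opencl)  # request (or forbid) the OpenCL code path
    src = cv2.UMat(np.random.rand(size, size).astype(np.float32))
    start = time.perf_counter()
    for _ in range(repeats):
        dst = cv2.normalize(src, None, 0.0, 1.0, cv2.NORM_MINMAX)
    dst.get()  # copy the result to host memory so pending work is included
    return (time.perf_counter() - start) / repeats


def report():
    """Print whether OpenCL is available and how the two code paths compare."""
    if cv2 is None:
        print("OpenCV is not installed")
        return
    print("OpenCL available:", cv2.ocl.haveOpenCL())
    print("CPU path   :", time_normalize(False))
    print("OpenCL path:", time_normalize(True))
```

Calling report() on the board prints both timings. If the OpenCL figure is not clearly lower, leaving cv2.ocl.setUseOpenCL(False) in place keeps UMat code on the CPU path.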

I installed OpenCV 4.5.1 for Python 3.7 via pip (python3 -m pip install opencv-contrib-python), and running cv2.getBuildInformation() gives the following information about OpenCL:

OpenCL:               YES (no extra features)
Include path:         /tmp/pip-req-build-qmcu8eer/opencv/3rdparty/include/opencl/1.2
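The build information is a long text report, so it can help to pull out just the OpenCL part. The sketch below assumes the section starts at the line beginning with "OpenCL:" and ends at the next blank line. The helper name opencl_section is hypothetical, and the sample string is the two lines shown above. These lines describe how OpenCV was built, not whether a usable device is found at run time.

```python
def opencl_section(build_info):
    """Return the OpenCL entries of a build report as a {key: value} dict."""
    section = {}
    inside = False
    for line in build_info.splitlines():
        text = line.strip()
        if text.startswith("OpenCL:"):
            inside = True
        elif inside and not text:
            break  # a blank line ends the section
        if inside:
            key, _, value = text.partition(":")
            section[key.strip()] = value.strip()
    return section


# Sample taken from the output quoted above.
sample = (
    "OpenCL:               YES (no extra features)\n"
    "Include path:         /tmp/pip-req-build-qmcu8eer/opencv/3rdparty/include/opencl/1.2\n"
)
info = opencl_section(sample)
print(info["OpenCL"])
print(info["Include path"])
```

With OpenCV installed, the same helper can be fed the live report: opencl_section(cv2.getBuildInformation()).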

Running clinfo gives the following:

  Platform Name                                   Vivante OpenCL Platform
  Number of devices                                 1
  Device Name                                     Vivante OpenCL Device GC7000L.6214.0000
  Device Vendor                                   Vivante Corporation
  Device Vendor ID                                0x564956
  Device Version                                  OpenCL 1.2 
  Driver Version                                  OpenCL 1.2 V6.4.2.256507
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Max clock frequency                             800MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     (n/a)
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             1024
  === CL_PROGRAM_BUILD_LOG ===
  (6:0) : error : syntax error at 'kernel'
  Preferred work group size multiple              <getWGsizes:1200: create kernel : error -45>
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                4 / 4       
    int                                                  4 / 4       
    long                                                 4 / 4       
    half                                                 0 / 0        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               0 / 0        (n/a)
Half-precision Floating-point support           <printDeviceInfo:68: get  CL_DEVICE_HALF_FP_CONFIG : error -30>
Single-precision Floating-point support         (core)
  Denormals                                     No
  Infinity and NANs                             Yes
  Round to nearest                              Yes
  Round to zero                                 Yes
  Round to infinity                             No
  IEEE754-2008 fused multiply-add               No
  Support is emulated in software               No
  Correctly-rounded divide and sqrt operations  No
Double-precision Floating-point support         (n/a)
Address bits                                    32, Little-Endian
Global memory size                              268435456 (256MiB)
Error Correction support                        Yes
Max memory allocation                           134217728 (128MiB)
Unified memory for Host and Device              Yes
Minimum alignment for any data type             128 bytes
Alignment of base address                       2048 bits (256 bytes)
Global Memory cache type                        Read/Write
Global Memory cache size                        8192 (8KiB)
Global Memory cache line size                   64 bytes
Image support                                   Yes
  Max number of samplers per kernel             16
  Max size for 1D images from buffer            65536 pixels
  Max 1D or 2D image array size                 8192 images
  Max 2D image size                             8192x8192 pixels
  Max 3D image size                             8192x8192x8192 pixels
  Max number of read image args                 128
  Max number of write image args                8
Local memory type                               Global
Local memory size                               32768 (32KiB)
Max number of constant args                     9
Max constant buffer size                        65536 (64KiB)
Max size of kernel argument                     1024
Queue properties                                
  Out-of-order execution                        Yes
  Profiling                                     Yes
Prefer user sync for interop                    Yes
Profiling timer resolution                      1000ns
Execution capabilities                          
  Run OpenCL kernels                            Yes
  Run native kernels                            No
printf() buffer size                            1048576 (1024KiB)
Built-in kernels                                (n/a)
Device Extensions                               cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
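Two entries in this listing are error markers rather than values, and the build log shows that clinfo's own probe kernel failed to compile. That may point at the driver's compiler rather than at OpenCV, though this is only a guess. The sketch below scans clinfo text for such markers and names the error codes. The helper name clinfo_findings is hypothetical, and the lookup table covers only the three codes that appear in this question.

```python
import re

# Standard OpenCL error codes that appear in this question.
CL_ERRORS = {
    -30: "CL_INVALID_VALUE",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
}

MAX_WG_RE = re.compile(r"^\s*Max work group size\s+(\d+)")
ERROR_RE = re.compile(r"^\s*(.+?)\s{2,}<.*error (-\d+)>")


def clinfo_findings(text):
    """Return (max work group size, [(property, error name), ...])."""
    max_wg = None
    errors = []
    for line in text.splitlines():
        size = MAX_WG_RE.match(line)
        if size:
            max_wg = int(size.group(1))
        failed = ERROR_RE.match(line)
        if failed:
            code = int(failed.group(2))
            errors.append((failed.group(1), CL_ERRORS.get(code, str(code))))
    return max_wg, errors


# Three lines copied from the clinfo output above.
sample = "\n".join([
    "  Max work group size                             1024",
    "  Preferred work group size multiple              <getWGsizes:1200: create kernel : error -45>",
    "Half-precision Floating-point support           <printDeviceInfo:68: get  CL_DEVICE_HALF_FP_CONFIG : error -30>",
])
max_wg, errors = clinfo_findings(sample)
print(max_wg)
for prop, name in errors:
    print(prop, "->", name)
```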

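I did not build OpenCL from source or anything like that. Any OpenCL packages that did not ship with the dev board image were installed via apt while I was preparing to install OpenCV. I am a bit lost here, so any suggestions are appreciated.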