C++ GCC OpenMP target ptxas error "Label expected for argument 0 of instruction 'call'" when using Eigen


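I am trying to rewrite an algorithm that is parallelized with OpenMP in order to test OpenMP's target (accelerator) device offloading. When using Eigen (3.4-rc1) inside an OMP construct, I run into the following problem (see the example):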

Minimal example

#include <iostream>
#include <Eigen/Eigen>
#include <cmath>

using Eigen::MatrixXd;

int main() {
    int n = 100000000;
    double total = 0;
    MatrixXd m(1,1);
    m(0,0) = 1;

    #pragma omp target teams distribute \
        parallel for map(tofrom: total) map(to: n, m) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += m(0,0) * exp(sin(M_PI * (double) i / 12345.6789));
    }
    std::cout << "total is " << total << '\n';
}

Apparent error (see below for the full verbose output)

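I have not found any explicit confirmation that Eigen works with OMP target directives, although apparently it should. The error is not very helpful (or at least I cannot gain any insight from it), but moving the initialization of the matrix object into the for loop yields an additional error: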

ptxas /tmp/ccl3LNcx.o, line 277; error   : Label expected for argument 0 of instruction 'call'
ptxas /tmp/ccl3LNcx.o, line 277; error   : Function '_ZN5Eigen6MatrixIdLin1ELin1ELi0ELin1ELin1EEC1IiiEERKT_RKT0_' not declared in this scope
ptxas /tmp/ccl3LNcx.o, line 277; fatal   : Call target not recognized
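
So my guess is that the compiler for the target device somehow does not see the library inside the loop? Neither adding the Eigen path to -foffload nor adding the -fno-exceptions flag (one of the few suggestions I stumbled upon) changed anything.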

Thank you for your time.


Full (verbose) error output

Using built-in specs.
COLLECT_GCC=x86_64-linux-gnu-accel-nvptx-none-gcc-9
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/accel/nvptx-none/lto-wrapper
Target: nvptx-none
Configured with: ../src/configure --prefix=/usr --libexecdir=/usr/lib --with-gcc-major-version-only --disable-bootstrap --disable-sjlj-exceptions --enable-newlib-io-long-long --target nvptx-none --enable-as-accelerator-for=x86_64-linux-gnu --enable-languages=c,c++,fortran,lto --enable-checking=release --with-system-zlib --without-isl --program-prefix=nvptx-none- --program-suffix=-9
Thread model: single
gcc version 9.3.0 (GCC) 
COLLECT_GCC_OPTIONS='-m64' '-mgomp' '-fno-openacc' '-fPIC' '-foffload-abi=lp64' '-fopenmp' '-fcf-protection=none' '-v' '-v' '-o' '/tmp/ccDYQmTY.mkoffload'
 /usr/lib/gcc/x86_64-linux-gnu/9/accel/nvptx-none/lto1 -quiet -dumpbase ccYsZMdw.o -m64 -mgomp -auxbase ccYsZMdw -version -fno-openacc -fPIC -foffload-abi=lp64 -fopenmp -fcf-protection=none @/tmp/ccllzqV5 -o /tmp/ccpHq0W6.s
GNU GIMPLE (GCC) version 9.3.0 (nvptx-none)
    compiled by GNU C version 9.3.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version none
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU GIMPLE (GCC) version 9.3.0 (nvptx-none)
    compiled by GNU C version 9.3.0, GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version none
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COLLECT_GCC_OPTIONS='-m64' '-mgomp' '-fno-openacc' '-fPIC' '-foffload-abi=lp64' '-fopenmp' '-fcf-protection=none' '-v' '-v' '-o' '/tmp/ccDYQmTY.mkoffload'
 /usr/lib/gcc/x86_64-linux-gnu/9/accel/nvptx-none/as -o /tmp/ccRXK6K4.o /tmp/ccpHq0W6.s
ptxas /tmp/ccRXK6K4.o, line 264; error   : Label expected for argument 0 of instruction 'call'
ptxas /tmp/ccRXK6K4.o, line 264; fatal   : Call target not recognized
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-linux-gnu-accel-nvptx-none-gcc-9 returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /usr/lib/gcc/x86_64-linux-gnu/9//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

Update 1: Accessing the data through a pointer raises the error "libgomp: cuCtxSynchronize error: an illegal memory access was encountered".

int main() {
    int n = 10;
    double total = 0;
    MatrixXd m(1,1);
    m(0,0) = 1;

    double* array = m.data();
    std::cout << "array: " << array[0] <<std::endl; //this works

    #pragma omp target teams distribute \
        parallel for map(tofrom: total) map(to: n, array) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += array[0]; // this throws an error
    }
    std::cout << "total is " << total << '\n';
}
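
AFAIK, OpenMP's map works only on low-level arrays (C arrays and pointers to memory regions). So I think you need to use pointers rather than C++ matrices. Note that you can probably unwrap the Eigen matrix into a raw pointer for the OpenMP device mapping and then wrap the pointer again inside the loop, using Eigen's Map on the mapped array. All of this looks cumbersome, but OpenMP does not know that the Eigen type is a matrix. In fact, this is the same problem as in …

Hey! Thanks for the information. If I understand correctly, I should be able to pass a pointer to the data through the map clause and then rebuild the Eigen matrix inside the loop from that pointer? Because if that is what you mean: whenever I call an Eigen object or function in an initialization (e.g. initializing a new matrix), I get the same "not declared in this scope" error described above.

That is what I mean, but be careful: I do not mean creating a complete new matrix object, only a view. I would guess that a view should work. Keep in mind that the body of the loop is executed on the GPU and is therefore more restricted than ordinary C++. Alternatively, the problem may be due to some limitation in GCC. GCC-10?

I tried GCC 9.3 and 10.2. Both matrix initialization and building a Map object from the data pointer of the previous matrix (or do you mean another view technique?) throw the above "Function 'blabla' not declared in this scope" error. In addition, and this may be related to my lack of experience with pointer handling in the map clause, I get "libgomp: cuCtxSynchronize error: illegal memory access" when I try to access the data after handing the pointer to the GPU (although it works before the omp directive). Maybe I am doing something wrong (see the edit)?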
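
The pointer-based approach discussed in the comments can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, and it has not been verified on the nvptx setup from the question. The helper name offload_sum and its parameters data and len are hypothetical. The difference from the code in Update 1 is the array section data[0:len] in the map clause. Writing map(to: array) maps only the pointer variable itself and not the memory it points to, which is a plausible cause of the illegal memory access. No Eigen type appears inside the target region, so no Eigen function has to be compiled for the device.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original post): adds data[0] to a
// running total n times inside an OpenMP target region.
double offload_sum(const double* data, std::size_t len, int n) {
    double total = 0.0;
    // data[0:len] is an array section: it maps len doubles starting at
    // data to the device, instead of mapping only the pointer variable.
    #pragma omp target teams distribute parallel for \
        map(to: data[0:len]) map(tofrom: total) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += data[0];
    }
    return total;
}
```

With the matrix from the question, the call would presumably look like offload_sum(m.data(), m.size(), n). If the compiler has no offloading support, or OpenMP is disabled, the loop simply runs on the host.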