Compiler errors: OPAL error when running Open MPI code compiled with nvcc


I am trying to run Open MPI code on an NVIDIA Jetson TX2, but when I launch the program with mpiexec I get an OPAL error.

Compilation command:

$ nvcc -I/home/user/.openmpi/include/ -L/home/user/.openmpi/lib/ -lmpi -std=c++11 *.cu *.cpp -o program
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
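One way to sanity-check the hand-written -I/-L flags is to ask the Open MPI wrapper compiler what it would pass to the underlying compiler, and feed the same flags to nvcc. This is a sketch: --showme:compile and --showme:link are Open MPI wrapper options, and mpic++ here is assumed to be the one from the /home/user/.openmpi build. The snippet is guarded so it still runs on machines without mpic++ on the PATH.

```shell
# Print the flags mpic++ would use, without compiling anything.
mpic++ --showme:compile 2>/dev/null || echo "mpic++ not on PATH"
mpic++ --showme:link 2>/dev/null || echo "mpic++ not on PATH"
```

If these flags point at a different prefix than the -I/-L paths given to nvcc, the binary is being built against one Open MPI and launched with another.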
Error message at runtime:

$ mpiexec -np 4 ./program 
[user:05728] OPAL ERROR: Not initialized in file pmix2x_client.c at line 109
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[user:05728] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[user:05729] OPAL ERROR: Not initialized in file pmix2x_client.c at line 109
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[user:05729] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[7361,1],0]
  Exit code:    1
--------------------------------------------------------------------------
I installed Open MPI version 3.1.2 with the following commands:

$ ./configure --prefix="/home/user/.openmpi" --with-cuda
$ make; sudo make install
I also set my $PATH and $LD_LIBRARY_PATH variables accordingly, following the instructions in the manual.
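For reference, with the prefix used in the configure line above, that environment setup typically looks like the following. This is a sketch; the MPI_HOME helper variable is just for illustration, and the prefix is the one from the question.

```shell
# Put the freshly built Open MPI first on both search paths, so this
# mpiexec and libmpi shadow any system-wide MPI installation.
export MPI_HOME=/home/user/.openmpi
export PATH="$MPI_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_HOME/lib:$LD_LIBRARY_PATH"
```

Prepending (rather than appending) matters: if a distribution-packaged Open MPI comes first, its mpiexec will launch processes linked against a different libmpi.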

I am able to run the program successfully on my laptop (Intel i7). While searching for the error, I found some links suggesting that I reinstall Open MPI. I have tried that several times (including with a freshly downloaded tarball), without success.

Any help would be greatly appreciated.

EDIT

I tried running the following minimal code (main.cpp), as requested in the comments:

#include <iostream>
#include "mpi.h"
#include <string>

int main(int argc, char *argv[]) {
  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  std::cout << rank << '\n';
  MPI_Finalize();
  return 0;
}
However, if I compile it with mpic++, it runs perfectly fine:

$ mpic++ main.cpp -o ./program
$ mpiexec -np 4 ./program 
0
1
3
2

Is this the only Open MPI version installed? My guess is that you are building and running with different MPI versions. Check which mpirun, and search for other instances of mpirun. If you are on Ubuntu, you can do:

sudo updatedb
locate mpirun

If the correct mpirun is invoked (the same version that was used at build time), the error should disappear.
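A quick way to see which launcher the shell actually resolves is sketched below; both paths should sit under the install prefix from the question (/home/user/.openmpi). The fallback echo is only there so the snippet runs cleanly on machines without MPI.

```shell
# Show which launcher(s) the shell resolves; on the Jetson they should
# resolve under /home/user/.openmpi/bin.
command -v mpirun mpiexec || echo "no MPI launcher on PATH"
# Report the launcher's version; it should match the 3.1.2 build.
mpirun --version 2>/dev/null | head -n 1 || true
```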

Can you run mpiexec -n 4 hello_c? The source is in examples/hello_c.c.

What do you do before MPI_Init? @MatthieuBrucher Just declaring rank and size. My main function starts with: int main(int argc, char *argv[]) { int rank, size; MPI_Init(&argc, &argv). I do #include some other .cu and .cpp files, but I assume you only want what happens in main before the MPI_Init() call. @GillesGouaillardet see the edit.

Maybe nvcc is not linking against the correct .so? Try linking /home/user/.openmpi/lib/libmpi.so directly.
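The linking suspicion in the last comment can be checked with ldd. This is a sketch to be run on the Jetson next to the nvcc-built binary; ./program is the executable from the question, and the fallback echo keeps the snippet runnable elsewhere.

```shell
# List the MPI-related shared objects the binary will load at runtime;
# they should resolve under /home/user/.openmpi/lib.
ldd ./program 2>/dev/null | grep -iE 'mpi|pmix|opal' \
  || echo "binary not found or no MPI libraries listed"
```

If libmpi.so resolves to a system path (e.g. /usr/lib/...) instead of the custom prefix, the runtime is picking up a different Open MPI than the one the headers came from, which matches the "OPAL ERROR: Not initialized" symptom.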