MPI error while loading shared libraries on another computer: libpgftnrtl.so


I built MPICH on one Linux (Ubuntu) machine (SOL), and it runs fine on that machine. Only SOL has all the compiler libraries installed.

./mpiexec -n 8 -f machinefile /usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/cpi
Process 0 of 8 is on SOL
Process 6 of 8 is on SOL
Process 1 of 8 is on SOL
Process 7 of 8 is on SOL
Process 3 of 8 is on SOL
Process 2 of 8 is on SOL
Process 4 of 8 is on SOL
Process 5 of 8 is on SOL
pi is approximately 3.1415926544231247, Error is 0.0000000008333316
wall clock time = 0.001447
machinefile:

SOL:8

However, when I try to run MPI across two machines with this machinefile:

SOL:4
Corona:4

Corona cannot find the required library:

stilocal@SOL:/usr/apps/lib/mpich-install/bin$ ./mpiexec -n 8 -f machinefile /usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/cpi
/usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/.libs/lt-cpi: error while loading shared libraries: libpgftnrtl.so: cannot open shared object file: No such file or directory
/usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/.libs/lt-cpi: error while loading shared libraries: libpgftnrtl.so: cannot open shared object file: No such file or directory
/usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/.libs/lt-cpi: error while loading shared libraries: libpgftnrtl.so: cannot open shared object file: No such file or directory
/usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/.libs/lt-cpi: error while loading shared libraries: libpgftnrtl.so: cannot open shared object file: No such file or directory
^C[mpiexec@SOL] Sending Ctrl-C to processes as requested
[mpiexec@SOL] Press Ctrl-C again to force abort
[mpiexec@SOL] HYDU_sock_write (/usr/apps/lib/mpich-3.3.2/src/pm/hydra/utils/sock/sock.c:256): write error (Bad file descriptor)
[mpiexec@SOL] HYD_pmcd_pmiserv_send_signal (/usr/apps/lib/mpich-3.3.2/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:178): unable to write data to proxy
[mpiexec@SOL] ui_cmd_cb (/usr/apps/lib/mpich-3.3.2/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:77): unable to send signal downstream
[mpiexec@SOL] HYDT_dmxu_poll_wait_for_event (/usr/apps/lib/mpich-3.3.2/src/pm/hydra/tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@SOL] HYD_pmci_wait_for_completion (/usr/apps/lib/mpich-3.3.2/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:196): error waiting for event
[mpiexec@SOL] main (/usr/apps/lib/mpich-3.3.2/src/pm/hydra/ui/mpich/mpiexec.c:336): process manager error waiting for completion

How can I solve this problem?

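The missing file is part of the PGI compiler runtime. You need to either install it on Corona's local disk or mount it on that node from a shared filesystem.

ssh SOL ldd /usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/cpi | grep libpgftnrtl.so

will show you where this file is located on SOL.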
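
To see which libraries a node cannot resolve, one option is to run ldd on that node and filter for unresolved entries. The following is a minimal sketch, not part of the original answer: the function name `missing_libs` is hypothetical, and it assumes the usual glibc ldd output format, in which an unresolved library is printed as `name => not found`.

```shell
#!/bin/sh
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it works with a local or a remote ldd run.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Possible usage (host name and binary path taken from the error messages
# in the question; adjust to your setup):
#   ssh Corona ldd /usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/.libs/lt-cpi | missing_libs
```

The usage line points at `.libs/lt-cpi` because that is the binary named in the loader errors; once the PGI runtime is visible on Corona, the function should print nothing for that binary.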
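
I wonder, should I compile MPI statically?

shell$ ./configure --without-memory-manager --without-libnuma \
  --enable-static [...other configure arguments...]

No. But you might want to try linking the PGI runtime statically (Intel has a flag for that; I don't know about PGI).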
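
If the PGI runtime is linked statically as suggested, the rebuilt binary should no longer list PGI shared libraries in its ldd output. A minimal sketch of such a check follows. `pgi_runtime_deps` is a hypothetical helper name, and the assumption that PGI runtime libraries share the `libpg` prefix is inferred from `libpgftnrtl.so` and may not cover every library. PGI's documentation describes a `-Bstatic_pgi` link option for this purpose, but treat that as something to verify against your compiler version.

```shell
#!/bin/sh
# List dynamic dependencies whose names start with "libpg" (assumed here to
# identify PGI runtime libraries). Reads ldd output on stdin.
pgi_runtime_deps() {
    awk '$1 ~ /^libpg/ { print $1 }'
}

# Possible usage on SOL after relinking (path taken from the question):
#   ldd /usr/apps/lib/mpich-3.3.2-pgf_gcc/examples/.libs/lt-cpi | pgi_runtime_deps
# Empty output suggests the binary no longer needs the PGI shared runtime.
```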