Running Octave with MPI fails

I am new to Octave. I am running a hello-world Octave MPI example on Ubuntu 14.04, but it always fails. The error output is as follows:

octave:1> system (" mpirun -x LD_PRELOAD=libmpi.so --hostfile ./hostfile -np 2 octave -q --eval 'pkg load mpi; helloworld ()'");

--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

     ompi_mpi_init: ompi_rte_init failed
     --> Returned "(null)" (-43) instead of "Success" (0)
-------------------------------------------------------------------------- 

An error occurred in MPI_Init
on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
   and potentially your MPI job)
[computationnode:17991] Local abort before MPI_INIT completed successfully,
   but am not able to aggregate error messages, and not able to
   guarantee that all other processes were killed!

An error occurred in MPI_Init
on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
   and potentially your MPI job)
[computationnode:17992] Local abort before MPI_INIT completed successfully,
   but am not able to aggregate error messages, and not able to
   guarantee that all other processes were killed!

--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

     ompi_mpi_init: ompi_rte_init failed
     --> Returned "(null)" (-43) instead of "Success" (0)
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.


Can anyone help me? I would really like to get this solved.

Comments:

Are you sure about your hostfile? Have you tried without the --hostfile option? Have you tried exporting LD_PRELOAD before running the mpirun command, and passing -x LD_PRELOAD as an option? (The mpirun man page notes that -x is not very sophisticated; it is better to define environment variables outside the command and use -x only to "export" an already-defined variable to the MPI processes, not to "define" it.) Also, are you sure the helloworld() function exists and is reachable by Octave from the default path? (I'm not an MPI expert; this is just general advice.) You might also have more luck with the parallel package, which seems to be better maintained and documented, especially if you are only after local parallelization.

Yes, the problem is solved now; it was mainly down to the environment variable. Thank you. By the way, I'm new to MPI. What format and information should the hostfile contain? Could you give me an example? Say I have two nodes, one with 4 cores and IP A.A.A.A, and another also with 4 cores and IP B.B.B.B.

I don't know; as I said, I'm not an MPI expert. I was just pointing out likely culprits for your problem based on common sense :)
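On the hostfile question left open in the comments: for Open MPI, a hostfile is a plain-text file with one host per line, optionally followed by a slots count giving the number of processes to run on that host. For the two 4-core nodes described above (keeping the placeholder IPs from the question), a minimal sketch would be:

```
# Open MPI hostfile: one host per line; slots = processes per host
A.A.A.A slots=4
B.B.B.B slots=4
```

Pass it with --hostfile as in the original command; mpirun then fills slots in order when assigning ranks.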
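The "export first, then only -x the name" suggestion from the comments might look something like the following. This is a sketch, not a verified fix: the library path is an assumption and will differ between systems.

```shell
# Define LD_PRELOAD in the shell environment first. The path below is an
# assumption; locate your actual Open MPI library, e.g. with:
#   ldconfig -p | grep libmpi
export LD_PRELOAD=/usr/lib/libmpi.so

# Then pass only the variable NAME to -x, so mpirun exports the
# already-defined value to the spawned processes instead of defining it:
mpirun -x LD_PRELOAD --hostfile ./hostfile -np 2 \
    octave -q --eval 'pkg load mpi; helloworld ()'
```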