How to launch different MPI jobs from a single allocation (MPI, Slurm)


Suppose I have launched a 256-core MPI job on 16 nodes.

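I have an MPI program, but unfortunately it is not parallelized over one particular parameter. Fortunately, I can easily write my own MPI program that takes care of the parallelization over that parameter, provided I can get at the output files.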

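So, how do I launch an MPI job (from within an MPI job) that uses a specific subset of those cores, i.e. only specific nodes? Basically, I want to run 16 different MPI calculations inside one 256-core MPI job, each calculation using 16 cores. Each calculation takes about 10 minutes on 16 cores, and the outer loop has about 200 iterations. With 256 cores that adds up to a reasonable 32 hours or so. Resubmitting 200 times, or running the 16 calculations one after another, is not reasonable.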
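As a rough check of the estimate above, here is a small sketch that redoes the arithmetic with the numbers quoted in the question (10 minutes per calculation, 200 outer iterations, 16 calculations per iteration). The constant and function names are made up for illustration.

```python
# Back-of-the-envelope wall-clock estimate using the numbers quoted above.
MINUTES_PER_CALCULATION = 10     # one 16-core calculation
OUTER_ITERATIONS = 200           # iterations of the outer loop
CALCULATIONS_PER_ITERATION = 16  # calculations needed in every iteration


def wall_clock_hours(concurrent):
    """Total hours if the 16 calculations run concurrently or one after another."""
    per_iteration = MINUTES_PER_CALCULATION
    if not concurrent:
        per_iteration *= CALCULATIONS_PER_ITERATION
    return OUTER_ITERATIONS * per_iteration / 60


print(f"concurrent: {wall_clock_hours(True):.1f} h")
print(f"sequential: {wall_clock_hours(False):.1f} h")
```

Running all 16 calculations at once comes out at roughly 33 hours, in line with the figure quoted above, whereas running them one after another would take 16 times as long.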

To be more precise, here is some Python pseudocode for what I want to do:

import os
from ase.parallel import world

# mydir, analyse_output() and rewrite_input_files() are placeholders defined elsewhere.
rank = world.rank
while True:
    node = rank // 16      # index of the 16-core calculation this rank belongs to
    subrank = rank % 16    # rank within that calculation
    os.chdir(mydir + "Calculation_%d" % node)
    # This will not work: one needs to specify somehow that only
    # ranks node*16 to node*16+15 will be used.
    os.system("mpirun -n 16 nwchem input.nw > nwchem.out")
    analyse_output(mydir + "Calculation_%d/nwchem.out" % node)
    rewrite_input_files()
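One possible way to confine each run to a single node, sketched here under assumptions: the driver is an ordinary (non-MPI) script running inside the Slurm allocation, and each calculation is started as its own job step with `srun` rather than with a nested `mpirun`. The host name `node03` and the helper names are hypothetical, and whether `--exclusive` is needed for job steps depends on the Slurm version, so treat this as something to test rather than a recipe.

```python
import os
import shlex
import subprocess


def allocated_hosts():
    """Expand SLURM_JOB_NODELIST (e.g. 'node[01-16]') into individual host names."""
    result = subprocess.run(
        ["scontrol", "show", "hostnames", os.environ["SLURM_JOB_NODELIST"]],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def step_command(host, ntasks=16, program="nwchem", input_file="input.nw"):
    """Build an srun job-step command confined to one node of the allocation."""
    return ["srun", "--nodes=1", f"--ntasks={ntasks}", f"--nodelist={host}",
            "--exclusive", program, input_file]


# Hypothetical host name, only to show the resulting command line.
print(shlex.join(step_command("node03")))
```

Inside a real allocation, `allocated_hosts()` would supply one host per calculation, and each command would be started in that calculation's directory.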

Basically, with 16 cores and 4 jobs:

rank 0: start nwchem process in /calculation0/ as rank 0/4.
rank 1: start nwchem in /calculation0/ as rank 1/4.
rank 2: start nwchem in /calculation0/ as rank 2/4.
rank 3: start nwchem in /calculation0/ as rank 3/4.
rank 4: start nwchem in /calculation1/ as rank 0/4.
rank 5: start nwchem in /calculation1/ as rank 1/4.
rank 6: start nwchem in /calculation1/ as rank 2/4.
rank 7: start nwchem in /calculation1/ as rank 3/4.
rank 8: start nwchem in /calculation2/ as rank 0/4.
rank 9: start nwchem in /calculation2/ as rank 1/4.
rank 10: start nwchem in /calculation2/ as rank 2/4.
rank 11: start nwchem in /calculation2/ as rank 3/4.
rank 12: start nwchem in /calculation3/ as rank 0/4.
rank 13: start nwchem in /calculation3/ as rank 1/4.
rank 14: start nwchem in /calculation3/ as rank 2/4.
rank 15: start nwchem in /calculation3/ as rank 3/4.
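The assignment listed above is just integer division and remainder on the rank. A minimal sketch (the function name `layout` is made up for illustration):

```python
def layout(world_size, n_jobs):
    """Return (world_rank, job_index, rank_in_job) for every rank."""
    if world_size % n_jobs:
        raise ValueError("world_size must be divisible by n_jobs")
    per_job = world_size // n_jobs
    return [(r, r // per_job, r % per_job) for r in range(world_size)]


# Reproduce the 16-core, 4-job table.
for rank, job, subrank in layout(16, 4):
    print(f"rank {rank}: /calculation{job}/ as rank {subrank}/4")
```

The same function with `layout(256, 16)` gives the full-size case of 16 calculations with 16 ranks each.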

Gather all the results.
Optimize all geometries (this requires knowledge of forces between the calculations).
Repeat until convergence (about 200 times).
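The run, gather, optimize, repeat cycle can be sketched as a plain driver that starts all calculations at once and waits for them before updating the inputs. This is only an outline under assumptions: `make_commands`, `update` and `converged` are hypothetical callbacks standing in for building the per-calculation commands, the geometry optimization and the convergence test, and a real run would more likely redirect each output to a file than collect it through a pipe.

```python
import subprocess


def run_concurrently(commands, workdirs):
    """Start every command in its own directory, then wait for all of them."""
    procs = [subprocess.Popen(cmd, cwd=wd, stdout=subprocess.PIPE, text=True)
             for cmd, wd in zip(commands, workdirs)]
    # All processes are already running; collect their outputs in order.
    return [p.communicate()[0] for p in procs]


def outer_loop(make_commands, workdirs, update, converged, max_iterations=200):
    """Repeat: run all calculations, gather the outputs, update the inputs."""
    for iteration in range(max_iterations):
        outputs = run_concurrently(make_commands(iteration), workdirs)
        update(outputs)          # e.g. optimize all geometries together
        if converged(outputs):
            return iteration + 1  # number of iterations actually performed
    return max_iterations
```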

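Background: in case you are interested, I elaborate here. But the main question remains: "How do I instantiate N MPI calculations, each with M/N cores, from a single MPI calculation with M cores?"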

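NWChem does not have an image-parallel nudged elastic band (NEB) calculator. Here is an example of this procedure with a different code: GPAW.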

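There it goes smoothly, because with GPAW and its MPI interface it is very easy to create sub-communicators. However, all I have is NWChem launched through mpirun at run time, and I would like to do the same thing: create many calculators (a band, or a chain of geometries, all linked by "springs") and optimize that chain.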
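For comparison, this is roughly what creating sub-communicators looks like when the calculator can be handed a communicator. The sketch uses mpi4py's `Comm.Split`, which is an assumption made for illustration; it is not GPAW's own interface, and it does not help directly with a stand-alone NWChem executable. The helper names are made up.

```python
def color_and_key(world_rank, ranks_per_image=16):
    """color selects the sub-communicator, key orders the ranks inside it."""
    return world_rank // ranks_per_image, world_rank % ranks_per_image


def split_world(ranks_per_image=16):
    """Split MPI.COMM_WORLD into one sub-communicator per image (needs mpi4py)."""
    from mpi4py import MPI  # imported lazily so the helper above works without MPI
    world = MPI.COMM_WORLD
    color, key = color_and_key(world.Get_rank(), ranks_per_image)
    return world.Split(color, key)
```

When started under `mpirun` with 256 ranks, every rank that calls `split_world()` would receive a 16-rank communicator shared with the other ranks of its image.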

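I think you are trying to use dynamic process management in MPI. You can spawn new processes for the smaller jobs and then, once the computation has finished, reconnect the spawned processes with the existing ones.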
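A hedged sketch of what dynamic process management could look like from Python, assuming mpi4py is available. `Comm.Spawn` and the reserved `host` and `wdir` info keys come from the MPI standard, but whether the MPI library and Slurm honor them, and whether an unmodified NWChem behaves well as a spawned child, would need to be tested. The request dictionaries and helper names are made up for illustration.

```python
def spawn_requests(hosts, workdirs, procs_per_host=16,
                   command="nwchem", args=("input.nw",)):
    """Describe one child MPI job per host; nothing is launched here."""
    return [{"command": command, "args": list(args), "maxprocs": procs_per_host,
             "host": host, "wdir": wdir}
            for host, wdir in zip(hosts, workdirs)]


def spawn_on_host(request):
    """Spawn one child MPI job as described by a request (needs mpi4py)."""
    from mpi4py import MPI  # imported lazily; only needed when actually spawning
    info = MPI.Info.Create()
    info.Set("host", request["host"])  # reserved info key: where to start the children
    info.Set("wdir", request["wdir"])  # reserved info key: their working directory
    child = MPI.COMM_SELF.Spawn(request["command"], args=request["args"],
                                maxprocs=request["maxprocs"], info=info)
    info.Free()
    return child  # intercommunicator to the spawned processes
```

Each spawned group gets its own `MPI_COMM_WORLD` of `maxprocs` ranks, which matches the "16 calculations of 16 ranks each" layout described in the question.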

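Note that the consensus is "No, they should not"!

I originally thought the question would be as easy to understand as you put it, so I tried to elaborate with pseudocode. Short version: the results of these calculations depend on each other and are part of a larger outer optimization loop. More precisely, it is an image-parallel NEB calculation. I am fighting for time, not for resources. So, sorry, but that is really not an option.

I will have to dig deeper into how mpirun spawns its processes and pins them to cores and nodes. Just test it...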