Why does MPI_Iprobe return false when the message has definitely been sent?


I want to use MPI_Iprobe to test whether a message with a given tag is already pending.

However, MPI_Iprobe does not behave as I expect. In the example below, I send messages from several tasks to a single task (rank 0). On rank 0 I then wait a few seconds, so that the sends from the other MPI tasks have plenty of time to complete. When I then run MPI_Iprobe, it returns flag = false. If I call it again after a (blocking) MPI_Probe, it returns true.

#include "mpi.h"
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  int rank;
  int numprocs;
  int tag;
  int receive_tag;
  int flag=0;
  int number;
  int recv_number=0;

  MPI_Request request;
  MPI_Status status;

  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);

  // rank 0 receives messages, all others send messages
  if (rank > 0 ) {
    number = rank;
    tag = rank;
    MPI_Isend(&number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,&request); // send to rank 0
    printf("Sending tag : %d \n",tag);
   } 
   else if (rank == 0) {

   sleep(5); // [seconds] allow plenty of time for all sends from other tasks to complete

   receive_tag = 3; // just try to receive the single message sent with tag 3 (i.e. from task 3, since tag == rank)

   MPI_Iprobe(MPI_ANY_SOURCE,receive_tag,MPI_COMM_WORLD,&flag,&status);
   printf("After MPI_Iprobe, flag = %d \n",flag);

   MPI_Probe(MPI_ANY_SOURCE,receive_tag,MPI_COMM_WORLD,&status);
   printf("After MPI_Probe, found message with tag : %d \n",receive_tag);

   MPI_Iprobe(MPI_ANY_SOURCE,receive_tag,MPI_COMM_WORLD,&flag,&status);
   printf("After second MPI_Iprobe, flag = %d \n",flag);

   // receive all the messages
   for (int i=1;i<numprocs;i++){    
     MPI_Recv(&recv_number, 1, MPI_INT, MPI_ANY_SOURCE, i, MPI_COMM_WORLD,&status);
     printf("Received : %d \n",recv_number);
   }

 }
 MPI_Finalize();
}
Why does the first MPI_Iprobe return flag = false?

Any help would be much appreciated.


Edit: following Hristo Iliev's answer, I now have the following code:

#include "mpi.h"
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  int rank;
  int numprocs;
  int tag;
  int receive_tag;
  int flag=0;
  int number;
  int recv_number=0;

  MPI_Request request;
  MPI_Status status;

  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);

  // rank 0 receives messages, all others send messages
  if (rank > 0 ) {
    number = rank;
    tag = rank;

    MPI_Isend(&number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,&request); // send to rank 0
    printf("Sending tag : %d \n",tag);

    // do stuff

    MPI_Wait(&request,&status);
    printf("Sent tag : %d \n",tag);

   }
    else if (rank == 0) {

    sleep(5); // [seconds] allow plenty of time for all sends from other tasks to complete

    receive_tag = 3; // just try to receive the single message sent with tag 3 (i.e. from task 3, since tag == rank)

    MPI_Iprobe(MPI_ANY_SOURCE,receive_tag,MPI_COMM_WORLD,&flag,&status);
    printf("After MPI_Iprobe, flag = %d \n",flag);

    MPI_Probe(MPI_ANY_SOURCE,receive_tag,MPI_COMM_WORLD,&status);
    printf("After MPI_Probe, found message with tag : %d \n",receive_tag);

    MPI_Iprobe(MPI_ANY_SOURCE,receive_tag,MPI_COMM_WORLD,&flag,&status);
    printf("After second MPI_Iprobe, flag = %d \n",flag);

    // receive all the other messages
    for (int i=1;i<numprocs;i++){   
       MPI_Recv(&recv_number, 1, MPI_INT, MPI_ANY_SOURCE, i, MPI_COMM_WORLD,&status);
    }

 }
 MPI_Finalize();
}

You are using MPI_Isend to send the messages. MPI_Isend initiates an asynchronous (background) data transfer. The actual transfer might not happen unless one of the MPI_Wait* or MPI_Test* calls is performed on the request. Some MPI implementations have (or can be configured to start) background progression threads that progress send operations even if no wait/test is performed on the request, but such behaviour should not be relied upon.

Simply replace MPI_Isend with MPI_Send, or add MPI_Wait(&request, &status) after it (note that MPI_Isend immediately followed by MPI_Wait is equivalent to MPI_Send).
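
A minimal sketch of the two options on the sender side, reusing the variables from the question's code (which one to pick depends on whether there is useful work to overlap with the transfer):

// Option 1: blocking send - returns once the send buffer may be reused
MPI_Send(&number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);

// Option 2: non-blocking send, completed explicitly
MPI_Isend(&number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &request);
// ... overlap computation here ...
MPI_Wait(&request, MPI_STATUS_IGNORE); // forces the transfer to complete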


MPI_Iprobe is intended for busy waiting, e.g.:

while (condition)
{
   MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
   if (flag)
   {
      // the status from MPI_Iprobe identifies the matched message
      MPI_Recv(&recv_number, 1, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      ...
   }
   // Do something else in the meantime, e.g. background tasks
}
Actual message transfer in a real MPI implementation is a fairly complex affair. Operations are usually broken into portions that are then queued. Executing those portions is called progression, and it happens at various points inside the MPI library, e.g. whenever a communication call is made, or in the background if the library implements a progression thread. A call to MPI_Iprobe certainly causes progression, but there is no guarantee that a single call is enough. The MPI standard states:

The MPI implementation of MPI_PROBE and MPI_IPROBE needs to guarantee progress: if a call to MPI_PROBE has been issued by a process, and a send that matches the probe has been initiated by some process, then the call to MPI_PROBE will return, unless the message is received by another concurrent receive operation (that is executed by another thread at the probing process). Similarly, if a process busy waits with MPI_IPROBE and a matching message has been issued, then the call to MPI_IPROBE will eventually return flag = true, unless the message is received by another concurrent receive operation.

Note the use of eventually. How progression is performed is very implementation-specific. Compare the following outputs from five consecutive calls to MPI_Iprobe (the original code plus a tight loop; a possible form of that loop is sketched right below):
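
An illustrative reconstruction of such a tight loop (the answer does not show the exact code used, so the variable names follow the question's code):

for (int i = 0; i < 5; i++) {
   MPI_Iprobe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &flag, &status);
   printf("After MPI_Iprobe, flag = %d \n", flag);
}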

Open MPI 1.6.5 without a progression thread:

# Run 1
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1

# Run 2
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1

# Run 3
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
There is no consistency between multiple executions of the same MPI program, and in the third run the flag is still false after five calls to MPI_Iprobe.

Intel MPI 4.1.2:

# Run 1
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1

# Run 2
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1

# Run 3
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 0
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
After MPI_Iprobe, flag = 1
Clearly, Intel MPI progresses differently from Open MPI.

The difference between the two implementations can be explained by the fact that MPI_Iprobe is meant to be a tiny probe, and hence it should take as little time as possible. Progression, on the other hand, takes time, and in a single-threaded MPI implementation the only points where progression can occur are the calls to MPI_Iprobe (in this particular case). Therefore the MPI implementors must decide how much actual progression each call to MPI_Iprobe performs, striking a balance between the amount of work done per call and the time it takes.
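
One way to observe this empirically (a hypothetical experiment, not part of the original answer) is to count how many calls MPI_Iprobe needs before the flag flips:

int calls = 0;
do {
   // each call gives the library another chance to progress the transfer
   MPI_Iprobe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &flag, &status);
   calls++;
} while (!flag); // caution: spins forever if no matching send is ever posted
printf("flag became true after %d call(s) to MPI_Iprobe \n", calls);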


With MPI_Probe things are different. It is a blocking call, and therefore it can keep progressing until a matching message (more precisely, its envelope) shows up.

Thanks, I have just tried replacing MPI_Isend with MPI_Send as you suggested, but for some reason the first MPI_Iprobe still returns false. (The same happens if I use MPI_Isend together with MPI_Wait.) For information, I have added my new code and output in the edit above (in the question).

MPI_Iprobe is meant for busy-wait loops. It could take several calls to MPI_Iprobe before the operation has progressed. MPI_Probe blocks, and therefore it does not return before the operation has progressed to the point where the message envelope has been received and matched.
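
A minimal sketch of that pattern, reusing the question's variables: the status filled in by MPI_Probe describes the matched envelope, so the subsequent MPI_Recv can target exactly that message:

MPI_Probe(MPI_ANY_SOURCE, receive_tag, MPI_COMM_WORLD, &status);
// status now identifies the source and tag of the matched envelope
MPI_Recv(&recv_number, 1, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
         MPI_COMM_WORLD, MPI_STATUS_IGNORE);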