Why does MPI_Send block when I try to send a 2D int array?

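I am trying to compute a fractal image in parallel with MPI. I have divided my program into four parts:

  • Balance the number of rows handled by each rank
  • Perform the calculation on each row assigned to the rank
  • Send the number of rows and the rows themselves to rank 0
  • Process the data on rank 0 (for the test, just print the ints)

Steps 1 and 2 work, but when I try to send the rows to rank 0, the program stops and blocks. I know that MPI_Send can block, but I see no reason for it to do so here.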

Here are the first two steps:

Step 1:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    /* Include the MPI library for function calls */
    #include <mpi.h>
    
    /* Define tags for each MPI_Send()/MPI_Recv() pair so distinct messages can be
     * sent */
    #define OTHER_N_ROWS_TAG 0
    #define OTHER_PIXELS_TAG 1
    
    int main(int argc, char **argv) {
      const int nRows = 513;
      const int nCols = 513;
      const int middleRow = 0.5 * (nRows - 1);
      const int middleCol = 0.5 * (nCols - 1);
      const double step = 0.00625;
      const int depth = 100;
      int pixels[nRows][nCols];
      int row;
      int col;
      double xCoord;
      double yCoord;
      int i;
      double x;
      double y;
      double tmp;
      int myRank;
      int nRanks;
      int evenSplit;
      int nRanksWith1Extra;
      int myRow0;
      int myNRows;
      int rank;
      int otherNRows;
      int otherPixels[nRows][nCols];
    
      /* Each rank sets up MPI */
      MPI_Init(&argc, &argv);
    
      /* Each rank determines its ID and the total number of ranks */
      MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
      MPI_Comm_size(MPI_COMM_WORLD, &nRanks);
      printf("My rank is %d \n",myRank);
      evenSplit = nRows / nRanks;
      nRanksWith1Extra = nRows % nRanks;
    
      /* Each rank determines how many rows it will process (balanced split) */
      if (myRank < nRanksWith1Extra) {
    
        myNRows = evenSplit + 1;
        myRow0 = myRank * (evenSplit + 1);
      }
      else {
        myNRows = evenSplit;
        myRow0 = (nRanksWith1Extra * (evenSplit + 1)) +
          ((myRank - nRanksWith1Extra) * evenSplit);
      }
    /*__________________________________________________________________________________*/
    
Step 4:

    /*________________________TREAT EACH ROW IN RANK 0_________________________________*/
      /* Only Rank 0 prints so the output is in order */
      if (myRank == 0) {
    
        /* Rank 0 loops over each rank so it can receive that rank's messages */
        for (rank = 0; rank < nRanks; rank++){
    
          /* Rank 0 receives the number of rows from the given rank so it knows how
           * many pixels to receive in the next message */
          MPI_Recv(&otherNRows, 1, MPI_INT, rank, OTHER_N_ROWS_TAG,
          MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    
          /* Rank 0 receives the pixels array from each of the other ranks
           * (including itself) so it can print the number of iterations for each
           * pixel */
          MPI_Recv(&otherPixels, otherNRows * nCols, MPI_INT, rank,
              OTHER_PIXELS_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    
          /* Rank 0 loops over the rows for the given rank */
          for (row = 0; row < otherNRows; row++) {
    
            /* Rank 0 loops over the columns within the given row */
            for (col = 0; col < nCols; col++) {
    
              /* Rank 0 prints the value of the pixel at the given row and column
               * followed by a comma */
              printf("%d,", otherPixels[row][col]);
            }
    
            /* In between rows, Rank 0 prints a newline character */
            printf("\n");
          }
        }
      }
    
      /* All processes clean up the MPI environment */
      MPI_Finalize();
    
      return 0;
    }
    
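I would like to know why it blocks. Can you explain it to me? I am new to MPI and I want to learn it, not just end up with a program that works.

Thanks in advance.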

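Using blocking send/receive calls when rank 0 sends to itself can cause a deadlock.

From the MPI standard:

"Source = destination is allowed, that is, a process can send a message to itself. (However, it is unsafe to do so with the blocking send and receive operations described above, since this may lead to deadlock. See Section 3.5.)"

Possible solutions:

  • Use nonblocking send/receive calls when sending to or receiving from rank 0 itself. For more information, see the nonblocking send, receive, and wait routines.
  • Eliminate the communication between rank 0 and itself. Since you are on rank 0, you already know how many pixels it has to compute.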
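
The first option can be sketched as follows. This is a minimal illustration, not the asker's program: the tag name `N_ROWS_TAG` and the placeholder row count are assumptions made up for the example, and only the row-count message is shown. Every rank starts its send with `MPI_Isend`, so rank 0 can go on to post the receive that matches its own message, and the send is completed afterwards with `MPI_Wait`.

```c
#include <stdio.h>
#include <mpi.h>

#define N_ROWS_TAG 0 /* hypothetical tag, playing the role of OTHER_N_ROWS_TAG */

int main(int argc, char **argv) {
  int myRank, nRanks, rank;
  int myNRows;    /* stand-in for the per-rank row count */
  int otherNRows; /* receive buffer on rank 0 */
  MPI_Request sendReq;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
  MPI_Comm_size(MPI_COMM_WORLD, &nRanks);

  myNRows = 100 + myRank; /* placeholder value, for illustration only */

  /* Start the send and return immediately, whether or not a matching
   * receive has been posted yet. */
  MPI_Isend(&myNRows, 1, MPI_INT, 0, N_ROWS_TAG, MPI_COMM_WORLD, &sendReq);

  if (myRank == 0) {
    /* Rank 0 receives from every rank, including itself. */
    for (rank = 0; rank < nRanks; rank++) {
      MPI_Recv(&otherNRows, 1, MPI_INT, rank, N_ROWS_TAG, MPI_COMM_WORLD,
               MPI_STATUS_IGNORE);
      printf("rank %d reported %d rows\n", rank, otherNRows);
    }
  }

  /* Complete the send; myNRows must not be modified before this returns. */
  MPI_Wait(&sendReq, MPI_STATUS_IGNORE);

  MPI_Finalize();
  return 0;
}
```

With a typical MPI installation this would be built with `mpicc` and launched with `mpirun`; the same pattern applies to the larger pixel message.
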
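
MPI_Send is, by definition in the standard, a blocking operation.

Note that blocking means:

"It does not return until the message data and envelope have been safely stored away so that the sender is free to modify the send buffer. The message might be copied directly into the matching receive buffer, or it might be copied into a temporary system buffer."

Having a rank send a message to itself with MPI_Send and MPI_Recv deadlocks whenever the message is not buffered, because the receive that would match it is only reached after MPI_Send returns.

The idiomatic pattern for your situation is to use the appropriate collective communication operations, MPI_Gather and MPI_Gatherv.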
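
Because the ranks do not all own the same number of rows, the variable-count collective needs a per-rank element count and a per-rank offset into the receive buffer. The helper below is a sketch of how those two arrays could be derived from the same balanced split the question uses; the function name `gatherv_layout` is hypothetical, and it assumes rows are stored contiguously with `nCols` ints each.

```c
#include <assert.h>

/* Fill recvcounts[] and displs[] (each of length nRanks) for gathering a
 * row-partitioned nRows x nCols int array onto one rank. Both arrays are
 * expressed in units of ints. The split matches the question's scheme:
 * the first (nRows % nRanks) ranks get one extra row. */
void gatherv_layout(int nRows, int nCols, int nRanks,
                    int *recvcounts, int *displs) {
  int evenSplit, nRanksWith1Extra, rank;
  int offset = 0;

  assert(nRanks > 0 && nRows >= 0 && nCols >= 0);

  evenSplit = nRows / nRanks;
  nRanksWith1Extra = nRows % nRanks;

  for (rank = 0; rank < nRanks; rank++) {
    int rows = evenSplit + (rank < nRanksWith1Extra ? 1 : 0);
    recvcounts[rank] = rows * nCols; /* ints contributed by this rank */
    displs[rank] = offset;           /* where they start in the full array */
    offset += recvcounts[rank];
  }
}
```

On the root, the two arrays would then be passed to `MPI_Gatherv(sendbuf, myNRows * nCols, MPI_INT, recvbuf, recvcounts, displs, MPI_INT, 0, MPI_COMM_WORLD)`, which every rank (including rank 0) calls once, so no rank has to send to itself explicitly.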

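As the previous answers state, MPI_Send() may block.

From a theoretical MPI point of view, your application is incorrect because of a potential deadlock (rank 0 calls MPI_Send() to itself when no receive has been posted).

From a very pragmatic point of view, MPI_Send() generally returns immediately when sending a small message (such as myNRows), but blocks until a matching receive is posted when sending a large message (such as pixels). Keep in mind that:

  • what counts as small or large depends at least on the MPI library and the interconnect in use
  • from an MPI point of view, it is incorrect to assume that MPI_Send() will return immediately for small messages

If you really want to make sure your application is deadlock-free, simply replace MPI_Send() with MPI_Ssend(). MPI_Ssend() does not complete until the matching receive has started, so any reliance on buffering shows up as a hang.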

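Back to your question, there are several options:

  • Rework your application so that rank 0 does not communicate with itself (all the information is already available, so no communication is needed).
  • Post an MPI_Irecv() before the MPI_Send(), and replace MPI_Recv(source=0) with MPI_Wait().
  • Rework your application so that rank 0 calls neither MPI_Send() nor MPI_Recv(source=0), but MPI_Sendrecv instead. This is the option I recommend, since you only have to make a small change to the communication pattern (the computation pattern stays the same), which seems more elegant to me.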
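
The last option could look roughly like the sketch below. It is an illustration under assumptions rather than a drop-in fix: `PIXELS_TAG`, `N_VALUES`, and the dummy payload are invented for the example, and every rank sends the same number of ints to keep it short. Rank 0 exchanges its own message with itself in a single `MPI_Sendrecv` call, which cannot deadlock the way a blocking send followed by a blocking receive can, and then receives from the remaining ranks as before.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define PIXELS_TAG 1  /* hypothetical tag, playing the role of OTHER_PIXELS_TAG */
#define N_VALUES 4096 /* hypothetical payload size per rank, in ints */

int main(int argc, char **argv) {
  int myRank, nRanks, rank, i;
  int *mine, *other;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
  MPI_Comm_size(MPI_COMM_WORLD, &nRanks);

  mine = malloc(N_VALUES * sizeof *mine);
  other = malloc(N_VALUES * sizeof *other);
  if (mine == NULL || other == NULL) {
    MPI_Abort(MPI_COMM_WORLD, 1);
  }

  /* Dummy data: each rank fills its buffer with its own rank number. */
  for (i = 0; i < N_VALUES; i++) {
    mine[i] = myRank;
  }

  if (myRank == 0) {
    /* Rank 0's message to itself: send and receive in one call.
     * The send and receive buffers must not overlap. */
    MPI_Sendrecv(mine, N_VALUES, MPI_INT, 0, PIXELS_TAG,
                 other, N_VALUES, MPI_INT, 0, PIXELS_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("from rank 0: first value %d\n", other[0]);

    /* The other ranks are received with ordinary blocking receives. */
    for (rank = 1; rank < nRanks; rank++) {
      MPI_Recv(other, N_VALUES, MPI_INT, rank, PIXELS_TAG, MPI_COMM_WORLD,
               MPI_STATUS_IGNORE);
      printf("from rank %d: first value %d\n", rank, other[0]);
    }
  } else {
    /* Non-root ranks keep the blocking send; rank 0 is receiving. */
    MPI_Send(mine, N_VALUES, MPI_INT, 0, PIXELS_TAG, MPI_COMM_WORLD);
  }

  free(mine);
  free(other);
  MPI_Finalize();
  return 0;
}
```

Note that both sides use `MPI_INT` with a count in elements; the question's code sends with `MPI_BYTE` and receives with `MPI_INT`, which is worth making consistent when adapting this.
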
Step 3 (code from the question):

    /*__________________________SEND DATA TO RANK 0____________________________________*/
    
      /* Each rank (including Rank 0) sends its number of rows to Rank 0 so Rank 0
       * can tell how many pixels to receive */
      MPI_Send(&myNRows, 1, MPI_INT, 0, OTHER_N_ROWS_TAG, MPI_COMM_WORLD);
      printf("test \n");
      /* Each rank (including Rank 0) sends its pixels array to Rank 0 so Rank 0
       * can print it */
      MPI_Send(&pixels, sizeof(int)*myNRows * nCols, MPI_BYTE, 0, OTHER_PIXELS_TAG,
          MPI_COMM_WORLD);
      printf("enter ranking 0 \n");
    /*_________________________________________________________________________________*/
    