Why does MPI_Send block when I try to send a 2D int array?
I am trying to compute a fractal image in parallel with MPI. I split my program into four steps:

1. balance the number of rows each rank has to handle
2. perform the computation on every row attributed to the rank
3. send the number of rows and the rows themselves to rank 0
4. treat the data in rank 0 (for the test, just print the ints)

Steps 1 and 2 work, but when I try to send the rows to rank 0, the program stops and blocks. I know that MPI_Send may block, but there is no reason for it to do so here.

Here are the first steps:

Step 1:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Include the MPI library for function calls */
#include <mpi.h>
/* Define tags for each MPI_Send()/MPI_Recv() pair so distinct messages can be
* sent */
#define OTHER_N_ROWS_TAG 0
#define OTHER_PIXELS_TAG 1
int main(int argc, char **argv) {
const int nRows = 513;
const int nCols = 513;
const int middleRow = 0.5 * (nRows - 1);
const int middleCol = 0.5 * (nCols - 1);
const double step = 0.00625;
const int depth = 100;
int pixels[nRows][nCols];
int row;
int col;
double xCoord;
double yCoord;
int i;
double x;
double y;
double tmp;
int myRank;
int nRanks;
int evenSplit;
int nRanksWith1Extra;
int myRow0;
int myNRows;
int rank;
int otherNRows;
int otherPixels[nRows][nCols];
/* Each rank sets up MPI */
MPI_Init(&argc, &argv);
/* Each rank determines its ID and the total number of ranks */
MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
MPI_Comm_size(MPI_COMM_WORLD, &nRanks);
printf("My rank is %d \n",myRank);
evenSplit = nRows / nRanks;
nRanksWith1Extra = nRows % nRanks;
/*Each rank determine the number of rows that he will have to perform (well balanced)*/
if (myRank < nRanksWith1Extra) {
myNRows = evenSplit + 1;
myRow0 = myRank * (evenSplit + 1);
}
else {
myNRows = evenSplit;
myRow0 = (nRanksWith1Extra * (evenSplit + 1)) +
((myRank - nRanksWith1Extra) * evenSplit);
}
/*__________________________________________________________________________________*/
Step 4:
/*________________________TREAT EACH ROW IN RANK 0_________________________________*/
/* Only Rank 0 prints so the output is in order */
if (myRank == 0) {
/* Rank 0 loops over each rank so it can receive that rank's messages */
for (rank = 0; rank < nRanks; rank++){
/* Rank 0 receives the number of rows from the given rank so it knows how
* many pixels to receive in the next message */
MPI_Recv(&otherNRows, 1, MPI_INT, rank, OTHER_N_ROWS_TAG,
MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Rank 0 receives the pixels array from each of the other ranks
* (including itself) so it can print the number of iterations for each
* pixel */
MPI_Recv(&otherPixels, otherNRows * nCols, MPI_INT, rank,
OTHER_PIXELS_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Rank 0 loops over the rows for the given rank */
for (row = 0; row < otherNRows; row++) {
/* Rank 0 loops over the columns within the given row */
for (col = 0; col < nCols; col++) {
/* Rank 0 prints the value of the pixel at the given row and column
* followed by a comma */
printf("%d,", otherPixels[row][col]);
}
/* In between rows, Rank 0 prints a newline character */
printf("\n");
}
}
}
/* All processes clean up the MPI environment */
MPI_Finalize();
return 0;
}
I would like to know why it blocks; can you explain it?
I am new to MPI and I want to actually learn it, not just end up with a program that happens to work.
Thanks in advance.

You can cause a deadlock when you use a blocking send/receive construct to communicate with rank 0 itself. From the MPI standard:

Source = destination is allowed, that is, a process can send a message to itself. (However, it is unsafe to do so with the blocking send and receive operations described above, since this may lead to deadlock. See Section 3.5.)

Possible solutions:
- Use non-blocking send/receive constructs when sending to / receiving from rank 0 itself. For more information, check the MPI_Isend, MPI_Irecv and MPI_Wait routines.
- Eliminate the communication with rank 0 itself: since you are rank 0, you already have a way to know how many pixels you have to compute.
MPI_Send is, by the standard's definition, a blocking operation.

Note that blocking means:

it does not return until the message data and envelope have been safely stored away so that the sender is free to modify the send buffer. The message might be copied directly into the matching receive buffer, or into a temporary system buffer.

Having a rank use MPI_Send and MPI_Recv to send a message to itself is a recipe for deadlock. The idiomatic pattern for your situation is to use the appropriate collective communication operations, MPI_Gather and MPI_Gatherv.
As stated in a previous answer, MPI_Send() may block.

From a theoretical point of view, your application is incorrect because of a potential deadlock (rank 0 MPI_Send()s to itself when no receive has been posted).

From a very pragmatic point of view, MPI_Send() generally returns immediately when sending a small message (such as myNRows), but blocks until a matching receive is posted when sending a large message (such as pixels). Keep in mind that:
- "small" and "large" depend at least on the MPI library and the interconnect being used
- from the MPI point of view, it is incorrect to assume that MPI_Send() returns immediately for small messages
- to be on the safe side, MPI_Send() can be replaced with MPI_Ssend(), which always blocks until the matching receive has been posted

Back to your question, there are several options:
- revamp your application so that rank 0 does not communicate with itself (all the information is already available, so no communication is needed)
- post an MPI_Irecv() before the MPI_Send(), and replace MPI_Recv(source=0) with MPI_Wait()
- revamp your application so that rank 0 uses neither MPI_Send() nor MPI_Recv(source=0), but MPI_Sendrecv() instead. This is the option I would recommend: you only need a small change to the communication pattern (the computation pattern is left untouched), which is more elegant in my opinion.
Step 3:
/*__________________________SEND DATA TO RANK 0____________________________________*/
/* Each rank (including Rank 0) sends its number of rows to Rank 0 so Rank 0
* can tell how many pixels to receive */
MPI_Send(&myNRows, 1, MPI_INT, 0, OTHER_N_ROWS_TAG, MPI_COMM_WORLD);
printf("test \n");
/* Each rank (including Rank 0) sends its pixels array to Rank 0 so Rank 0
* can print it */
MPI_Send(&pixels, sizeof(int)*myNRows * nCols, MPI_BYTE, 0, OTHER_PIXELS_TAG,
MPI_COMM_WORLD);
printf("enter ranking 0 \n");
/*_________________________________________________________________________________*/