Multithreading: OpenMP and MPI hybrid dynamic scheduling

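As the number of threads increases, the value of temp decreases. When I set the number of threads to 1, it gives the correct answer, but as the number of threads increases, the run time gets shorter and the answer is wrong.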

#include <stdio.h>
#include <stdlib.h>   // needed for atoi
#include <mpi.h>
#include <complex.h>
#include <time.h>
#include <omp.h>

#define MAXITERS 1000

// globals
int count = 0;
int nptsside;
float side2;
float side4;
int temp = 0;

int inset(double complex c) {
   int iters;
   float rl,im;
   double complex z = c;
   for (iters = 0; iters < MAXITERS; iters++) { 
      z = z*z + c;
      rl = creal(z);
      im = cimag(z);
      if (rl*rl + im*im > 4) return 0;
   }
   return 1;
}

int main(int argc, char **argv)
{
   nptsside = atoi(argv[1]);
   side2 = nptsside / 2.0;
   side4 = nptsside / 4.0;

   //struct timespec bgn,nd;
   //clock_gettime(CLOCK_REALTIME, &bgn);

   int x,y; float xv,yv;
  double complex z;
  int i;
  int mystart, myend;
  int nrows;
  int nprocs, mype;
  int data;


  MPI_Status status;
  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &mype);
  nrows = nptsside/nprocs;
  printf("%d\n", nprocs);

  mystart = mype*nrows;
  myend = mystart + nrows - 1;


  #pragma omp parallel shared(mystart, myend, temp)
  {
  int nth = omp_get_num_threads();
  printf("%d\n", nth);
  #ifdef STATIC
  #pragma omp for reduction(+:temp) schedule(static)
  #elif defined DYNAMIC
  #pragma omp for reduction(+:temp) schedule(dynamic)
  #elif defined GUIDED
  #pragma omp for reduction(+:temp) schedule(guided)
  #endif
  for (x=mystart; x<=myend; x++) {  

     for ( y=0; y<nptsside; y++)  {
        xv = (x - side2) / side4;
        yv = (y - side2) / side4;
        z = xv + yv*I;
        if (inset(z)) {
           temp++;
        }
     }
  }
  }


  if(mype==0) {
     count += temp;
     printf("%d\n", temp);

     for (i = 1; i < nprocs; i++) {
        MPI_Recv(&temp, 1, MPI_INT, i, 0, MPI_COMM_WORLD, &status);
        count += temp;
        printf("%d\n", temp);
     }
  }
  else {
     MPI_Send(&temp, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
  }



  MPI_Finalize();

  if(mype==0) {
  printf("%d\n", count);
  }

   //clock_gettime(CLOCK_REALTIME, &nd);
   //printf("%f\n",timediff(bgn,nd));
}

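You are not declaring any private variables when you enter the OpenMP loop.

First, you should always declare as private the loop counter of the OpenMP loop and the counters of any loops nested inside it. (OpenMP already makes the counter of the loop that `omp for` applies to private, but in C the counter of an inner loop such as y stays shared unless you declare it private.)

Second, there are three variables, xv, yv, and z, whose values depend on the current iteration of these loops. Each thread therefore needs its own private copy of them as well; otherwise threads overwrite each other's values mid-iteration. Change your parallel statement to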

#pragma omp parallel shared(mystart, myend, temp) private(x, y, xv, yv, z)

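That should fix your OpenMP problem.

Since you said that setting the number of threads to 1 gives the correct answer, I have not looked at your MPI code.

Edit: OK, I lied; I have now briefly looked at your MPI code. Instead of all the sends and receives, you should write a reduce. This collective will generally be faster than the blocking communication you have set up now.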

MPI_Reduce(&temp, &count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

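What is the question?

The question is why it gives the wrong answer when the number of threads increases. For example, with 1 thread it gives 1000, but when the number of threads increases it gives 200 or 300, when the correct count in temp is 1000.

Thanks a lot. I have one more question: why does adding more threads cut the run time roughly in half, while adding more processes does not reduce the run time the way OpenMP does? Is there a reason?

Maybe the overhead of starting the processes is not worth it because you are doing too little work? Or maybe the communication you do at the end takes more time than you think? I can't say for sure.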