C: Why are the execution times of the parallel and serial versions almost the same?
Tags: c, parallel-processing, mpi, cluster-computing
Below is an example in C that compares the execution time of a serial version and a parallel version. I measured the execution time of both, but the two times are roughly the same. Is something wrong with my code, or with the way I measure the time? My code is listed below. The results are:
# serial
PI = 3.142431
Serial execution time: 262699 microseconds
# parallel
MPI task 1 has started...
MPI task 0 has started...
MPI task 3 has started...
MPI task 2 has started...
Real value of PI: 3.1415926535897
Parallel execution time: 294984 microseconds
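Where is the parallelization? The serial version computes pi in 5000000 iterations. In the parallel version, every task also performs 50000*100 iterations, and the results are then averaged. So the parallel version may be "statistically more accurate", but it will not be faster.
Also, you issue 500 MPI_Reduce() calls when I believe only one is needed.
In short, I am even surprised that the "parallel" version is not much slower.
If you want the program to run faster through parallelization, each task should compute 5000000/numtasks iterations, starting from iteration 5000000*taskid/numtasks, and then you should issue a single MPI_Reduce().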
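The split described above can be sketched in plain C. This is only an illustration under stated assumptions, not the asker's code: the helper names `darts_for_task`, `next_u32`, `count_hits`, and `estimate_pi_split` are hypothetical, a small xorshift generator stands in for random(), and an ordinary loop over task IDs stands in for the separate MPI processes. In a real MPI program each rank would run `count_hits` on its own share only, and the sum over ranks would come from one MPI_Reduce() call at the end.

```c
#include <stdint.h>

/* Share of `total` throws given to one task. The first (total % numtasks)
 * tasks take one extra throw so the shares add up to exactly `total`. */
long darts_for_task(long total, int taskid, int numtasks)
{
    long base = total / numtasks;
    long extra = total % numtasks;
    return base + (taskid < extra ? 1 : 0);
}

/* Hypothetical stand-in for random(): a 32-bit xorshift generator.
 * The state must never be zero. */
uint32_t next_u32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Throw `darts` darts at the square [-1,1] x [-1,1] and count the ones
 * that land inside the unit circle. */
long count_hits(long darts, uint32_t seed)
{
    uint32_t state = seed;
    long hits = 0;
    for (long n = 0; n < darts; n++) {
        double x = 2.0 * (next_u32(&state) / 4294967296.0) - 1.0;
        double y = 2.0 * (next_u32(&state) / 4294967296.0) - 1.0;
        if (x * x + y * y <= 1.0)
            hits++;
    }
    return hits;
}

/* Simulate `numtasks` tasks that together throw `total` darts.
 * The loop stands in for separate MPI ranks; the running sum stands in
 * for a single MPI_Reduce() with MPI_SUM (an assumption of this sketch). */
double estimate_pi_split(long total, int numtasks)
{
    long hits = 0;
    for (int taskid = 0; taskid < numtasks; taskid++) {
        long share = darts_for_task(total, taskid, numtasks);
        hits += count_hits(share, (uint32_t)taskid + 1u); /* nonzero seed */
    }
    return 4.0 * (double)hits / (double)total;
}
```

Calling `estimate_pi_split(5000000L, 4)` gives each of the four simulated tasks 1250000 throws, so the total work matches the serial run instead of being four times larger.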
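The random() function is prototyped in stdlib.h, so providing a prototype in the code is a bad idea.
The srandom() function is prototyped in stdlib.h, so providing a prototype in the code is a bad idea.
I would first ask why I expect the parallel version to be faster. Does parallelizing the computation let it use CPU resources that would otherwise sit idle? Then I would check whether the work is actually distributed across the CPU cores.
@user3629249 I am not familiar with C at the moment. Thanks for the suggestion, I will revise the code.
You could try computing pi quickly via 355.0/113.0, which is correct to several decimal places!!!
I just ran the parallel version from the example and did not notice this. Thanks for the advice about MPI_Reduce().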
/**********************************************************************
* FILE: mpi_pi_reduce.c
* OTHER FILES: dboard.c
* DESCRIPTION:
* MPI pi Calculation Example - C Version
* Collective Communication example:
* This program calculates pi using a "dartboard" algorithm. See
* Fox et al.(1988) Solving Problems on Concurrent Processors, vol.1
* page 207. All processes contribute to the calculation, with the
* master averaging the values for pi. This version uses mpc_reduce to
* collect results
* AUTHOR: Blaise Barney. Adapted from Ros Leibensperger, Cornell Theory
* Center. Converted to MPI: George L. Gusciora, MHPCC (1/95)
* LAST REVISED: 06/13/13 Blaise Barney
**********************************************************************/
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h> /* gettimeofday(), struct timeval */
void srandom (unsigned seed);
double dboard (int darts);
#define DARTS 50000 /* number of throws at dartboard */
#define ROUNDS 100 /* number of times "darts" is iterated */
#define MASTER 0 /* task ID of master task */
int main (int argc, char *argv[])
{
struct timeval tvalBefore, tvalAfter;
gettimeofday(&tvalBefore, NULL);
double homepi, /* value of pi calculated by current task */
pisum, /* sum of tasks' pi values */
pi, /* average of pi after "darts" is thrown */
avepi; /* average pi value for all iterations */
int taskid, /* task ID - also used as seed number */
numtasks, /* number of tasks */
rc, /* return code */
i;
MPI_Status status;
/* Obtain number of tasks and task ID */
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
printf ("MPI task %d has started...\n", taskid);
/* Set seed for random number generator equal to task ID */
srandom (taskid);
avepi = 0;
for (i = 0; i < ROUNDS; i++) {
/* All tasks calculate pi using dartboard algorithm */
homepi = dboard(DARTS);
/* Use MPI_Reduce to sum values of homepi across all tasks
* Master will store the accumulated value in pisum
* - homepi is the send buffer
* - pisum is the receive buffer (used by the receiving task only)
* - the size of the message is sizeof(double)
* - MASTER is the task that will receive the result of the reduction
* operation
* - MPI_SUM is a pre-defined reduction function (double-precision
* floating-point vector addition). Must be declared extern.
* - MPI_COMM_WORLD is the group of tasks that will participate.
*/
rc = MPI_Reduce(&homepi, &pisum, 1, MPI_DOUBLE, MPI_SUM,
MASTER, MPI_COMM_WORLD);
/* Master computes average for this iteration and all iterations */
if (taskid == MASTER) {
pi = pisum/numtasks;
avepi = ((avepi * i) + pi)/(i + 1);
//printf(" After %8d throws, average value of pi = %10.8f\n", (DARTS * (i + 1)),avepi);
}
}
if (taskid == MASTER) {
gettimeofday(&tvalAfter, NULL);
long tm = (tvalAfter.tv_sec - tvalBefore.tv_sec) * 1000000L + tvalAfter.tv_usec - tvalBefore.tv_usec;
printf("\nReal value of PI: 3.1415926535897 \n");
printf("Parallel execution time: %ld microseconds\n", tm);
}
MPI_Finalize();
return 0;
}
/**************************************************************************
* subroutine dboard
* DESCRIPTION:
* Used in pi calculation example codes.
* See mpi_pi_send.c and mpi_pi_reduce.c
* Throw darts at board. Done by generating random numbers
* between 0 and 1 and converting them to values for x and y
* coordinates and then testing to see if they "land" in
* the circle." If so, score is incremented. After throwing the
* specified number of darts, pi is calculated. The computed value
* of pi is returned as the value of this function, dboard.
*
* Explanation of constants and variables used in this function:
* darts = number of throws at dartboard
* score = number of darts that hit circle
* n = index variable
* r = random number scaled between 0 and 1
* x_coord = x coordinate, between -1 and 1
* x_sqr = square of x coordinate
* y_coord = y coordinate, between -1 and 1
* y_sqr = square of y coordinate
* pi = computed value of pi
****************************************************************************/
double dboard(int darts)
{
#define sqr(x) ((x)*(x))
long random(void);
double x_coord, y_coord, pi, r;
int score, n;
unsigned int cconst; /* must be 4-bytes in size */
/*************************************************************************
* The cconst variable must be 4 bytes. We check this and bail if it is
* not the right size
************************************************************************/
if (sizeof(cconst) != 4) {
printf("Wrong data size for cconst variable in dboard routine!\n");
printf("See comments in source file. Quitting.\n");
exit(1);
}
/* 2 bit shifted to MAX_RAND later used to scale random number between 0 and 1 */
cconst = 2u << (31 - 1); /* unsigned literal avoids signed overflow */
score = 0;
/* "throw darts at board" */
for (n = 1; n <= darts; n++) {
/* generate random numbers for x and y coordinates */
r = (double)random()/cconst;
x_coord = (2.0 * r) - 1.0;
r = (double)random()/cconst;
y_coord = (2.0 * r) - 1.0;
/* if dart lands in circle, increment score */
if ((sqr(x_coord) + sqr(y_coord)) <= 1.0)
score++;
}
/* calculate pi */
pi = 4.0 * (double)score/(double)darts;
return(pi);
}
mpicc serial.c -o serial.o
mpicc parallel.c -o parallel.o
mpirun -n 1 serial.o
mpirun -np 4 -pernode parallel.o