C++: Bus error in MPI_Finalize
I am writing an MPI program for a parallel computing class. I have the code working and it outputs the correct result, but when I call MPI_Finalize with more than one process, I get a bus error. I am running it on OS X through the PTP environment in Eclipse. The error is as follows:
[Fruity:49034] *** Process received signal ***
[Fruity:49034] Signal: Bus error (10)
[Fruity:49034] Signal code: (2)
[Fruity:49034] Failing at address: 0x100336d7e
[Fruity:49034] [ 0] 2 libSystem.B.dylib 0x00007fff865cc1ba _sigtramp + 26
[Fruity:49034] [ 1] 3 ??? 0x0000000000000000 0x0 + 0
[Fruity:49034] [ 2] 4 libSystem.B.dylib 0x00007fff86570c27 tiny_malloc_from_free_list + 1196
[Fruity:49034] [ 3] 5 libSystem.B.dylib 0x00007fff8656fabd szone_malloc_should_clear + 242
[Fruity:49034] [ 4] 6 libopen-pal.0.dylib 0x0000000100187b9f opal_memory_base_open + 527
[Fruity:49034] [ 5] 7 libSystem.B.dylib 0x00007fff8656f98a malloc_zone_malloc + 82
[Fruity:49034] [ 6] 8 libSystem.B.dylib 0x00007fff8656dc88 malloc + 44
[Fruity:49034] [ 7] 9 libSystem.B.dylib 0x00007fff8657846d asprintf + 157
[Fruity:49034] [ 8] 10 libopen-rte.0.dylib 0x000000010013aebc orte_schema_base_get_job_segment_name + 108
[Fruity:49034] [ 9] 11 libopen-rte.0.dylib 0x000000010013d899 orte_smr_base_set_proc_state + 57
[Fruity:49034] [10] 12 libmpi.0.dylib 0x0000000100063758 ompi_mpi_finalize + 312
[Fruity:49034] [11] 13 Assignment31 0x0000000100002642 main + 491
[Fruity:49034] [12] 14 Assignment31 0x0000000100001688 start + 52
[Fruity:49034] *** End of error message ***
mpirun noticed that job rank 0 with PID 49033 on node Fruity.local exited on signal 15 (Terminated).
1 additional process aborted (not shown)
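Here is the main function of my code. I'm sure there are some bad C++ practices in here (I haven't used the language in years, and I was self-taught), but it does output the correct values. I can post the rest of the file if needed. I just didn't want to turn this into a huge question if something is obviously wrong.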
int main(int argc, char* argv[]){
    /* start up MPI */
    MPI_Init(&argc, &argv);
    /* find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    /* find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    /* find which nodes this processor is responsible for */
    findStartAndEndPositions();
    /* Initialize the array to its starting values. */
    initializeArray();
    /* Find the elements that are dependent on outside processors */
    findDependentElements();
    MPI_Barrier(MPI_COMM_WORLD);
    if(myRank == 0){
        startTime = MPI_Wtime();
        printArray();
    }
    int iter;
    for(iter = 0; iter < NUM_ITERATIONS; iter++){
        doCommunication();
        MPI_Barrier(MPI_COMM_WORLD);
        doIteration();
    }
    double check = computeCheck();
    double receive = 0;
    if(myRank == 0){
        MPI_Reduce(&check, &receive, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        std::cout << "The total time was: " << MPI_Wtime() - startTime << " \n";
        std::cout << "The checksum was: " << receive << " \n";
        printArray();
    }
    else{
        MPI_Reduce(&check, &receive, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    }
    /* shut down MPI */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
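I'm not sure whether this is the cause, but MPI_Reduce is a collective call that every rank makes with the same arguments, so it normally appears once; there is no need to write it in both branches. Try this and see if it helps: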
MPI_Reduce(&check, &receive, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
if(myRank == 0){
    std::cout << "The total time was: " << MPI_Wtime() - startTime << " \n";
    std::cout << "The checksum was: " << receive << " \n";
    printArray();
}
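Nope. Thanks for the idea, though.
I'm afraid you will have to post the whole program. Otherwise we can't see the whole picture.
I just added a function that the error seems to be caused by.
Your temp is indexed out of bounds unless your start == 0. It should be temp[pos - start].
That explains why it was failing on the non-zero-rank processors. Thanks!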