Fortran MPI_COMM_SPAWN causes deadlock
I have an MPI program A that needs to spawn another MPI program B and then wait for it to finish. After that, I need to spawn program B again and wait for it a second time.
Program A
IF (rank .eq. 0) THEN
   CALL MPI_COMM_SPAWN('prog_b', MPI_ARGV_NULL, size, &
                     & MPI_INFO_NULL, 0, MPI_COMM_SELF, &
                     & child_comm, MPI_ERRCODES_IGNORE, status)
   WRITE (*,*) 'Parent 1 Before'
   CALL MPI_BARRIER(child_comm, status)
   WRITE (*,*) 'Parent 1 After'
   ... Change some things ...
   CALL MPI_COMM_SPAWN('prog_b', MPI_ARGV_NULL, size, &
                     & MPI_INFO_NULL, 0, MPI_COMM_SELF, &
                     & child_comm, MPI_ERRCODES_IGNORE, status)
   WRITE (*,*) 'Parent 2 Before'
   CALL MPI_BARRIER(child_comm, status)
   WRITE (*,*) 'Parent 2 After'
END IF
Program B
... Wait to finish ...
CALL MPI_COMM_GET_PARENT(parent_comm, error)
IF (parent_comm .ne. MPI_COMM_NULL) THEN
   WRITE (*,*) 'Before'
   CALL MPI_BARRIER(parent_comm, error)
   WRITE (*,*) 'After'
END IF
... Finalize ...
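When I run this, the first spawn of program B works fine. On the second spawn, however, both programs deadlock at the second barrier. I spawn 16 instances of program B each time.
Output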
Parent Before 1
... Output of program b ...
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
After
After
Before
After
Before
After
After
After
After
After
Before
After
After
Parent After 1
After
After
After
After
After
After
... Second call to spawn ...
Parent Before 2
... Output of program b ...
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
Before
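As you can see, every process gets past the first barrier, but the second time everything deadlocks. I tried disconnecting the parent and child communicators after the first spawn call. I also tried merging the parent and child communicators and calling the barrier on the merged communicator, but neither seems to resolve the deadlock.

Comments:
First, show how you free the communicators. Second, check the error returns. They are there for a reason!
Do you call MPI_Comm_disconnect() on the intercommunicator when it is no longer needed?
@RussF Checking the errors is pointless, because if an error occurs the program terminates before you get any chance to check.
@Gillesgouailardet Calling MPI_Comm_disconnect().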
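
The disconnect suggested in the comments could look roughly like the sketch below for program A. It is an illustration only, not a confirmed fix: the asker reports having already tried disconnecting, and their exact code is not shown. The names nprocs, round and ierr are introduced here for the example, and the two spawn rounds are folded into a loop with the "Change some things" step left out.

```fortran
! Sketch of program A: spawn prog_b twice and disconnect from the
! children after each round. Assumption: prog_b also calls
! MPI_COMM_DISCONNECT on its parent communicator after its barrier.
program prog_a
  use mpi
  implicit none
  integer, parameter :: nprocs = 16   ! children per spawn, as in the question
  integer :: rank, ierr, child_comm, round

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  if (rank == 0) then
    do round = 1, 2
      call MPI_COMM_SPAWN('prog_b', MPI_ARGV_NULL, nprocs, &
                          MPI_INFO_NULL, 0, MPI_COMM_SELF, &
                          child_comm, MPI_ERRCODES_IGNORE, ierr)
      write (*,*) 'Parent', round, 'Before'
      ! On an intercommunicator the barrier returns only after every
      ! process in the remote group (the children) has entered it.
      call MPI_BARRIER(child_comm, ierr)
      write (*,*) 'Parent', round, 'After'
      ! Collective over the intercommunicator: it completes only when
      ! the children call it as well, and sets child_comm to MPI_COMM_NULL.
      call MPI_COMM_DISCONNECT(child_comm, ierr)
    end do
  end if

  call MPI_FINALIZE(ierr)
end program prog_a
```

For this to work, program B would need the matching call, CALL MPI_COMM_DISCONNECT(parent_comm, error), placed after its barrier and before MPI_FINALIZE. Because the disconnect is collective, calling it on only one side would itself hang.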