Subdividing a ScaLAPACK grid with MPI

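I am trying to use ScaLAPACK to compute the eigenspectra of a large number of large matrices. Rather than distributing each matrix across all 32 processes, I would prefer to distribute each matrix across 4 processes and compute 8 matrices in parallel. I know how to subdivide an MPI grid with MPI_Comm_split, but ScaLAPACK does not seem to accept custom communicators. Instead, it seems to use a BLACS grid rooted in PVM.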

How can I implement this subdivision in ScaLAPACK?

This is done through BLACS and its grid setup.

The relevant routines are BLACS_GRIDINIT and BLACS_GRIDMAP.

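The documentation for these routines states:

These routines take the available processes, and assign, or map, them into a BLACS process grid.

Each BLACS grid is contained in a context (its own message passing universe), so that it does not interfere with distributed operations that occur within other grids/contexts.

These grid creation routines may be called repeatedly in order to define additional contexts/grids.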

This means that you can create 8 different grids and pass each grid's ICONTXT to the ScaLAPACK routines for the corresponding matrix.

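Both routines take an in/out argument:

ICONTXT (input/output) INTEGER

On input, an integer handle indicating the system context to be used in creating the BLACS context. The user may obtain a default system context via a call to BLACS_GET. On output, the integer handle to the created BLACS context.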
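A minimal sketch of this in/out behavior, assuming an MPI-based BLACS is linked and the program is started with at least 4 processes. The variable names (`sys_ctxt`, `grid_ctxt`) are illustrative and do not come from the documentation. Because the handle is overwritten on output, the sketch passes a copy of the system context, so the original stays available for creating further grids.

```fortran
program icontxt_inout
    implicit none
    integer :: me, nprocs, sys_ctxt, grid_ctxt
    integer :: nprow, npcol, myrow, mycol

    ! returns this process's BLACS number and the total process count
    call BLACS_PINFO(me, nprocs)

    ! WHAT = 0 requests the default system context; the first argument is ignored
    call BLACS_GET(0, 0, sys_ctxt)

    ! work on a copy: BLACS_GRIDINIT overwrites its first argument
    grid_ctxt = sys_ctxt
    call BLACS_GRIDINIT(grid_ctxt, 'R', 2, 2)

    ! processes outside the 2x2 grid get -1 for all grid quantities
    call BLACS_GRIDINFO(grid_ctxt, nprow, npcol, myrow, mycol)
    if (myrow >= 0) then
        write (*,*) 'process', me, 'is at grid position', myrow, mycol
        call BLACS_GRIDEXIT(grid_ctxt)
    else
        write (*,*) 'process', me, 'is not part of the grid'
    endif

    ! 0 = also shut down the underlying message-passing layer
    call BLACS_EXIT(0)
end program icontxt_inout
```

After the call, `sys_ctxt` still holds the system context, while `grid_ctxt` holds the new grid's handle on the member processes.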

You can reuse these contexts recursively in the same way.

I implemented @ztik's suggestion, and this is what I came up with. It seems to work:

program main
    use mpi
    implicit none
    integer              :: ierr, me, nProcs, color, i,j,k, my_comm, dims(2), global_contxt
    integer              :: cnt, n_colors, map(2,2)
    integer, allocatable :: contxts(:)
    integer, parameter   :: group_size = 4

    call MPI_Init(ierr)

    call MPI_Comm_size(MPI_COMM_WORLD, nProcs, ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, me, ierr)

    color = get_color(group_size)
    n_colors =  nProcs / group_size
    allocate(contxts(n_colors))

    dims = calc_2d_dim(group_size)

    ! get the default system context; each grid is created from a copy of it
    call BLACS_GET(0, 0, global_contxt)
    if(me == 0) write (*,*) global_contxt
    contxts = global_contxt

    do k = 1,n_colors 
        ! shift for each context
        cnt = group_size * (k-1)
        if(me==0) write (*,*) "##############", cnt

        ! create map
        do i=1,2
            do j=1,2
                map(i,j) = cnt
                cnt = cnt + 1
            enddo
        enddo

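        ! called by every process; contxts(k) is overwritten with the new handle
        call BLACS_GRIDMAP(contxts(k), map, 2, 2, 2)
        ! print the handles in rank order (ranks run from 0 to nProcs-1)
        do i = 0, nProcs-1
            if(i == me) then
                write (*,*) me, contxts(k)
            endif
            call MPI_Barrier(MPI_COMM_WORLD, ierr)
        enddo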
    enddo

    call MPI_Finalize(ierr)
contains
    function get_color(group_size) result(color)
        implicit none
        integer, intent(in)     :: group_size
        integer                 :: me, nProcs, color, ierr, i, cnt

        call MPI_Comm_size(MPI_COMM_WORLD, nProcs, ierr)
        call MPI_Comm_rank(MPI_COMM_WORLD, me, ierr)

        if(mod(nProcs, group_size) /= 0) then
            write (*,*) "Nprocs not divisible by group_size", mod(nProcs, group_size)
            call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
        endif

        ! integer division: ranks 0..group_size-1 get color 0, the next group_size ranks get 1, ...
        color = me / group_size
    end function get_color

    function calc_2d_dim(sz) result(dim)
        implicit none
        integer, intent(in)    :: sz
        integer                :: dim(2), cand

        cand = nint(sqrt(real(sz)))

        do while(mod(sz, cand) /= 0)
            cand = cand - 1
        enddo
        dim(1) = sz/cand
        dim(2) = cand
    end function calc_2d_dim

end program main
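The program above stores a handle for every grid on every process, but a process only needs the context of the grid it belongs to. The following is a sketch of how that one context might be kept and then used to set up a ScaLAPACK array descriptor. The matrix size `n`, the block size `nb`, and all variable names are assumptions chosen for illustration, and BLACS_PINFO is used here in place of the explicit MPI calls.

```fortran
program per_group_context
    implicit none
    integer, parameter :: group_size = 4, n = 1000, nb = 64
    integer :: me, nprocs, n_groups, k, i, j, cnt
    integer :: sys_ctxt, ctxt, my_ctxt, map(2,2)
    integer :: nprow, npcol, myrow, mycol, loc_rows, loc_cols, lld
    integer :: desc(9), info
    integer, external :: NUMROC
    double precision, allocatable :: a(:,:)

    call BLACS_PINFO(me, nprocs)
    call BLACS_GET(0, 0, sys_ctxt)
    n_groups = nprocs / group_size
    my_ctxt = -1

    ! every process takes part in every BLACS_GRIDMAP call, as in the program above
    do k = 1, n_groups
        cnt = group_size * (k-1)
        do i = 1, 2
            do j = 1, 2
                map(i,j) = cnt
                cnt = cnt + 1
            enddo
        enddo
        ctxt = sys_ctxt
        call BLACS_GRIDMAP(ctxt, map, 2, 2, 2)
        ! keep only the context of the grid this process belongs to
        if (me / group_size == k-1) my_ctxt = ctxt
    enddo

    ! processes without a grid get -1 for all grid quantities
    call BLACS_GRIDINFO(my_ctxt, nprow, npcol, myrow, mycol)
    if (myrow >= 0) then
        ! local share of the n x n matrix in a block-cyclic layout
        loc_rows = NUMROC(n, nb, myrow, 0, nprow)
        loc_cols = NUMROC(n, nb, mycol, 0, npcol)
        lld = max(1, loc_rows)
        allocate(a(lld, max(1, loc_cols)))
        call DESCINIT(desc, n, n, nb, nb, 0, 0, my_ctxt, lld, info)
        if (info /= 0) write (*,*) 'DESCINIT failed on process', me, info
        ! fill a with this group's matrix here, then pass a and desc
        ! to the ScaLAPACK eigensolver of choice
        call BLACS_GRIDEXIT(my_ctxt)
    endif

    call BLACS_EXIT(0)
end program per_group_context
```

Each group of 4 processes then works on its own matrix independently, because the descriptor carries the group's context and ScaLAPACK communicates only within that context.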

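If I call BLACS_GRIDINIT(context, 'R', 2, 2), does it need to be called separately for each sub-communicator in MPI? If I make this call with 32 processes, the first 4 are fine and the higher-ranked processes get -1; -1 should mean that the first grid does not use those processes. On a second call to BLACS_GRIDINIT (given the initial context), the remaining processes should be assigned to the next grid.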