Fortran MPI_Win_create basic RMA problem: Null pointer in parameter Null base pointer is not valid when size is nonzero


fortran mpi


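I am using Fortran 90 and the latest stable version of MPICH (3.3).

I want an MPI window that exposes an array on the root process, with every other process in the communicator calling MPI_Get to copy that array into its own "local" copy.

Unfortunately, supplying MPI_BOTTOM as the "base" argument to MPI_Win_create(base, ...) on the non-root processes results in this error: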

MPI_Win_create(192): MPI_Win_create(base=(nil), size=0, disp_unit=1275069467, MPI_INFO_NULL, MPI_COMM_WORLD, win=0x7ffcb343d9fc) failed

MPI_Win_create(156): Null pointer in parameter Null base pointer is not valid when size is nonzero

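I have been working from a textbook example: Figure 3.2 on page 61 of Using Advanced MPI: Modern Features of the Message-Passing Interface, by Gropp, Hoefler, Thakur, and Lusk.

What should I use instead of MPI_BOTTOM, something of kind(MPI_ADDRESS_KIND)? Is this the right way to initialize an MPI window on a process that does not actually expose any of its own memory and only accesses another process's memory?

Obviously, changing the base argument to an allocated (non-null) array works, but that changes the behaviour of the later Get so that it no longer works (it produces invalid memory accesses).

I don't understand why the runtime error specifically says that a null base pointer is invalid for a nonzero size, since I explicitly specify a size of 0 in the call mpi_win_create(MPI_BOTTOM, 0, MPI_INTEGER, ...).

Below is all the code I wrote for this example. It sets up the buffers and tries to create the window on each process. Between the two calls to MPI_Win_fence there is a commented-out section, which is where all non-root processes would perform the Get.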

program main
  use mpi

  implicit none
  integer :: ierr, procno, nprocs, comm

  integer, allocatable :: root_data(:), local_data(:)
  integer, parameter :: root = 0, NUM_ELEMENTS = 10

  integer :: win

  integer :: i

  !======================================

  call mpi_init(ierr)
  comm = mpi_comm_world
  call mpi_comm_rank(comm, procno, ierr)
  call mpi_comm_size(comm, nprocs, ierr)

  !======================================
  if (procno .eq. root) then
    allocate(root_data(1:NUM_ELEMENTS))
    do i=1,NUM_ELEMENTS
      root_data(i) = i
    enddo

    call mpi_win_create(root_data, NUM_ELEMENTS, MPI_INTEGER, &
                        MPI_INFO_NULL, comm, win, ierr)
  else
    allocate(local_data(1:NUM_ELEMENTS))
    local_data = 0
    call mpi_win_create(MPI_BOTTOM, 0, MPI_INTEGER, &
                        MPI_INFO_NULL, comm, win, ierr)
  endif

  !======================================
  call mpi_win_fence(0, win, ierr)
 
  !if (procno .ne. root) then 
  !  call mpi_get(local_data, NUM_ELEMENTS, MPI_INTEGER, &
  !               root, 0, NUM_ELEMENTS, MPI_INTEGER, &
  !               win, ierr)
  !endif

  call mpi_win_fence(0, win, ierr)
  !======================================

  if (procno .ne. root) then
    print *, "proc", procno
    print *, local_data
  endif

  !======================================
  call MPI_Win_free(win, ierr)

  call mpi_finalize(ierr)
end program main

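The expected result is that each process prints its version of local_data, which in this case should be 10 zeros, since the MPI_Get is commented out.

Instead, I get the runtime error.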

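The size argument of MPI_Win_create() has type INTEGER(KIND=MPI_ADDRESS_KIND). A default-kind literal 0 is usually narrower than that, so the library may read a nonzero value when it interprets the argument as an address-sized integer, which would explain the error message.

With the size passed as an address-kind integer, I was able to run the modified version below successfully with MPICH 3.3 and the latest Open MPI.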

program main
  use mpi

  implicit none
  integer :: ierr, procno, nprocs, comm

  integer, allocatable :: root_data(:), local_data(:)
  integer, parameter :: root = 0
  integer (KIND=MPI_ADDRESS_KIND) :: NUM_ELEMENTS = 10, zero = 0

  integer :: win

  integer :: i

  !======================================

  call mpi_init(ierr)
  comm = mpi_comm_world
  call mpi_comm_rank(comm, procno, ierr)
  call mpi_comm_size(comm, nprocs, ierr)

  !======================================
  if (procno .eq. root) then
    allocate(root_data(1:NUM_ELEMENTS))
    do i=1,NUM_ELEMENTS
      root_data(i) = i
    enddo

    call mpi_win_create(root_data, NUM_ELEMENTS, MPI_INTEGER, MPI_INFO_NULL, comm, win, ierr)
  else
    allocate(local_data(1:NUM_ELEMENTS))
    local_data = 0
    call mpi_win_create(MPI_BOTTOM, zero, MPI_INTEGER, MPI_INFO_NULL, comm, win, ierr)
  endif

  !======================================
  call mpi_win_fence(0, win, ierr)

  !if (procno .ne. root) then 
  !  call mpi_get(local_data, NUM_ELEMENTS, MPI_INTEGER, &
  !               root, 0, NUM_ELEMENTS, MPI_INTEGER, &
  !               win, ierr)
  !endif

  call mpi_win_fence(0, win, ierr)
  !======================================

  if (procno .ne. root) then
    print *, "proc", procno
    print *, local_data
  endif

  !======================================
  call MPI_Win_free(win, ierr)

  call mpi_finalize(ierr)
end program main
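
A related detail that may be worth checking: as I read the MPI standard, the size argument of MPI_Win_create is measured in bytes, and disp_unit is the size in bytes of one displacement unit, not a datatype handle. The sketch below is only an illustration under that reading. The program name win_sizes and the variables int_bytes and win_bytes are my own names and do not come from the post.

```fortran
program win_sizes
  use mpi
  implicit none

  integer, parameter :: root = 0, num_elements = 10
  integer :: ierr, procno, comm, win, int_bytes, i
  ! Address-kind integer, as required for the size argument.
  integer(kind=MPI_ADDRESS_KIND) :: win_bytes
  integer, allocatable :: root_data(:)

  call mpi_init(ierr)
  comm = MPI_COMM_WORLD
  call mpi_comm_rank(comm, procno, ierr)

  ! Size in bytes of one default INTEGER; used here as disp_unit.
  call mpi_type_size(MPI_INTEGER, int_bytes, ierr)

  if (procno == root) then
    allocate(root_data(num_elements))
    root_data = [(i, i = 1, num_elements)]
    ! Window size in bytes, computed in the address kind.
    win_bytes = int(num_elements, MPI_ADDRESS_KIND) * int_bytes
    call mpi_win_create(root_data, win_bytes, int_bytes, &
                        MPI_INFO_NULL, comm, win, ierr)
  else
    ! Non-root ranks expose no memory: a zero-byte window.
    win_bytes = 0
    call mpi_win_create(MPI_BOTTOM, win_bytes, int_bytes, &
                        MPI_INFO_NULL, comm, win, ierr)
  end if

  ! Empty access epoch, as in the original example.
  call mpi_win_fence(0, win, ierr)
  call mpi_win_fence(0, win, ierr)

  call mpi_win_free(win, ierr)
  call mpi_finalize(ierr)
end program win_sizes
```

Keeping the size in a variable of the right kind, rather than passing a literal, avoids the kind mismatch from the question. With use mpi, the compiler may not report such a mismatch for routines that take choice buffers.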


Thanks! That makes sense. In the later call to MPI_Get (commented out in the OP), I changed the call to MPI_Get(..., root, MPI_BOTTOM, ...), and that produced the expected output of 1:NUM_ELEMENTS printed from the local_data(:) array on every process. But I don't really understand why that works. Is it because, in the shared memory region exposed on the root process, the array sits at the beginning of that region? Best, Sam

I guess you are referring to the MPI_Aint target_disp argument that follows int target_rank. The standard expects an offset there (relative to the start of the window), not an address. Strictly speaking, MPI_BOTTOM is an incorrect value here, but luckily it is equivalent to NULL (in the MPI library you are using).
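
To make the point about target_disp concrete, here is a minimal sketch of the Get with an explicit zero offset instead of MPI_BOTTOM. The subroutine name fetch_from_root and its argument list are hypothetical, chosen only for illustration. It assumes the window was already created collectively and that every rank, including the root, calls the routine, because MPI_Win_fence is collective.

```fortran
subroutine fetch_from_root(win, root, procno, local_data, n)
  use mpi
  implicit none

  integer, intent(in)    :: win, root, procno, n
  integer, intent(inout) :: local_data(n)

  ! target_disp is an offset from the start of the target window,
  ! counted in units of that window's disp_unit. It is not an address.
  integer(kind=MPI_ADDRESS_KIND) :: target_disp
  integer :: ierr

  target_disp = 0

  call mpi_win_fence(0, win, ierr)   ! open the access epoch

  if (procno /= root) then
    ! Read n integers starting at offset 0 of the root's window.
    call mpi_get(local_data, n, MPI_INTEGER, &
                 root, target_disp, n, MPI_INTEGER, &
                 win, ierr)
  end if

  call mpi_win_fence(0, win, ierr)   ! the Get is complete after this
end subroutine fetch_from_root
```

Note that the literal 0 in the commented-out MPI_Get of the original code is a default-kind integer, so it has the same kind problem as the size argument; an address-kind variable such as target_disp sidesteps that.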