Parallelizing loops with overlapping and backward array updates using OpenMP in Fortran?


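Below is a slightly modified code snippet from something I am writing for my project. I am running into a strange parallelization problem in the test1, test2 and test3 routines, where the numbers are sometimes wrong: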

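program vectorize_test
  implicit none

  integer, parameter :: N=6
  integer, parameter :: chunk_size=3
  integer, dimension(1:N) :: a,b,c

  ! driver statements (calls to the routines below and the printing) are not shown

contains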

  subroutine array_setup
    implicit none
    integer :: i

    do i=1,N
       a(i)=2*i
       b(i)=i*i
       c(i)=i*i-i+2
    end do

    return
  end subroutine array_setup

  subroutine test1
    implicit none
    integer :: i

    !$OMP parallel do private(i) shared(a,b,c) schedule(static,chunk_size)
    do i=2,N
       a(i-1)=b(i)
       c(i)=a(i)
    end do
    !$OMP end parallel do

    return
  end subroutine test1

  subroutine test2
    implicit none
    integer :: i

    !$OMP parallel do private(i) shared(a,b,c) schedule(static,chunk_size)
    do i=2,N
       a(i-1)=b(i)
       a(i)=c(i)
    end do
    !$OMP end parallel do

    return
  end subroutine test2

  subroutine test3
    implicit none
    integer :: i

    !$OMP parallel do private(i) shared(a,b,c) schedule(static,chunk_size)
    do i=2,N
       b(i)=a(i-1)
       a(i)=c(i)
    end do
    !$OMP end parallel do

    return
  end subroutine test3

end program vectorize_test

Below is a sample of the output with OMP_NUM_THREADS=1, which is correct (the columns are i, a(i), b(i), c(i)):

 after setup
           1           2           1           2
           2           4           4           4
           3           6           9           8
           4           8          16          14
           5          10          25          22
           6          12          36          32
 after test1
           1           4           1           2
           2           9           4           4
           3          16           9           6
           4          25          16           8
           5          36          25          10
           6          12          36          12
 after test2
           1           4           1           2
           2           9           4           4
           3          16           9           8
           4          25          16          14
           5          36          25          22
           6          32          36          32
 after test3
           1           2           1           2
           2           4           2           4
           3           8           4           8
           4          14           8          14
           5          22          14          22
           6          32          22          32

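However, when I increase the number of threads above 1, strange numbers start to appear in each column and the output becomes incorrect. Where is my mistake, and what can I do to fix it?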

!$OMP parallel do private(i) shared(a,b,c) schedule(static,chunk_size)
do i=2,N
   a(i-1)=b(i)
   c(i)=a(i)
end do
!$OMP end parallel do

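Here one thread can read a(i) while the iteration that overwrites that same element (a(i-1)=b(i) for the next value of i) is scheduled on another thread, so the value it sees depends on which thread gets there first. The loop iterations depend on one another, so you cannot parallelize the loop this way. A thread can also read a(i) at the very moment another thread is writing to that location. That is an error as well (a race condition).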
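For this first loop, one possible way to remove the conflict is to split it into two passes: first save every old a(i) into c(i), then overwrite a. The sketch below is not part of the original code; the program name test1_split and the a_ref/c_ref reference arrays are made up for illustration, and the initial data simply copies array_setup. It assumes a compiler with OpenMP enabled (for example gfortran with -fopenmp).

```fortran
program test1_split
  implicit none
  integer, parameter :: N = 6
  integer :: a(N), b(N), c(N)
  integer :: a_ref(N), c_ref(N)   ! hypothetical reference arrays
  integer :: i

  ! Same initial data as array_setup in the question.
  do i = 1, N
     a(i) = 2*i
     b(i) = i*i
     c(i) = i*i - i + 2
  end do

  ! Serial reference: the original test1 loop body, run without OpenMP.
  a_ref = a
  c_ref = c
  do i = 2, N
     a_ref(i-1) = b(i)
     c_ref(i)   = a_ref(i)
  end do

  ! Pass 1: copy the old a(i) into c(i). Reads a, writes only c.
  !$OMP parallel do private(i) shared(a,c)
  do i = 2, N
     c(i) = a(i)
  end do
  !$OMP end parallel do

  ! Pass 2: every old a(i) is saved now, so a can be overwritten.
  ! Reads b, writes only a; no two iterations touch the same element.
  !$OMP parallel do private(i) shared(a,b)
  do i = 2, N
     a(i-1) = b(i)
  end do
  !$OMP end parallel do

  ! Stop with an error if the parallel result differs from the serial one.
  if (any(a /= a_ref) .or. any(c /= c_ref)) error stop "mismatch"
  print '(6(1x,i0))', a
  print '(6(1x,i0))', c
end program test1_split
```

The implicit barrier at the end of the first parallel do guarantees that all reads of a finish before the second pass starts writing to it.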

In the loop

!$OMP parallel do private(i) shared(a,b,c) schedule(static,chunk_size)
do i=2,N
   a(i-1)=b(i)
   a(i)=c(i)
end do
!$OMP end parallel do

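the iterations are not independent either. Note that most locations a(i) are overwritten in the next iteration: every a(i) written by a(i)=c(i) is replaced by a(i)=b(i+1), except the last one, a(N). Again, two threads can collide over the order in which these two assignments are executed. You can safely rewrite this as

!$OMP parallel do private(i) shared(a,b) schedule(static,chunk_size)
do i=2,N
   a(i-1)=b(i)
end do
!$OMP end parallel do
a(N) = c(N)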
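Because only the last write to each element of a survives the serial loop, the result can also be expressed with array sections. The following is a minimal sketch, not the author's code: the program name test2_workshare and the a_ref array are hypothetical, and it uses the OpenMP parallel workshare construct as one possible way to let the runtime divide the array assignment among threads.

```fortran
program test2_workshare
  implicit none
  integer, parameter :: N = 6
  integer :: a(N), b(N), c(N)
  integer :: a_ref(N)             ! hypothetical reference array
  integer :: i

  ! Same initial data as array_setup in the question.
  do i = 1, N
     a(i) = 2*i
     b(i) = i*i
     c(i) = i*i - i + 2
  end do

  ! Serial reference: the original test2 loop body, run without OpenMP.
  a_ref = a
  do i = 2, N
     a_ref(i-1) = b(i)
     a_ref(i)   = c(i)
  end do

  ! Surviving writes only: a(i) = b(i+1) for i < N, then a(N) = c(N).
  !$OMP parallel workshare
  a(1:N-1) = b(2:N)
  !$OMP end parallel workshare
  a(N) = c(N)

  ! Stop with an error if the result differs from the serial loop.
  if (any(a /= a_ref)) error stop "mismatch"
  print '(6(1x,i0))', a
end program test2_workshare
```

For an array this small the threading overhead will dominate; the point is only that the array-section form contains no ordering between elements for threads to disagree about.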

The third loop

!$OMP parallel do private(i) shared(a,b,c) schedule(static,chunk_size)
do i=2,N
   b(i)=a(i-1)
   a(i)=c(i)
end do
!$OMP end parallel do

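has the same kind of problem as the first loop: the iterations are not independent. Here each iteration reads a(i-1), which the previous iteration has just assigned. This is not easy to parallelize as written. You have to find a way to rewrite the algorithm so that the iterations do not depend on each other.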
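For this particular loop, one such rewrite appears to be available by substitution: the value that iteration i reads from a(i-1) is exactly c(i-1) for i >= 3, and the untouched a(1) for i = 2. The sketch below relies on that observation and on c not being modified inside the loop; the program name test3_rewrite and the a_ref/b_ref arrays are hypothetical.

```fortran
program test3_rewrite
  implicit none
  integer, parameter :: N = 6
  integer :: a(N), b(N), c(N)
  integer :: a_ref(N), b_ref(N)   ! hypothetical reference arrays
  integer :: i

  ! Same initial data as array_setup in the question.
  do i = 1, N
     a(i) = 2*i
     b(i) = i*i
     c(i) = i*i - i + 2
  end do

  ! Serial reference: the original test3 loop body, run without OpenMP.
  a_ref = a
  b_ref = b
  do i = 2, N
     b_ref(i) = a_ref(i-1)
     a_ref(i) = c(i)
  end do

  ! a(1) is never written by the loop, so b(2) can be set up front.
  b(2) = a(1)

  ! Each iteration now reads only c and writes only its own b(i), a(i).
  !$OMP parallel do private(i) shared(a,b,c)
  do i = 2, N
     if (i > 2) b(i) = c(i-1)   ! replaces the read of a(i-1)
     a(i) = c(i)
  end do
  !$OMP end parallel do

  ! Stop with an error if the result differs from the serial loop.
  if (any(a /= a_ref) .or. any(b /= b_ref)) error stop "mismatch"
  print '(6(1x,i0))', a
  print '(6(1x,i0))', b
end program test3_rewrite
```

This only works because the dependence here can be replaced by a value that is already known before the loop starts; a true recurrence, where each element is computed from the previous result, would need a different algorithm.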

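Note that the return statement at the end of each subroutine is not needed. If implicit none is present in the parent scope, you do not need it in each contained subroutine either.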
