Fortran nested loops: how to run the outer loop in parallel and the inner loop sequentially

A simple example:

module parameters
  implicit none
  integer :: i,step
  integer :: nt=5
  integer :: nelectron=5
  integer :: num_threads=2
  real(8) :: vt=855555.0
  real(8) :: dt=1.d-5
  real(8) :: vx1_old,vy1_old,vz1_old,t1,t2,x_old,y_old
  real(8) :: x_length=0.0
  real(8) :: y_length=0.0
  real(8) :: vx1_new,vy1_new,vz1_new,vzstore,x_new,y_new
end module parameters
program main
  use parameters
  use omp_lib
  implicit none
  integer :: thread_num

  !$ call omp_set_num_threads(num_threads)
  !$ call omp_set_nested(.false.)

  call cpu_time(t1)

  !$omp parallel
  !$omp& default(private) shared(x_length,y_length)
  !$omp& schedule(static,chunk)
  !$omp& reduction(+:x_length,y_length)
  !$omp do

  do i=1,nelectron

     do step=1,nt

        if(step==1)then           
           vx1_new=1.0
           vy1_new=1.0
           vz1_new=1.0
           x_new=1.0
           y_new=1.0 
        endif

        thread_num=omp_get_thread_num()
        write(*,*)"thread_num",thread_num
        write(*,*)"i",i
        write(*,*)"step",step
        write(*,*) 

        vx1_old=vx1_new
        vy1_old=vy1_new
        vz1_old=vz1_new
        x_old=x_new
        y_old=y_new

        x_length=x_length+x_old
        y_length=y_length+y_old
     enddo       
  enddo
  !$omp end do
  !$omp end parallel
  call cpu_time(t2)
  write(*,*)"x length=",x_length
  write(*,*)"y length=",y_length 
end program main

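When I print which thread is doing the actual work, together with i and step, I see something strange:

As you can see, thread 0 is executing i=6, step=1, while thread 1 is executing i=6, step=2. Why does the thread change while the same i=6 is being processed? How can I avoid this? What I need is that, for each i, the inner step loop is completed on the same thread.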

In OpenMP, only the outermost loop is parallelized unless the collapse clause is used. This means that the whole of the inner loop:

 do step=1,nt

    if(step==1)then           
       vx1_new=1.0
       vy1_new=1.0
       vz1_new=1.0
       x_new=1.0
       y_new=1.0 
    endif

    thread_num=omp_get_thread_num()
    write(*,*)"thread_num",thread_num
    write(*,*)"i",i
    write(*,*)"step",step
    write(*,*) 

    vx1_old=vx1_new
    vy1_old=vy1_new
    vz1_old=vz1_new
    x_old=x_new
    y_old=y_new

    x_length=x_length+x_old
    y_length=y_length+y_old
 enddo

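is executed sequentially by a single thread, with i held constant. A thread gets a value of i and executes the entire block quoted above for that i, without any interaction with neighbouring threads.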
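One way to check this claim yourself is to record the thread number in an array instead of printing it from inside the parallel region. The following is only a minimal sketch, not the original poster's program: the program name, the `owner` array and the `same_thread` flag are hypothetical names introduced for illustration, and it assumes a compiler with OpenMP enabled (for example `gfortran -fopenmp`).

```fortran
program check_inner_loop_thread
  use omp_lib
  implicit none
  integer, parameter :: nelectron = 5, nt = 5
  integer :: owner(nt, nelectron)   ! thread id that ran each (step, i) pair
  integer :: i, step
  logical :: same_thread

  owner = -1

  ! Only the outer i loop is divided among the threads.
  !$omp parallel do default(none) shared(owner) private(i, step) num_threads(2)
  do i = 1, nelectron
     do step = 1, nt
        owner(step, i) = omp_get_thread_num()
     end do
  end do
  !$omp end parallel do

  ! Printing happens after the parallel region, so the order is deterministic.
  do i = 1, nelectron
     same_thread = all(owner(:, i) == owner(1, i))
     write(*,*) "i =", i, " thread =", owner(1, i), " single thread:", same_thread
  end do
end program check_inner_loop_thread
```

Because each thread writes only to its own columns of `owner`, there is no race, and every row of the report should show `T` in the last column. Which thread number is reported for a given i can still differ between runs and schedules.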

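However, as I pointed out in the comments, the syntax of your OpenMP directives is wrong. In this specific case it causes a race condition instead of a correct reduction of x_length and y_length. It does not cause the problem you suspect.

You should do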

  !$omp parallel &
  !$omp& default(private)  &
  !$omp& shared(x_length,y_length)

  !$omp do schedule(static,5) reduction(+:x_length,y_length)

or simply avoid the complicated line continuation

  !$omp parallel default(private) shared(x_length,y_length)

  !$omp do schedule(static,5) reduction(+:x_length,y_length)
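For context, here is a reduced, self-contained sketch of how these directives could sit in a complete program. It is an illustration under stated assumptions, not the original poster's code: the velocity variables and the thread printout are dropped, `nelectron` and `nt` are turned into named constants, and `default(none)` with an explicit `private` list is swapped in for `default(private)`. With `default(private)`, module variables used inside the region (such as the loop bounds in the question) would become uninitialized private copies unless they are listed as shared or firstprivate.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: nelectron = 5, nt = 5   ! named constants need no clause
  integer :: i, step
  real(8) :: x_new, y_new, x_old, y_old
  real(8) :: x_length, y_length

  x_length = 0.0d0
  y_length = 0.0d0

  ! The reduction variables must be shared in the enclosing parallel region.
  !$omp parallel default(none) private(i, step, x_new, y_new, x_old, y_old) shared(x_length, y_length)
  !$omp do schedule(static) reduction(+:x_length, y_length)
  do i = 1, nelectron
     do step = 1, nt
        if (step == 1) then
           x_new = 1.0d0
           y_new = 1.0d0
        end if
        x_old = x_new
        y_old = y_new
        ! Each thread accumulates into its own copy; copies are summed at the end.
        x_length = x_length + x_old
        y_length = y_length + y_old
     end do
  end do
  !$omp end do
  !$omp end parallel

  ! Every (i, step) pair adds 1.0, so both sums equal nelectron*nt = 25.
  write(*,*) "x length=", x_length
  write(*,*) "y length=", y_length
end program reduction_sketch
```

Keeping the whole `parallel` directive on one line follows the advice above and avoids the continuation syntax that caused the problem in the question.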

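As I commented above, do not trust the order of the output of a parallel program unless you have dealt with the ordering in some way, and never use cpu_time to time a parallel program.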
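To illustrate the timing point, the sketch below measures the same parallel loop with both `cpu_time` and the OpenMP wall-clock routine `omp_get_wtime`. The program name, the loop body and the problem size `n` are arbitrary choices made for this illustration, and it assumes compilation with OpenMP enabled (for example `gfortran -fopenmp`).

```fortran
program timing_sketch
  use omp_lib
  implicit none
  integer, parameter :: n = 20000000   ! arbitrary size, large enough to time
  integer :: i
  real(8) :: total, wall_start, wall_end
  real    :: cpu_start, cpu_end

  total = 0.0d0
  call cpu_time(cpu_start)
  wall_start = omp_get_wtime()

  !$omp parallel do default(none) private(i) reduction(+:total)
  do i = 1, n
     total = total + sqrt(real(i, 8))
  end do
  !$omp end parallel do

  wall_end = omp_get_wtime()
  call cpu_time(cpu_end)

  write(*,*) "sum       =", total
  write(*,*) "wall time =", wall_end - wall_start, " s"
  write(*,*) "cpu_time  =", cpu_end - cpu_start, " s (often summed over all threads)"
end program timing_sketch
```

With several threads, the `cpu_time` difference typically does not shrink and may even grow, because many implementations report processor time accumulated over all threads, while the wall-clock figure reflects the elapsed time you actually wait. The standard `system_clock` intrinsic is another option for wall-clock timing.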

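Never use cpu_time in parallel programs, and never trust the output ordering of a parallel program. Those are two rules you must follow. Also, I get this warning from gfortran, don't you? nested_loop.f90:27.8: Warning: !$OMP at (1) starts a commented line as it neither is followed by a space nor is a continuation line

@VladimirF So is the step loop for each i completed on the same thread? That is very important in my case.

Please see the answer and its edits.

Thank you very much, I see.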