Fortran application throws a segmentation fault after some time when calling a procedure (containing local coarrays) in a loop

I am trying to call a subroutine in a loop. This subroutine has local coarrays. Here is the code I am using:

! Test local coarray in procedure called in a loop.
!
program main
    use, intrinsic :: iso_fortran_env, only : input_unit, output_unit, error_unit

    implicit none

    ! Variable declaration.
    integer :: me, ti
    integer :: GHOST_WIDTH, TSTART, TSTEPS

    sync all

    ! Initialize.
    GHOST_WIDTH = 1
    TSTART = 0
    TSTEPS = 100000
    me = this_image()

    ! Iterate.
    do ti = TSTART + 1, TSTART + TSTEPS
        call Aldeal( GHOST_WIDTH )
        if ( me == 1 ) write( output_unit, * ) ti
    end do

    if ( me == 1 ) write( output_unit, * ) "All done!"

    contains
        subroutine Aldeal( width )
            integer, intent(in) :: width

            integer, allocatable, codimension[:] :: shell1_Co, shell2_Co, shell3_Co

            allocate( shell1_Co[*], shell2_Co[*], shell3_Co[*] )

            deallocate( shell1_Co, shell2_Co, shell3_Co )

            return
        end subroutine Aldeal
end program main

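For now, the subroutine does nothing except allocate the local coarrays and deallocate them. Even so, after some number of iterations the program throws the following error: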

forrtl: severe (174): SIGSEGV, segmentation fault occurred
In coarray image 1
Image              PC                Routine            Line        Source             
coarray_main       0000000000406063  Unknown               Unknown  Unknown
libpthread-2.17.s  00007F21D8B845F0  Unknown               Unknown  Unknown
libicaf.so         00007F21D90970D5  for_rtl_ICAF_CO_D     Unknown  Unknown
coarray_main       0000000000405054  main_IP_aldeal_            37  coarray_main.f90
coarray_main       0000000000404AEC  MAIN__                     23  coarray_main.f90
coarray_main       0000000000404A22  Unknown               Unknown  Unknown
libc-2.17.so       00007F21D85C5505  __libc_start_main     Unknown  Unknown
coarray_main       0000000000404929  Unknown               Unknown  Unknown

Abort(0) on node 0 (rank 0 in comm 496): application called MPI_Abort(comm=0x84000003, 0) - process 0

The same error is repeated for the other images.

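Line 23 is the call Aldeal( GHOST_WIDTH ) inside the do loop of the main program. Line 37 corresponds to the deallocate( shell1_Co, shell2_Co, shell3_Co ) statement in the subroutine.

Additionally, if I remove the deallocate statement from the subroutine, it throws the same error, but this time the line numbers in the error message are 23 and 39. Line 39 corresponds to the end subroutine Aldeal statement.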

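I cannot work out what exactly I am doing wrong. Please help.

P.S. I am using CentOS 7 and Intel(R) Parallel Studio XE 2019 Update 4 for Linux.

Observations:

If I change the code to use a derived type with an allocatable component, and create the coarray in the subroutine from that derived type, the code runs longer but eventually aborts with the error. The modified code is: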

module mod_coarray_error
    implicit none

    type :: int_t
        integer, allocatable, dimension(:) :: var
    end type int_t

    contains
        subroutine Aldeal_type( width )
            integer, intent(in) :: width

            type(int_t), allocatable, codimension[:] :: int_t_Co

            allocate( int_t_Co[*] )

            allocate( int_t_Co%var(width) )
            sync all

            ! deallocate( int_t_Co%var )
            deallocate( int_t_Co )

            return
        end subroutine Aldeal_type
end module mod_coarray_error


program main
    use, intrinsic :: iso_fortran_env, only : input_unit, output_unit, error_unit
    use :: mod_coarray_error

    implicit none

    ! Variable declaration.
    integer :: me, ti
    integer :: GHOST_WIDTH, TSTART, TSTEPS, SAVET

    sync all

    ! Initialize.
    GHOST_WIDTH = 3
    TSTART = 0
    TSTEPS = 100000
    SAVET = 1000
    me = this_image()

    ! Iterate.
    do ti = TSTART + 1, TSTART + TSTEPS
        sync all
        call Aldeal_type( GHOST_WIDTH )
        if ( mod( ti, SAVET ) == 0 ) then
            if ( me == 1 ) write( output_unit, * ) ti
        end if
    end do

    sync all

    if ( me == 1 ) write( output_unit, * ) "All done!"
end program main

Also, when compiled on Windows, this code has always run fine.

Now, if I add the compiler option -heap-arrays 0, the code seems to run to the end even on Linux.

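I tried increasing the loop count in the code to 1e7. Even then, it still ran successfully to the end. But I observed the following effects:

The code slows down as the loop count increases, i.e. the run from ti=1e6 to ti=2e6 takes longer than the run from ti=1 to ti=1e6.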

The memory used by the program keeps growing: each image consumes 2GB at the start of the run, 3.5GB at ti=2e6, 4.7GB at ti=4e6, and 6GB at ti=6e6.

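When run on Windows, the program uses comparatively less memory, but the memory still keeps growing with the loop count. For example, each image consumes 100MB at the start, 1.5GB at ti=2e6, 2.5GB at ti=4e6, and 3.5GB at ti=6e6.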

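On Windows, using the compiler option /heap-arrays0 has no effect either on whether the program runs (it already ran successfully without it) or on the amount of memory consumed at run time.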

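The original code posted in the question still throws the error even when compiled with the above compiler option. It does not seem to run on Windows either.

Ultimately, I am still confused about what is going on.

P.S. I have posted this question on the Intel forum, but have not received any reply yet.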

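In this case you are probably best off contacting your compiler vendor for assistance. I'd suggest posting on the Intel forum or using your support contract. (A beta of the next release also fails.)

Ran successfully to the end with GNU Fortran 9.2 + OpenCoarrays 2.7.1 + OpenMPI 4.0.1.

@francescalus, thanks. I posted this question on the Intel forum earlier, but the post has not been published yet. So I thought I would post it here in the hope of getting some help.

@jacob, right. It runs fine when compiled with gfortran. But my problem is the lack of OpenCoarrays support on CentOS, which is why I have been using the Intel Fortran compiler.

As a workaround (if suitable), you could give the local coarray variables the SAVE attribute and do whatever bookkeeping is necessary.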
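The SAVE workaround mentioned in the comments might look roughly like the sketch below. It is an untested illustration, not a confirmed fix for the Intel runtime issue: the subroutine name Aldeal_saved and the reset-to-zero bookkeeping are assumptions introduced here. The coarrays are allocated once, on the first call, and then reused, so the allocate/deallocate pair is no longer executed on every iteration.

```fortran
! Sketch of the SAVE workaround: allocate the local coarrays once and reuse them.
program main
    use, intrinsic :: iso_fortran_env, only : output_unit

    implicit none

    integer :: me, ti
    integer, parameter :: TSTEPS = 100000

    me = this_image()

    do ti = 1, TSTEPS
        call Aldeal_saved()
    end do

    sync all

    if ( me == 1 ) write( output_unit, * ) "All done!"

    contains
        ! Hypothetical replacement for Aldeal; the name is illustrative only.
        subroutine Aldeal_saved()
            ! SAVE keeps the coarrays allocated between calls.
            integer, allocatable, save, codimension[:] :: shell1_Co, shell2_Co, shell3_Co

            ! Coarray allocation is collective, so every image must take
            ! this branch on the same call (here: the first one).
            if ( .not. allocated( shell1_Co ) ) then
                allocate( shell1_Co[*], shell2_Co[*], shell3_Co[*] )
            end if

            ! Bookkeeping: reset the local values so that each call starts
            ! from a known state, as a freshly allocated coarray would need.
            shell1_Co = 0
            shell2_Co = 0
            shell3_Co = 0
        end subroutine Aldeal_saved
end program main
```

The trade-off is that the coarrays stay allocated until the program ends. If their shape had to change between calls, an explicit collective deallocate and reallocate would still be needed at that point.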