gfortran for dummies: What does mcmodel=medium do exactly?

I have some code that gives me relocation errors when compiling. Below is an example that illustrates the problem:

      program main
      common/baz/a,b,c
      real a,b,c
      b = 0.0
      call foo()
      print*, b
      end

      subroutine foo()
      common/baz/a,b,c
      real a,b,c

      integer, parameter :: nx = 450
      integer, parameter :: ny = 144
      integer, parameter :: nz = 144
      integer, parameter :: nf = 23*3
      real :: bar(nf,nx*ny*nz)

      !real, allocatable,dimension(:,:) :: bar
      !allocate(bar(nf,nx*ny*nz))

      bar = 1.0
      b = bar(12,32*138*42)

      return
      end

When I compile this with

gfortran -O3 -g -o test test.f

I get the following error:

relocation truncated to fit: R_X86_64_PC32 against symbol `baz_' defined in COMMON section in /tmp/ccIkj6tt.o

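But if I use

gfortran -O3 -mcmodel=medium -g -o test test.f

it works. Also note that if I make the array allocatable and allocate it within the subroutine, it works.

My question is: what exactly does -mcmodel=medium do? I was under the impression that the two versions of the code (the one with allocatable arrays and the one without) were more or less equivalent...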

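No, large static arrays (such as your bar) can exceed the limit if you do not use -mcmodel=medium. But allocatables are of course better. For allocatables, only the array descriptor must fit into 2 GB, not the whole array.

From the GCC reference: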
-mcmodel=small
Generate code for the small code model: the program and its symbols must be linked in the lower 2 GB of the address space. Pointers are 64 bits. Programs can be statically or dynamically linked. This is the default code model. 
-mcmodel=kernel
Generate code for the kernel code model. The kernel runs in the negative 2 GB of the address space. This model has to be used for Linux kernel code. 
-mcmodel=medium
Generate code for the medium model: The program is linked in the lower 2 GB of the address space but symbols can be located anywhere in the address space. Programs can be statically or dynamically linked, but building of shared libraries are not supported with the medium model. 
-mcmodel=large
Generate code for the large model: This model makes no assumptions about addresses and sizes of sections. Currently GCC does not implement this model.
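
To make the "lower 2 GB" limit concrete, here is a minimal sketch that computes how much storage the static bar array from the question needs. The dimensions are copied from the question; the program name and the 4-byte size of a default real are assumptions of this sketch (4 bytes is what gfortran uses by default on x86_64).

```fortran
program bar_size
  use iso_fortran_env, only: int64
  implicit none
  ! Dimensions copied from subroutine foo in the question
  integer(int64), parameter :: nx = 450, ny = 144, nz = 144, nf = 23*3
  ! Assumption: default real occupies 4 bytes
  integer(int64), parameter :: bytes_per_real = 4
  ! 2 GiB, the size of the region the small code model links everything into
  integer(int64), parameter :: limit = 2_int64**31
  integer(int64) :: bar_bytes

  ! 64-bit arithmetic is needed: the result overflows a default 32-bit integer
  bar_bytes = nf * nx * ny * nz * bytes_per_real

  print '(a,i0)', 'bytes needed by bar: ', bar_bytes
  print '(a,i0)', 'bytes in 2 GiB:      ', limit
  print '(a,l1)', 'bar exceeds 2 GiB:   ', bar_bytes > limit
end program bar_size
```

The computed size is 2575411200 bytes, which is larger than 2147483648 (2 GiB), so bar on its own cannot fit in the region that the small code model assumes.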

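As bar is quite large, the compiler generates a static allocation instead of an automatic allocation on the stack. Static arrays are created with the .comm assembly directive, which creates an allocation in the so-called COMMON section. Symbols from that section are gathered, same-named symbols are merged (reduced to one symbol request with a size equal to the largest size requested), and what remains is then mapped to the BSS (uninitialized data) section in most executable formats. In ELF executables the .bss section is located in the data segment, just before the data-segment part of the heap (there is another part of the heap, managed by anonymous memory mappings, that does not reside in the data segment).

With the small memory model, 32-bit addressing instructions are used to address symbols on x86_64. This makes the code smaller and also faster. Some assembly output when using the small memory model: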
movl    $bar.1535, %ebx    <---- Instruction length saving
...
movl    %eax, baz_+4(%rip) <---- Problem!!
...
.local  bar.1535
.comm   bar.1535,2575411200,32
...
.comm   baz_,12,16
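
The "Problem!!" annotation can be spelled out as follows: if baz_ is placed after bar.1535, it lands more than 2 GiB away from the code, so the RIP-relative operand baz_+4(%rip) would need a displacement that does not fit in a signed 32-bit field. That is presumably what the linker reports as "relocation truncated to fit: R_X86_64_PC32". The sketch below reproduces the arithmetic. The two addresses are made up purely for illustration (the real ones are chosen by the linker), and the program name is hypothetical.

```fortran
program pc32_check
  use iso_fortran_env, only: int32, int64
  implicit none
  ! Hypothetical layout, for illustration only
  integer(int64), parameter :: insn_addr = 4194304_int64   ! assumed address following the movl
  integer(int64), parameter :: bss_start = 6291456_int64   ! assumed start of .bss
  ! Size of bar.1535 taken from the .comm directive in the listing
  integer(int64), parameter :: bar_bytes = 2575411200_int64
  integer(int64) :: baz_addr, disp

  ! Assumption: baz_ is laid out directly after bar.1535
  baz_addr = bss_start + bar_bytes
  ! Displacement the instruction would need to reach baz_+4
  disp = baz_addr + 4 - insn_addr

  print '(a,i0)', 'needed displacement:         ', disp
  print '(a,i0)', 'largest signed 32-bit value: ', huge(1_int32)
  if (disp > huge(1_int32)) then
     print '(a)', 'does not fit: the relocation would be truncated'
  else
     print '(a)', 'fits in a PC32 relocation'
  end if
end program pc32_check
```

Whatever the real addresses are, the displacement is at least as large as bar itself, so no choice of load address within the lower 2 GiB can bring baz_ back into range.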

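With the medium memory model in place, things look like this:

movabsq $bar.1535, %r10
...
movl    %eax, baz_+4(%rip)
...
.local  bar.1535
.largecomm      bar.1535,2575411200,32
...
.comm   baz_,12,16

First, a 64-bit immediate move instruction (10 bytes long) puts the 64-bit value that represents the address of bar.1535 into register R10. Memory for the bar.1535 symbol is allocated using the .largecomm directive, so it ends up in the .lbss section of the ELF executable. .lbss is used to store symbols that might not fit in the first 2 GiB (and hence should not be addressed using 32-bit instructions or RIP-relative addressing), while smaller things go to .bss (baz_ is still allocated using .comm and not .largecomm). Since the .lbss section is placed after the .bss section in the ELF linker script, baz_ does not end up out of reach of 32-bit RIP-relative addressing.

All the addressing modes are described in the System V ABI: AMD64 Architecture Processor Supplement. It is a heavy technical read, but a must-read for anybody who really wants to understand how 64-bit code works on most x86_64 Unixes.

When ALLOCATABLE arrays are used, gfortran allocates heap memory (most likely implemented as an anonymous memory map, given the large size of the allocation). This is basically RDI = malloc(2575411200). From then on, elements of bar are accessed using positive offsets from the value stored in RDI:

movl    51190040(%rdi), %eax
movl    %eax, baz_+4(%rip)

For locations that are more than 2 GiB from the start of bar, a more elaborate method is used. For example, to implement b = bar(12,144*144*450), gfortran emits:

; Some computations that leave the offset in RAX
movl    (%rdi,%rax), %eax
movl    %eax, baz_+4(%rip)

This code is not affected by the memory model, since nothing is assumed about the address where the dynamic allocation is made. Also, since the array is not passed around, no descriptor is built. If you add another function that takes an assumed-shape array and pass bar to it, a descriptor for bar is created as an automatic variable (i.e. on the stack of foo). If the array is made static with the SAVE attribute, the descriptor is placed in the .bss section:

movl    $bar.1580, %edi
...
; RAX still holds the address of the allocated memory as returned by malloc
; Computations, computations
movl    -232(%rax,%rdx,4), %eax
movl    %eax, baz_+4(%rip)

The first move prepares the argument of a function call (in my sample case, call boo(bar), where boo has an interface that declares it as taking an assumed-shape array). It moves the address of the array descriptor of bar into EDI. This is a 32-bit immediate move, so the descriptor is expected to be in the first 2 GiB. Indeed, it is allocated in .bss in both the small and medium memory models, like this:
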
.local  bar.1580
.comm   bar.1580,72,32
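
For readers who want to reproduce the descriptor case, here is a minimal sketch of the kind of source that was presumably behind the last two listings: an allocatable array with the SAVE attribute that is passed to a routine taking an assumed-shape array. Only the names foo, bar and boo come from the discussion above. The module name, the argument of foo and the deliberately shrunken second dimension are assumptions of this sketch, chosen so that it runs on any machine.

```fortran
module descriptor_demo
  implicit none
contains
  ! Assumed-shape dummy argument: the caller has to pass an array descriptor
  subroutine boo(a)
    real, intent(in) :: a(:,:)
    print '(a,i0,a,i0)', 'boo received an array of shape ', size(a,1), ' x ', size(a,2)
  end subroutine boo

  subroutine foo(b)
    real, intent(out) :: b
    ! Second dimension shrunk from nx*ny*nz in the question (assumption of this sketch)
    integer, parameter :: nf = 23*3, n = 1000
    ! SAVE makes the descriptor static; the data itself still lives on the heap
    real, allocatable, save :: bar(:,:)

    if (.not. allocated(bar)) allocate(bar(nf, n))
    bar = 1.0
    b = bar(12, 500)
    call boo(bar)
  end subroutine foo
end module descriptor_demo

program main
  use descriptor_demo, only: foo
  implicit none
  real :: b
  call foo(b)
  print *, b
end program main
```

Compiling this with gfortran -S should let you inspect the generated assembly, although the numeric suffixes of symbols such as bar.1580 and the exact instructions will differ between compiler versions.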

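I think maybe the question is "what is the difference between a static array and an allocatable array?" My impression was that in both cases they are allocated from the heap (although I should admit that I am talking about things I do not know very much about).

I was just editing the answer while you wrote. Allocatables have a descriptor (a pointer with additional data), and only this descriptor must fit into 2 GB. A static array lies completely in the static segment, like any other static variable.

(Maybe there is only a pointer to the descriptor in the static segment, but that does not change the difference.)

If I understand correctly, the 2 GB limit for static arrays no longer applies with mcmodel=small. Is this correct?

I think it does apply; it does not apply with medium and large.

This is a very nice explanation. Thanks. It gives me a good start for looking a whole lot deeper into this stuff (which is what I was looking for).

@mgilson, for the completeness of the answer I have also added an explanation of what happens when bar is passed by descriptor to another subroutine.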