Unix: Why can an ELF executable have 4 LOAD segments?


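There is a remote 64-bit *nix server that compiles user-provided code (which is supposed to be written in Rust, but I don't think that matters, since it uses LLVM). I don't know which compiler/linker flags it uses, but the compiled ELF executables look odd: they have 4 LOAD segments: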

$ readelf -e executable
...
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
...
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000004138 0x0000000000004138  R      0x1000
  LOAD           0x0000000000005000 0x0000000000005000 0x0000000000005000
                 0x00000000000305e9 0x00000000000305e9  R E    0x1000
  LOAD           0x0000000000036000 0x0000000000036000 0x0000000000036000
                 0x000000000000d808 0x000000000000d808  R      0x1000
  LOAD           0x0000000000043da0 0x0000000000044da0 0x0000000000044da0
                 0x0000000000002290 0x00000000000024a0  RW     0x1000
...
On my own system, every executable I have looked at has only 2 LOAD segments:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
...
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000003000c0 0x00000000003000c0  R E    0x200000
  LOAD           0x00000000003002b0 0x00000000005002b0 0x00000000005002b0
                 0x00000000000776c8 0x000000000009b200  RW     0x200000
...
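  • Under what circumstances (compiler/linker versions, flags, etc.) would a toolchain build an ELF with 4 LOAD segments?
  • What is the point of having 4 LOAD segments? I suppose a segment with read but no execute permission might help against certain exploits, but why have two such segments?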

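  • A typical Linux executable linked with BFD ld or Gold has 2 loadable segments: the ELF header is merged with .text and .rodata into the first RE segment, while .data, .bss and the other writable sections are merged into the second RW segment.

    Here is a typical section-to-segment mapping: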

    $ echo "int foo; int main() { return 0;}"  | clang -xc - -o a.out-gold -fuse-ld=gold
    $ readelf -Wl a.out-gold
    
    Elf file type is EXEC (Executable file)
    Entry point 0x400420
    There are 9 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R   0x8
      INTERP         0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R   0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0006b0 0x0006b0 R E 0x1000
      LOAD           0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001f8 0x000200 RW  0x1000
      DYNAMIC        0x000e28 0x0000000000401e28 0x0000000000401e28 0x0001b0 0x0001b0 RW  0x8
      NOTE           0x000254 0x0000000000400254 0x0000000000400254 0x000020 0x000020 R   0x4
      GNU_EH_FRAME   0x00067c 0x000000000040067c 0x000000000040067c 0x000034 0x000034 R   0x4
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
      GNU_RELRO      0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001e8 0x0001e8 RW  0x8
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .dynsym .dynstr .gnu.hash .hash .gnu.version .gnu.version_r .rela.dyn .init .text .fini .rodata .eh_frame .eh_frame_hdr
       03     .fini_array .init_array .dynamic .got .got.plt .data .bss
       04     .dynamic
       05     .note.ABI-tag
       06     .eh_frame_hdr
       07
       08     .fini_array .init_array .dynamic .got .got.plt
    
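    This minimizes the number of mmap calls the kernel has to perform when loading such an executable, but at a security cost: the data in .rodata should not be executable, yet it is (because it is merged with .text, which must be executable). This may significantly increase the attack surface for someone trying to hijack the process.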
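To check which permissions each loadable segment of a given binary actually carries without relying on readelf, the program header table can be read directly. The following is only a minimal sketch, assuming a little-endian ELF64 file such as the x86-64 executables shown here; the function name load_segments is made up for illustration.

```python
import struct

PT_LOAD = 1                      # program header type for loadable segments
PF_X, PF_W, PF_R = 1, 2, 4       # p_flags permission bits


def load_segments(data: bytes):
    """Return (vaddr, memsz, flags) for each PT_LOAD entry.

    Assumption: `data` is a complete little-endian ELF64 image.
    The flags string mimics readelf's "R E" / "RW " column.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    # ELF64 header: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    segments = []
    for i in range(e_phnum):
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        #             p_filesz, p_memsz, p_align
        (p_type, p_flags, _offset, p_vaddr, _paddr,
         _filesz, p_memsz, _align) = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            flags = "".join(
                letter if p_flags & bit else " "
                for letter, bit in (("R", PF_R), ("W", PF_W), ("E", PF_X)))
            segments.append((p_vaddr, p_memsz, flags))
    return segments
```

Calling load_segments(open("a.out-gold", "rb").read()) on the Gold-linked binary above should list one "R E" entry and one "RW " entry, matching the two LOAD lines in the readelf output.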

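    Newer Linux systems, in particular those that link binaries with LLD, prioritize security over speed and put the ELF header and .rodata into a first R-only segment, resulting in 3 LOAD segments and improved security. Here is a typical mapping: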

    $ echo "int foo; int main() { return 0;}"  | clang -xc - -o a.out-lld -fuse-ld=lld
    $ readelf -Wl a.out-lld
    
    Elf file type is EXEC (Executable file)
    Entry point 0x201000
    There are 10 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000230 0x000230 R   0x8
      INTERP         0x000270 0x0000000000200270 0x0000000000200270 0x00001c 0x00001c R   0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000558 0x000558 R   0x1000
      LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x000185 0x000185 R E 0x1000
      LOAD           0x002000 0x0000000000202000 0x0000000000202000 0x001170 0x002005 RW  0x1000
      DYNAMIC        0x003010 0x0000000000203010 0x0000000000203010 0x000150 0x000150 RW  0x8
      GNU_RELRO      0x003000 0x0000000000203000 0x0000000000203000 0x000170 0x001000 R   0x1
      GNU_EH_FRAME   0x000440 0x0000000000200440 0x0000000000200440 0x000034 0x000034 R   0x1
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
      NOTE           0x00028c 0x000000000020028c 0x000000000020028c 0x000020 0x000020 R   0x4
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .rodata .dynsym .gnu.version .gnu.version_r .gnu.hash .hash .dynstr .rela.dyn .eh_frame_hdr .eh_frame
       03     .text .init .fini
       04     .data .tm_clone_table .fini_array .init_array .dynamic .got .bss
       05     .dynamic
       06     .fini_array .init_array .dynamic .got
       07     .eh_frame_hdr
       08
       09     .note.ABI-tag
    
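    Not to be left behind, newer BFD ld (my version is 2.31.1) also makes the ELF header and .rodata read-only, but fails to merge the two R-only segments into one, resulting in 4 loadable segments: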

    $ echo "int foo; int main() { return 0;}"  | clang -xc - -o a.out-bfd -fuse-ld=bfd
    $ readelf -Wl a.out-bfd
    
    Elf file type is EXEC (Executable file)
    Entry point 0x401020
    There are 11 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R   0x8
      INTERP         0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R   0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0003f8 0x0003f8 R   0x1000
      LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x00018d 0x00018d R E 0x1000
      LOAD           0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R   0x1000
      LOAD           0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001e8 0x0001f0 RW  0x1000
      DYNAMIC        0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001a0 0x0001a0 RW  0x8
      NOTE           0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000020 0x000020 R   0x4
      GNU_EH_FRAME   0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R   0x4
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
      GNU_RELRO      0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001c0 0x0001c0 R   0x1
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
       03     .init .text .fini
       04     .rodata .eh_frame_hdr .eh_frame
       05     .init_array .fini_array .dynamic .got .got.plt .data .bss
       06     .dynamic
       07     .note.ABI-tag
       08     .eh_frame_hdr
       09
       10     .init_array .fini_array .dynamic .got
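One way to see why the count differs between the linkers: a LOAD segment covers a contiguous address range with a single set of permissions, so what matters is the order in which sections are laid out. In the BFD listing the two read-only groups sit on either side of .text (0x400000 and 0x402000, with the R E segment at 0x401000 in between), while LLD places .rodata next to the headers. The sketch below is a simplified model under that assumption, not how any linker is implemented; the section orders are abbreviated from the section-to-segment mappings above, and count_load_segments is a hypothetical helper.

```python
from itertools import groupby


def count_load_segments(layout):
    """Count maximal runs of equal permissions in a section layout.

    `layout` is a list of (section_name, permissions) in address order.
    Simplifying assumption: one LOAD segment per run of identical
    permissions, ignoring alignment and RELRO details.
    """
    return sum(1 for _perms, _run in groupby(layout, key=lambda item: item[1]))


# Orders abbreviated from the readelf listings (illustrative subsets).
bfd_2_31_layout = [
    (".interp", "R"), (".dynsym", "R"), (".rela.dyn", "R"),
    (".init", "RE"), (".text", "RE"), (".fini", "RE"),
    (".rodata", "R"), (".eh_frame", "R"),
    (".dynamic", "RW"), (".data", "RW"), (".bss", "RW"),
]
lld_layout = [
    (".interp", "R"), (".rodata", "R"), (".dynsym", "R"), (".eh_frame", "R"),
    (".text", "RE"), (".init", "RE"), (".fini", "RE"),
    (".data", "RW"), (".dynamic", "RW"), (".bss", "RW"),
]

print(count_load_segments(bfd_2_31_layout))  # 4
print(count_load_segments(lld_layout))       # 3
```

Under this model the read-only data in the BFD layout cannot share a segment with the headers, because the executable sections lie between them.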
    

    Finally, some of these choices are affected by the --rosegment / --no-rosegment linker option (or -Wl,-z,noseparate-code for BFD ld).
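To compare layouts on your own machine, something along these lines can rebuild the test program with and without the BFD option and count the LOAD lines that readelf reports. It is a rough sketch: it assumes clang, the chosen linker and readelf are installed, and the helper names (clang_argv, count_load_lines, build_and_count) are invented for this example. The -z separate-code / -z noseparate-code spelling is the one documented for GNU ld 2.31 and later; other linkers may treat it differently.

```python
import subprocess


def clang_argv(src, out, linker="bfd", separate_code=None):
    """Build a clang command line that selects the linker via -fuse-ld=."""
    argv = ["clang", src, "-o", out, f"-fuse-ld={linker}"]
    if separate_code is not None:
        # GNU ld (BFD) 2.31+ option; passed through the driver with -Wl,
        option = "separate-code" if separate_code else "noseparate-code"
        argv.append(f"-Wl,-z,{option}")
    return argv


def count_load_lines(readelf_output):
    """Count program header lines whose first field is LOAD."""
    return sum(1 for line in readelf_output.splitlines()
               if line.split()[:1] == ["LOAD"])


def build_and_count(src, out, linker="bfd", separate_code=None):
    """Compile `src`, then return the number of LOAD segments in `out`."""
    subprocess.run(clang_argv(src, out, linker, separate_code), check=True)
    listing = subprocess.run(["readelf", "-Wl", out], check=True,
                             capture_output=True, text=True).stdout
    return count_load_lines(listing)
```

For example, build_and_count("t.c", "a.out-bfd", "bfd", separate_code=False) exercises the noseparate-code variant; the result depends on the installed toolchain version and its defaults.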


    Found a partial answer: by default, lld links with 3 segments (R, R+E and R+W). With modern binutils (BFD, v2.31.1+), passing -Wl,-z,noseparate-code reduces the number of loadable segments back to 2.