Assembly db instruction that is executed after a call


I am learning assembly language programming, and I found this piece of code. I cannot understand how its instructions are executed:

xor eax,eax
xor ebx,ebx
xor ecx,ecx
xor edx,edx
jmp short string
code:
pop ecx 
mov bl,1
mov dl,13
mov al,4
int 0x80
dec bl
mov al,1
int 0x80
string: 
call code 
db 'hello, world!'

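Why is the db instruction executed after call code?

If the call instruction is executed before the db, the string is never executed, because you issue an exit_program system call before you reach that point.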

This is what normal code looks like:

asm
asm
more asm
call subroutine   ->> will branch to subroutine
        subroutine:  asm
                     more asm
                     ret   ->> executing will return to point after call.
xor eax,eax          <<-- first instruction after the ret
ret                  <<-- return to caller.

A better way would be to code the last part like this:

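section .data
hello: db 'hello, world!'

section .text
global _start
_start:
mov ecx,hello        ; ecx = address of the string
mov edx,13           ; edx = length of the string
call print           ; returns as normal
call exit_program    ; will not return

print: mov eax,4     ; eax = 4 = sys_write
mov ebx,1            ; ebx = 1 = stdout
int 0x80
ret

exit_program:
xor ebx,ebx          ; ebx = 0 = exit code
mov eax,1            ; eax = 1 = sys_exit
int 0x80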

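This has the following benefits:

It uses fewer instructions
It does not run slowly because of mismatched call/ret pairs
It does not fiddle with byte registers
It is simple and easy to understand
You can reuse the print and exit_program subroutines in other parts of the program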

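To show what I meant by "defining byte values in other ways": this variant of your code does the same thing, but it shows how a string can be defined by instructions, and how instructions can be defined by the db directive. Both make the source harder for a human to read, but to the assembler the difference is negligible. It produces the same binary machine code, and to the CPU the same machine code is the same machine code; it does not care what the source looked like.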
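A quick way to convince yourself of this is to compare the bytes directly. The following is only a small Python sketch of my own (not something from the original listing): the opcode bytes are typed in by hand from the 32-bit encodings, and the variable names are made up for the illustration.

```python
# Bytes that "xor ebx,ebx" assembles to in 32-bit mode (opcode 31, ModRM DB).
XOR_EBX_EBX = bytes([0x31, 0xDB])
# Bytes emitted by the directive  db '1', 0xDB  ('1' is ASCII 0x31).
DB_VARIANT = b"1" + bytes([0xDB])

# The "instructions" that spell the string in the variant listing.
string_as_code = bytes([
    0x68, 0x65, 0x6C, 0x6C, 0x6F,  # push 0x6f6c6c65
    0x2C, 0x20,                    # sub al,0x20
    0x77, 0x6F,                    # ja short, rel8 = 0x6f
    0x72, 0x6C,                    # jb short, rel8 = 0x6c
    0x64, 0x21,                    # db 'd!'
])

# Same bytes, whichever way the source spelled them.
print(DB_VARIANT == XOR_EBX_EBX)
print(string_as_code.decode("ascii"))
```

The first line printed shows that the directive and the mnemonic give identical bytes; the second shows that the fake instructions are nothing more than the text of the string.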

I also tried to comment every line extensively, saying what it does and why it is used in the code.

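The code is written in this non-trivial way because it is an example of a shell exploit payload. There, your assembly not only has to do what you want, its resulting machine code also has to satisfy additional constraints. It must not contain any zero byte (which would make it hard to pass as a "string" while the payload is being injected through some vulnerability), it must be PIC (position independent code), and it must not use any absolute address or assume any particular location when it executes, and so on.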
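To make the "no zero bytes" constraint concrete, here is a small Python sketch (my own illustration, not part of the original code). It compares hand-written 32-bit encodings of the straightforward instructions with the ones the payload uses. The byte values are typed in manually, and has_zero_byte is just a helper name invented for this example.

```python
# Hand-written 32-bit x86 encodings, typed in manually for illustration.
ENCODINGS = {
    "mov ebx,1":              bytes([0xBB, 0x01, 0x00, 0x00, 0x00]),
    "xor ebx,ebx; mov bl,1":  bytes([0x31, 0xDB, 0xB3, 0x01]),
    "mov edx,13":             bytes([0xBA, 0x0D, 0x00, 0x00, 0x00]),
    "xor edx,edx; mov dl,13": bytes([0x31, 0xD2, 0xB2, 0x0D]),
}

def has_zero_byte(code: bytes) -> bool:
    """True if the code contains a 00 byte, which would end a C-style string early."""
    return 0 in code

for text, code in ENCODINGS.items():
    verdict = "contains 00" if has_zero_byte(code) else "zero-free"
    print(f"{text:24} {code.hex(' '):16} {verdict}")
```

The imm32 forms carry three padding zero bytes each, which is why the payload clears the full register with xor first and then loads only the low byte.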

    ; sets basic registers eax,ebx,ecx,edx to zero (ecx not needed BTW)
    xor eax,eax
    db '1', 0xDB        ; xor ebx,ebx defined by "db" for fun
    db '1', 0xC9        ; xor ecx,ecx defined by "db" for fun
    xor edx,edx
    ; short-jump forward to make later "call code" to produce
    ; negative relative offset, so zero in "call" opcode is avoided
    ; "call code" from here would need zeroes in rel32 offset encoding
    jmp short string    ; the "jmp short string" is encoded as "EB 0F"
code:
    pop ecx             ; loads the address of string from the stack into ecx
    mov bl,1            ; ebx = 1 = STD_OUT stream, avoiding zeroes in
        ; "mov ebx,1" opcode, so instead "xor ebx,ebx mov bl,1" is used
    mov dl,13           ; edx = 13 = length of string
    mov al,4            ; eax = 4 = sys_write
    int 0x80            ; sys_write(STD_OUT, 'hello, world!', 13);
    dec bl              ; ebx = 0 = exit code "OK"
    mov al,1            ; eax = 1 = sys_exit
    int 0x80            ; sys_exit(0);
string:
    call code           ; return address == string address -> pushed on stack
    ; also "code:" is ahead, so relative offset is negative => no zero in opcode
    ; resulting call opcode is "E8 EC FF FF FF"

    ; following bytes are NOT executed as code, they contain string data
    push 0x6f6c6c65     ; 'hello'
    sub al,0x20         ; ', '
    ja  short $+0x6f+2  ; 'wo'
    jb  short $+0x6c+2  ; 'rl'
    db 'd!'
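The two encodings quoted in the comments ("EB 0F" and "E8 EC FF FF FF") can be recomputed from the layout of the listing. This is a rough Python sketch under my own assumptions: the offsets 8, 10 and 25 are counted by hand from the instruction lengths above, and jmp_short / call_near are helper names invented here, not anything NASM provides.

```python
import struct

def jmp_short(at: int, target: int) -> bytes:
    """Encode 'jmp short target' located at offset `at` (opcode EB + rel8)."""
    rel = target - (at + 2)      # relative to the end of the 2-byte instruction
    return b"\xEB" + struct.pack("<b", rel)

def call_near(at: int, target: int) -> bytes:
    """Encode 'call target' located at offset `at` (opcode E8 + rel32)."""
    rel = target - (at + 5)      # relative to the end of the 5-byte instruction
    return b"\xE8" + struct.pack("<i", rel)

JMP = 8       # offset of the jmp: four 2-byte xor instructions come before it
CODE = 10     # offset of "code:", right after the 2-byte jmp
STRING = 25   # offset of "string:", the body of "code:" is 15 bytes long

print(jmp_short(JMP, STRING).hex(" "))    # forward short jump
print(call_near(STRING, CODE).hex(" "))   # backward call, as in the listing
print(call_near(JMP, STRING).hex(" "))    # hypothetical forward call instead of the jmp
```

The backward call gets a negative rel32, so its upper bytes are FF. The hypothetical forward call would have a small positive rel32 padded with 00 bytes, which is exactly what the jmp/call arrangement avoids.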

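To compile it I used

nasm -f elf *.asm; ld -m elf_i386 -s -o demo *.o

(ignore the warning). To disassemble it again and check how the actual machine code forms instructions, you can use

objdump -M intel -d demo

(the code above and objdump also work on an online site, if you want to test it).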

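There is no db "instruction". db is a directive to the assembler: it tells the assembler to place one or more bytes into the binary (i.e. the executable or object file). Why do you think it is executed?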

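The call redirects the CPU to the address code, and nothing redirects it back to the bytes after the call. (Hint: the call is used here to store the address of the string on top of the stack, not to "call a subroutine".)

Why does that instruction run? Shouldn't execution go to the code after the call?

call stores the return address on the stack, and in this case the return address happens to be the address of the string 'hello, world!'. That address is then popped into ecx (which is the string pointer argument of system call 4). As I said in an earlier comment, this is obscure code, and it makes no sense for a beginner to try to learn from code like this.

Do you know that the tutorial you are using for assembly is aimed at shellcode and exploits? If you are not writing shellcode and exploits, you may be better off with a different tutorial.

If you are not doing shellcode, then the tag I just added should be removed from the question.

JFYI: I guess the original code in the question is shell exploit code. That is why it keeps its data inside the code, uses only relative addressing ("PIC"), and avoids zero bytes in the resulting machine code (your mov edx,13 will contain zeroes after assembling, which makes it unsuitable as a payload string). This is most likely why the code was written in this "obscure" way.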
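The call/pop trick described in these comments can be modelled in a few lines. This is only a toy Python simulation under stated assumptions: the image bytes are hand-assembled from the question's listing, BASE is an arbitrary made-up load address, and run_call_pop is a hypothetical helper that mimics just the two instructions involved.

```python
# Hand-assembled bytes of the question's code, followed by the string data.
IMAGE = bytes.fromhex(
    "31c0 31db 31c9 31d2"      # xor eax/ebx/ecx/edx
    "eb0f"                     # jmp short string
    "59 b301 b20d b004 cd80"   # pop ecx ... int 0x80 (sys_write)
    "fecb b001 cd80"           # dec bl ; mov al,1 ; int 0x80 (sys_exit)
    "e8ecffffff"               # call code
) + b"hello, world!"

BASE = 0x08048060              # hypothetical load address, any value works (PIC)
CALL_OFFSET = 25               # offset of "call code" inside the image

def run_call_pop(base: int) -> int:
    """Mimic 'call code' followed by 'pop ecx' and return the value left in ecx."""
    stack = []
    return_address = base + CALL_OFFSET + 5   # address of the byte after the 5-byte call
    stack.append(return_address)              # call pushes the return address
    ecx = stack.pop()                         # pop ecx takes it off again
    return ecx

ecx = run_call_pop(BASE)
start = ecx - BASE
print(hex(ecx), IMAGE[start:start + 13])
```

Whatever BASE is, ecx ends up pointing at the first byte after the call, which is where the string lives. Nothing ever returns to that address, so the string bytes are read as data and never executed.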