Decoding an algorithm written in ASM (algorithm, assembly, 64-bit)

I am trying to understand this problem in ASM. The code is as follows:

45 33 C9                    xor r9d, r9d 
C7 44 24 18 50 72 69 6D     mov [rsp+arg_10], 6D697250h 
66 C7 44 24 1C 65 53        mov [rsp+arg_14], 5365h 
C6 44 24 1E 6F              mov [rsp+arg_16], 6Fh 
4C 63 C1                    movsxd r8, ecx 
85 C9                       test ecx, ecx 
7E 1C                       jle short locret_140001342 
41 8B C9                    mov ecx, r9d 
            loc_140001329: 
48 83 F9 07                 cmp rcx, 7 
49 0F 4D C9                 cmovge rcx, r9 
48 FF C1                    inc rcx 
8A 44 0C 17                 mov al, [rsp+rcx+arg_F] 
30 02                       xor [rdx], al 
48 FF C2                    inc rdx 
49 FF C8                    dec r8 
75 E7                       jnz short loc_140001329 
            locret_140001342: 
C3                          retn

This is the encoded text:

07 1D 1E 41 45 2A 00 25 52 0D 04 01 73 06 
24 53 49 39 0D 36 4F 35 1F 08 04 09 73 0E 
34 16 1B 08 16 20 4F 39 01 49 4A 54 3D 1B 
35 00 07 5C 53 0C 08 1E 38 11 2A 30 13 1F 
22 1B 04 08 16 3C 41 33 1D 04 4A

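I have been learning ASM for a while and I know what most of the instructions do, but I still have some questions that I haven't found answers to:

How do I feed the encoded text into the algorithm?
What are arg_10, arg_14, and so on? I assume they come from the encoded part, but I don't know for sure.

Could someone explain line by line what this algorithm does? I understand some of it, but I need some clarification.

I have been using Visual Studio and C++ to test ASM. I know that to run an asm procedure you can declare a function like this

extern "C" int function(int a, int b, int c,int d, int f, int g);

and use it like this

printf("ASM Returned %d", function(92,2,3,4,5,6));

I also know that the first four integer arguments go into RCX, RDX, R8 and R9, and the rest go on the stack. I don't know much about the stack, so I don't yet know how to access those. I also know that the return value is whatever RAX contains. For example, a procedure like this adds two numbers: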

xor eax, eax
mov eax, ecx
add eax, edx
ret

So, as Jester suggested, I will go through the code line by line and explain what I think it does:

xor r9d, r9d                  //xor on r9d (clears the register)
mov [rsp+arg_10], 6D697250h   //moves 6D697250 to the address pointed at by rsp + arg_10
mov [rsp+arg_14], 5365h       //moves 5365 to the address pointed at by rsp+arg_14
mov [rsp+arg_16], 6Fh         //moves 6F to the address pointed at by rsp+arg_16
movsxd r8, ecx                //moves ecx to r8 and sign extends it since ecx is 32 bit and r8 is 64 bit
test ecx, ecx                 //tests ecx and sets the labels
jle short locret_140001342    //jumps to ret if ecx is zero or less
mov ecx, r9d                  //moves the lower 32 bits of r9 into ecx

loc_140001329:                //label used by jump commands
cmp rcx, 7                    //moves 7(decimal) into rcx
cmovge rcx, r9                //don't know
inc rcx                       //increases rcx by 1
mov al, [rsp + rcx + arg_F]   //moves the value at address [rsp + rcx + arg_F] into al,
                              //this is probably the key step as al is 1 byte and each character is also one byte, it is also part of the rax register so it holds the value to be returned
xor [rdx], al                 //xor on the value at address [rdx] and al, stores the result at the address of [rdx]
inc rdx                      //increase rdx by 1
dec r8                       //decrease r8 by 1
jnz short loc_140001329      //if r8 is not zero jump back to loc_140...
                             //this essentially is a while loop until r8 reaches 0 (assuming it starts as positive)
locret_140001342:
ret

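I still don't know what the arg_xx names are, or how exactly the encoded text gets fed into this algorithm.

One thing I noticed is that the values stored at those stack offsets are ASCII: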

>>> bytes.fromhex('5072696d65536f').decode('ascii')
'PrimeSo'

As for the input data, you can convert the hex dump to binary with

xxd -r -p

and read the data from stdin in your program:

xxd -r -p data.hex | ./myprog
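If xxd is not at hand, the hex-to-binary step can also be sketched in Python. This is only an illustration: the variable names are made up for this example, and the hex string is the first row of the encoded text from the question.

```python
# Hypothetical names for illustration; the string is the first row of the
# encoded text shown in the question.
hex_dump = "07 1D 1E 41 45 2A 00 25 52 0D 04 01 73 06"

# bytes.fromhex skips the spaces between the two-digit byte values.
encoded = bytes.fromhex(hex_dump)

print(len(encoded))        # number of bytes parsed
print(encoded[:4].hex())   # first four bytes, back as hex digits
```

The resulting bytes object is the kind of buffer whose address and length the routine expects in rdx and rcx.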

Those arg_14 and similar offsets must be declared somewhere in the source. My guess is that they are the hex offsets 0xf, 0x10, 0x14 and 0x16.

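I think your understanding is basically correct, with a few small corrections:

Correction 1 (test ecx, ecx): this sets the flags (not labels).

Correction 2 (cmp rcx, 7): this compares rcx with the immediate value 7 and sets the flags accordingly. (That is, after this instruction, a conditional instruction using the "greater" condition would execute only if rcx is greater than 7.)

Correction 3 (cmovge rcx, r9): this conditionally moves r9 into rcx, based on the flags that were just set. The condition is ge, so the move only happens when rcx is greater than or equal to 7. r9 contains 0, so the effect is to set rcx back to 0 once it reaches 7.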
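To make the effect of the cmp / cmovge / inc sequence concrete, here is a small Python sketch that mimics just those three instructions. The function name key_indices is made up for this illustration; the variables simply mirror the registers.

```python
def key_indices(count):
    """Yield the value rcx holds when the key byte is read, per iteration."""
    r9 = 0           # xor r9d, r9d
    rcx = r9         # mov ecx, r9d
    for _ in range(count):
        if rcx >= 7:     # cmp rcx, 7 / cmovge rcx, r9
            rcx = r9
        rcx += 1         # inc rcx
        yield rcx

print(list(key_indices(10)))
```

The index runs from 1 to 7 and then wraps to 1 again, which is why the load that follows uses a displacement one byte below the start of the key.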

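Arguments

We are not given any information about the function's arguments, but it seems safe to assume that rcx holds the length of the data to decrypt and rdx is a pointer to the data.

Here is my take on the code: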

    ; rdx holds the message location
    ; ecx holds the message length

    xor r9d, r9d                ; r9d = 0
    mov [rsp+arg_10], 6D697250h ; fix up the key
    mov [rsp+arg_14], 5365h 
    mov [rsp+arg_16], 6Fh       ; which is "PrimeSo"
    movsxd r8, ecx              ; length counter
    test ecx, ecx               ; test the  message length
    jle short locret_140001342  ; skip if invalid length
    mov ecx, r9d                ; reset key index to 0
loc_140001329: 
    cmp rcx, 7                  ; check indexing of key
    cmovge rcx, r9              ; reset if o/range
    inc rcx                     ; obfuscate by incrementing first
    mov al, [rsp+rcx+arg_F]     ; ... and indexing wrong offset
    xor [rdx], al               ; encrypt the message byte
    inc rdx                     ; advance message pointer
    dec r8                      ; loop count
    jnz short loc_140001329     ; next message byte
locret_140001342: 
    retn

I decoded the message with a C program that implements the algorithm, but that was too easy, so I won't post it.
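For readers who want to see the idea in a high-level language, the following is a separate Python sketch of a repeating-key XOR. It is not the C program mentioned above. The name xor_with_key is made up here, and only the first 16 bytes of the encoded text are used.

```python
from itertools import cycle

KEY = b"PrimeSo"  # the 7 key bytes the routine writes to the stack

def xor_with_key(data, key=KEY):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# First 16 bytes of the encoded text from the question.
encoded = bytes.fromhex("07 1D 1E 41 45 2A 00 25 52 0D 04 01 73 06 24 53")

print(xor_with_key(encoded).decode("ascii"))
```

Because XOR is its own inverse, applying xor_with_key a second time to the output gives back the original bytes, so the same routine serves for both encoding and decoding.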

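Reverse engineering

The code does not contain enough information to solve the problem top-down, because some registers are used without being loaded and the labels are not defined. I solved it bottom-up, by identifying the instruction that does the encryption and working outward from there.

Although the stack labels are not defined, their naming is enough of a clue that the pieces of the key are actually contiguous, and assuming little-endian storage reveals the key. The hex byte listing confirms this: it shows the three values being stored at offsets whose low bytes are 18, 1C and 1E.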
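The little-endian step can be checked with a few lines of Python. This is only a sketch: the list name stores is invented for the example, and the values and operand sizes are read off the three mov instructions in the listing.

```python
# (immediate value, operand size in bytes) for the three mov instructions
stores = [(0x6D697250, 4), (0x5365, 2), (0x6F, 1)]

# x86 is little-endian: the least significant byte lands at the lowest address.
key = b"".join(value.to_bytes(size, "little") for value, size in stores)

print(key)
```

Laid end to end in address order, the dword, word and byte stores spell out the 7-byte key.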

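OK, I have figured out the algorithm and got it working in ASM. You were right, the arg_xx names are offsets: arg_10 == 0x10 and arg_F == 0x0F, counted from just above the 8-byte return address, so the displacements actually encoded relative to rsp are 18h and 17h. The data is passed in as an array together with the length of the array. In this case rcx will be the data length (67 bytes for this message) and rdx will point to the start of the array. Here is the declaration I use to call the ASM procedure from C++: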

extern "C" void function(int length, char* message);

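The algorithm is very simple. The key phrase is "PrimeSo". All it does is XOR each incoming byte with one character of "PrimeSo", taking the characters in order, and once it reaches the 'o' in "PrimeSo" it goes back to 'P'. Hence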

cmp rcx, 7       
cmovge rcx, r9   //as Peter de Rivaz stated this will put 0 into rcx if it is greater or equal to seven
inc rcx

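so that

mov al, [rsp + rcx + 0Fh]

effectively becomes [rsp+1+0Fh], [rsp+2+0Fh], ..., [rsp+7+0Fh]. Note that "PrimeSo" is stored at [rsp+10h], which means that [rsp+1+0Fh] points to 'P'. On each iteration of the loop, al becomes one of the characters of "PrimeSo", cycling through them.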
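The base-plus-index addressing can be modeled with a byte array standing in for the memory at rsp. This is a sketch under stated assumptions: the buffer name stack and its 0x20-byte size are made up for the example, and it uses the displacements that appear in the machine-code bytes (18h, 1Ch, 1Eh and 17h), which are the arg_10, arg_14, arg_16 and arg_F positions shifted by the 8-byte return address.

```python
import struct

stack = bytearray(0x20)  # stand-in for the memory starting at rsp

struct.pack_into("<I", stack, 0x18, 0x6D697250)  # mov dword ptr [rsp+18h], 6D697250h
struct.pack_into("<H", stack, 0x1C, 0x5365)      # mov word ptr  [rsp+1Ch], 5365h
struct.pack_into("<B", stack, 0x1E, 0x6F)        # mov byte ptr  [rsp+1Eh], 6Fh

# mov al, [rsp + rcx + 17h] for rcx = 1..7
key_bytes = bytes(stack[rcx + 0x17] for rcx in range(1, 8))

print(key_bytes)
```

With rcx equal to 1 the load hits offset 18h, the first byte of the key, and with rcx equal to 7 it hits 1Eh, the last one.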

xor [rdx], al //This will do an xor operation on [rdx] (beginning of our message) and al, which is 'P' in the first loop.
              //It will then store the result in its place.

inc rdx       //move to next character
dec r8        //decrease counter
jnz short loc_140001329 //and start the loop again

With that said, let's look at the first few bytes:

xor P, 07 == xor 50, 07 --> 57 = W  
xor r, 1D == xor 72, 1D --> 6F = o  
xor i, 1E == xor 69, 1E --> 77 = w  
xor m, 41 == xor 6D, 41 --> 2C = ,

For those who want to know, here is the C++ code:

#include <cstdio>   // printf
#include <cstdlib>  // system

extern "C" void function(int length, char* message);

int main()
{
    char message[] = { 0x07, 0x1D, 0x1E, 0x41, 0x45, 0x2A, 0x00, 0x25, 0x52, 0x0D, 0x04, 0x01, 0x73, 0x06, 0x24, 0x53, 0x49, 0x39, 0x0D, 0x36, 0x4F, 0x35, 0x1F, 0x08, 0x04, 0x09, 0x73, 0x0E, 0x34, 0x16, 0x1B, 0x08, 0x16, 0x20, 0x4F, 0x39, 0x01, 0x49, 0x4A, 0x54, 0x3D, 0x1B, 0x35, 0x00, 0x07, 0x5C, 0x53, 0x0C, 0x08, 0x1E, 0x38, 0x11, 0x2A, 0x30, 0x13, 0x1F, 0x22, 0x1B, 0x04, 0x08, 0x16, 0x3C, 0x41, 0x33, 0x1D, 0x04, 0x4A, '\0'};
    function(sizeof(message) - 1, message);
    printf("Decoded Message is:\n%s\n", message);


    printf("\n");
    system("pause");
    return 0;
}

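In Visual Studio you can set a breakpoint in this code and then go to Debug -> Windows -> Registers and Debug -> Windows -> Memory -> Memory 1 to inspect the registers and the program's memory. Note that rcx will contain the count and rdx will point to the start of the encoded message.

Thank you all for the help and suggestions, I could not have done it without you.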

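Since the arg_x values are referenced relative to the stack pointer, I assume they are offsets for retrieving arguments from the stack. You should go through it line by line, and we will correct any mistakes. Also, if you want to see the raw offsets, turn off the "helpful" arg_x symbols in your disassembler.

I didn't use a disassembler.

I should point out that I figured this out after staring at it for a while, and I solved it in C. Part of the message decodes to "Wow, you did it!", followed by a bit more text and an email address. I'm signing up right now!

Great. Can you share how you did it? I'm still trying to figure out what arg_10, arg_14 and so on mean.

"As for the input data, you can use xxd -r -p and read it from stdin in your program: xxd -r -p data.hex | ./myprog". I don't know what that means.

@Raptor2277: That is Unix/DOS shell syntax for running a hex-to-binary program and piping its output into your program's stdin. xxd is a fairly standard Unix program.

From the machine code, arg_10 is 0x18. So the arg_x names match positions above the return address (which takes 8 bytes on x86-64). So [rsp+arg_10] is 0x10 bytes above the bottom of the stack arg space. Strange.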
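The relationship between the raw displacements and the arg_x names can be checked with a short Python sketch. The helper name arg_name is made up for this illustration; the displacements are the ones visible in the machine-code bytes of the listing.

```python
RETURN_ADDRESS_SIZE = 8  # bytes taken by the return address on x86-64

def arg_name(displacement):
    """Name a stack slot by its distance above the return address."""
    return f"arg_{displacement - RETURN_ADDRESS_SIZE:X}"

# Displacements from rsp as encoded in the instruction bytes.
displacements = [0x18, 0x1C, 0x1E, 0x17]

for disp in displacements:
    print(f"[rsp+{disp:#x}] -> {arg_name(disp)}")
```

Subtracting 8 from each encoded displacement reproduces the names used in the listing.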

And here is the MASM procedure that the C++ code calls:

.code

function proc
    xor r9d, r9d
    mov dword ptr [rsp + 18h], 6D697250h 
    mov word ptr [rsp + 1Ch], 5365h 
    mov byte ptr [rsp + 1Eh], 6Fh 
    movsxd r8, ecx
    test ecx, ecx
    jle short locret_140001342 
    mov ecx, r9d

loc_140001329:
    cmp rcx, 7
    cmovge rcx, r9
    inc rcx 
    mov al, [rsp + rcx + 17h]
    xor [rdx], al
    inc rdx
    dec r8
    jnz short loc_140001329

locret_140001342:
    ret

function endp
end