Messing with the stack in assembly and C++

I want to do the following:

I have a function that is not mine (it doesn't really matter here; I just mean that I don't have control over it) that I want to patch so that it calls a function of mine, preserving the argument list (jumping is not an option).

What I'm trying to do is put the stack pointer back to where it was before that function was called and then call mine (like going back and doing the same thing again, but with a different function). This doesn't work directly because the stack gets messed up. I believe that when I do the call, it overwrites the return address. So I added a step that preserves the return address by saving it in a global variable. That works, but it is not OK because I want it to survive recursion, if you know what I mean. Anyway, I'm a newbie in assembly, so that's why I'm here.

Please don't point me to existing software that does this, because I want to do things my own way.

Of course, this code has to be compiler and optimization independent.

My code (if it is longer than what is acceptable, please tell me how to post it):

// A function that is not mine but to which I have access and want to patch so that it calls a function of mine with its original arguments
void real(int a,int b,int c,int d)
{

}

// A function that I want to be called, receiving the original arguments
void receiver(int a,int b,int c,int d)
{
 printf("Arguments %d %d %d %d\n",a,b,c,d);
}

long helper;

// A patch to apply in the "real" function and on which I will call "receiver" with the same arguments that "real" received.
__declspec( naked ) void patch()
{
 _asm
 {
  // These first two instructions save the return address in a global variable.
  // If I don't save and restore it, the program won't work correctly.
  // I want to do this without having to use a global variable
  mov eax, [ebp+4]
  mov helper,eax

  push ebp
  mov ebp, esp

  // Make that the stack becomes as it were before the real function was called
  add esp, 8

  // Calls our receiver 
  call receiver

  mov esp, ebp
  pop ebp

  // Restores the return address previously saved
  mov eax, helper
  mov [ebp+4],eax

  ret
 }
}

int _tmain(int argc, _TCHAR* argv[])
{
 DWORD oldProtection;
 VirtualProtect(&real,5,PAGE_EXECUTE_READWRITE,&oldProtection);

 // Patching the real function to go to my patch
 ((unsigned char*)real)[0] = 0xE9;
 *((long*)((long)(real) + sizeof(unsigned char))) = (char*)patch - (char*)real - 5;

 // Flush the instruction cache only after the new bytes have been written
 FlushInstructionCache(GetCurrentProcess(),&real,5);

 // Calling the real function. I'm calling it with inline assembly because otherwise
 // it seems to work as if it were unpatched (that is strange, but irrelevant here)
 _asm
 {
  push 666
  push 1337
  push 69
  push 100
  call real
  add esp, 16
 }

 return 0;
}

Prints (and has to):

Arguments 100 69 1337 666
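
For reference, the 5-byte patch written in _tmain is an x86 near jump: the opcode 0xE9 followed by a 32-bit displacement measured from the end of the jump instruction. The following is only a sketch of that arithmetic, kept separate from any real memory patching. The names make_jmp_rel32 and jmp_rel32_target are hypothetical helpers, and addresses are passed as plain 32-bit integers:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: encodes "jmp rel32" (opcode 0xE9) placed at address
// `from` so that it lands on address `to`. Assumes 32-bit x86 addresses.
std::array<std::uint8_t, 5> make_jmp_rel32(std::uint32_t from, std::uint32_t to)
{
    // The displacement is relative to the instruction that follows the jump,
    // i.e. from + 5. Unsigned arithmetic wraps, which also yields the correct
    // two's-complement value for backward jumps.
    const std::uint32_t rel = to - (from + 5u);

    std::array<std::uint8_t, 5> code{};
    code[0] = 0xE9;
    // x86 stores the displacement in little-endian byte order.
    for (int i = 0; i < 4; ++i)
        code[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    return code;
}

// Inverse of the above: given the jump's address and its bytes, recover the target.
std::uint32_t jmp_rel32_target(std::uint32_t from, const std::array<std::uint8_t, 5>& code)
{
    assert(code[0] == 0xE9);
    std::uint32_t rel = 0;
    for (int i = 0; i < 4; ++i)
        rel |= static_cast<std::uint32_t>(code[1 + i]) << (8 * i);
    return from + 5u + rel;
}
```

This is the same "target - source - 5" computation as in the patching code, just written so that the byte order and the wrap-around for backward jumps are explicit.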

Edit:

Code I'm testing following Vlad's suggestion (still not working):

// A patch to apply in the real function and on which I will call receiver with the same arguments that "real" received.
__declspec( naked ) void patch()
{
    _asm
    {
        jmp start

        mem:

        nop
        nop
        nop
        nop

        start :

        // These first two instructions save the return address in the "mem" scratch area.
        // If I don't save and restore it, the program won't work correctly.
        // I want to do this without having to use a global variable
        mov eax, [ebp+4]
        mov mem, eax

        push ebp
        mov ebp, esp

        // Make that the stack becomes as it were before the real function was called
        add esp, 8

        // Calls our receiver 
        call receiver

        mov esp, ebp
        pop ebp

        // Restores the return address previously saved
        mov eax, mem
        mov [ebp+4],eax

        ret
    }
}

The following code excerpts have been tested with mingw-g++, but should work in VC++ with minor modifications. The full sources are available from Launchpad [1].

The only way we can safely save call-specific data is to store it on the stack. One way is to rotate part of the stack.

patch.s excerpt (patchfun-rollstack):

sub esp, 4          # allocate scratch space

mov eax, DWORD PTR [esp+4]  # first we move down
mov DWORD PTR [esp], eax    # our return pointer

mov eax, DWORD PTR [esp+8]  # then our parameters
mov DWORD PTR [esp+4], eax
mov eax, DWORD PTR [esp+12]
mov DWORD PTR [esp+8], eax
mov eax, DWORD PTR [esp+16]
mov DWORD PTR [esp+12], eax
mov eax, DWORD PTR [esp+20]
mov DWORD PTR [esp+16], eax

mov eax, DWORD PTR [esp]    # save return pointer
mov DWORD PTR [esp+20], eax # behind arguments

add esp, 4          # free scratch space
call    __Z8receiveriiii

mov eax, DWORD PTR [esp+16] # restore return pointer
mov DWORD PTR [esp], eax

ret

We omitted ebp here. If we add it, we need 8 bytes of scratch space and must save and restore ebp as well as eip. Note that when we restore the return pointer, we overwrite the slot that holds the a parameter. To avoid that, we would need to rotate the stack back again.
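
To make the rotation easier to follow, here is a small C++ model of the six stack words involved. It is only an illustration of the data movement, not something that touches the real stack. Frame, roll_stack and restore_after_call are hypothetical names, and the slot numbering (slot 0 at esp right after the sub esp, 4) is an assumption of the model:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of the six stack words touched by the rotation, indexed
// upward from esp right after "sub esp, 4":
//   [0] scratch, [1] return address, [2] a, [3] b, [4] c, [5] d
using Frame = std::array<std::uint32_t, 6>;

// Mirrors the mov sequence of the excerpt: every word moves down one slot
// and the return address ends up behind the arguments.
void roll_stack(Frame& s)
{
    s[0] = s[1];                    // return pointer moves down into scratch
    for (int i = 1; i <= 4; ++i)    // then a, b, c, d move down one slot each
        s[i] = s[i + 1];
    s[5] = s[0];                    // return pointer is saved behind d
}

// Mirrors the two movs after "call receiver". By then "add esp, 4" has run,
// so in this model esp points at slot 1 and [esp+16] is slot 5.
void restore_after_call(Frame& s)
{
    s[1] = s[5];                    // overwrites a, as noted in the text
}
```

After roll_stack, slots 1 to 4 hold a, b, c, d exactly where receiver expects them once the call has pushed its own return address, and slot 5 keeps the original return address safe across the call.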

The other way is to have the callee know about the extra data on the stack and ignore it.

patch.s (patchfun-ignorepointers):

push    ebp
mov ebp, esp
call    receiver
leave
ret

receiver.cc:

void receiver(const void *ebp, const void *eip, int a,int b,int c,int d)
{
 printf("Arguments %d %d %d %d\n",a,b,c,d);
}

Here I included ebp; if you remove it from the asm, all that remains is the call and the ret, and receiver only needs to accept and ignore eip.

Of course, all of this is mostly for fun and curiosity. There really is no major advantage over the simple solution:

void patch(int a,int b,int c,int d)
{
 receiver(a,b,c,d);
}

The generated assembly would be shorter than our stack-roll, but would need 16 bytes more of stack, because the values are copied to a fresh area below the patch() stack frame.

(Actually the gcc-generated asm allocates 28 bytes on the stack even though it uses only 16. I'm not sure why. Most likely the extra 12 bytes are padding: by default gcc keeps the stack 16-byte aligned on x86, and 28 bytes plus the 4-byte return address make 32.)

You have add esp, 8 in one place and add esp, 16 in another. One of them must be wrong.

Edit:
Oh, I see: the add esp, 8 removes from the stack the ebp pushed two instructions earlier, plus the return address.

At [ebp+4], however, there must be the return address of the call to _tmain, because ebp still points to _tmain's frame at that point.

Edit2:
You can allocate an "internal" variable with something like this:

  call next
  dd 0
next:
  pop eax
  mov [eax], yourinfo

But it's still not clear why we need to save that value at all.

Edit3: (removed, was wrong)

Edit4:
Another idea:

__declspec( naked ) void patch()
{
 _asm
 {
  call next
  // here we temporarily save the arguments
  dd   0
  dd   0
  dd   0
  dd   0
next:
  pop eax
  // eax points to the first dd

  // now store the args
  pop edx
  mov [eax], edx
  pop edx
  mov [eax+4], edx
  pop edx
  mov [eax+8], edx
  pop edx
  mov [eax+12], edx

  // now we can push the value
  mov edx, [ebp+4]
  push edx

  // now, push the args again
  mov edx, [eax+12]
  push edx
  mov edx, [eax+8]
  push edx
  mov edx, [eax+4]
  push edx
  mov edx, [eax]
  push edx

  // now continue with the old code
  // --------------------------------
  // restore the arguments    
  push ebp
  mov ebp, esp

  // Make that the stack becomes as it were before the real function was called
  add esp, 8

  // Calls our receiver 
  call receiver

  mov esp, ebp
  pop ebp

  // ----------------------------
  pop edx
  mov [ebp+4], edx

  ret
 }
}

This solution survives recursion, but not simultaneous execution from two different threads.

I have never used C++ for low-level stuff like this, so I won't go into the specifics of your example. In general, though, if you want to intercept a call and have your logic support recursion, you have two options: either copy the entire stack frame (the parameters) and call the 'hooked' original with the new copy or, if that's not feasible, maintain your own little stack for holding the original return address (e.g. as a linked list rooted in a TLS-based data structure).
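
A minimal sketch of the second option, assuming a C++11 compiler with thread_local support. It uses a std::vector in place of the linked list, and the names saved_returns, save_return and restore_return are hypothetical. How the hook obtains the address and calls these functions is left out:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-thread side stack for saved return addresses. Each thread
// gets its own vector, so concurrent calls do not interfere, and the LIFO
// order matches nested (recursive) calls.
inline std::vector<std::uintptr_t>& saved_returns()
{
    thread_local std::vector<std::uintptr_t> stack;
    return stack;
}

// Would be called on entry to the hook, before the real return address is replaced.
void save_return(std::uintptr_t address)
{
    saved_returns().push_back(address);
}

// Would be called on exit from the hook to get back the matching address.
std::uintptr_t restore_return()
{
    std::vector<std::uintptr_t>& stack = saved_returns();
    assert(!stack.empty() && "restore_return without a matching save_return");
    const std::uintptr_t address = stack.back();
    stack.pop_back();
    return address;
}
```

Because every entry is paired with exactly one exit, the most recently saved address always belongs to the innermost active call. An asm stub calling these helpers would still have to preserve any registers that the calls clobber.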

During normal execution, the operands of the function are pushed onto the stack in reverse order. When executing the call opcode, the processor pushes the EIP (or CS/IP) register onto the stack. This is the return address. When execution reaches the function you want to replace, this is how the stack looks:

Return address 1
Operand 1
Operand 2
Operand 3

At this point you're going to call your own function which will have a stack like this:

Return address 2
Return address 1
Operand 1
Operand 2
Operand 3

Your function will need to know that there is an extra DWORD on the stack while it does its work. This is easy to handle if you've written your replacement function in assembly as well: just add 4 to the offset whenever you reference an operand through ESP. When you execute RET in your function, the top return address (Return address 2) will be popped and execution will return to the function you're replacing. The stack will once again be:

Return address 1
Operand 1
Operand 2
Operand 3

Executing RET in this function will once again pop the return address from the stack and return control to the calling function. This leaves your operands on the stack, which results in corruption unless the caller removes them itself (as it does under cdecl). If the callee is responsible for the cleanup, I suggest using RET with the total size of the function operands in bytes, like this: