Intro
I’m new to everything demonstrated below, so be forewarned of potential errors. This technique of drawing the stack was something a professor hammered into the class this year. Let’s dive in!
Author Assigned Level: Newbie
Required Skills
The examples use C and x86-64 assembly, so some familiarity with both is useful. They should, however, be simple enough for any beginner to read and understand from context.
C
Nothing fancy here:
#include <stdio.h>
int callMe(int num1, int num2);
int main() {
    int a = 0;
    int b = 2;
    int c = callMe(a, b);
    printf("%d\n", c);
    return 0;
}
int callMe(int num1, int num2) {
    int ans = num1+num2;
    return ans;
}
Assembly
; main()
<+0>: push rbp
<+1>: mov rbp,rsp
<+4>: sub rsp,0x10
<+8>: mov DWORD PTR [rbp-0xc],0x0
<+15>: mov DWORD PTR [rbp-0x8],0x2
<+22>: mov edx,DWORD PTR [rbp-0x8]
<+25>: mov eax,DWORD PTR [rbp-0xc]
<+28>: mov esi,edx
<+30>: mov edi,eax
<+32>: call 0x68f <callMe>
<+37>: mov DWORD PTR [rbp-0x4],eax
<+40>: mov eax,DWORD PTR [rbp-0x4]
<+43>: mov esi,eax
<+45>: lea rdi,[rip+0xb6] # 0x734
<+52>: mov eax,0x0
<+57>: call 0x530 <printf@plt>
<+62>: mov eax,0x0
<+67>: leave
<+68>: ret
; callMe()
<+0>: push rbp
<+1>: mov rbp,rsp
<+4>: mov DWORD PTR [rbp-0x14],edi
<+7>: mov DWORD PTR [rbp-0x18],esi
<+10>: mov edx,DWORD PTR [rbp-0x14]
<+13>: mov eax,DWORD PTR [rbp-0x18]
<+16>: add eax,edx
<+18>: mov DWORD PTR [rbp-0x4],eax
<+21>: mov eax,DWORD PTR [rbp-0x4]
<+24>: pop rbp
<+25>: ret
In short, main() passes a and b to callMe() through edi and esi, which returns a + b in eax.
The stack
Now that we can see the assembly, we can draw out exactly how the stack looks and have a better understanding of what our code is doing. A practical use for drawing the stack is for debugging, but it’s also a fun exercise (which is what spawned this topic)!
The stack grows from high addresses to low addresses, top to bottom. Typically, we reason about the stack such that the low addresses are on top; in this sense, we will be using an “inverted stack” in our depictions.
main(): lines <+0> and <+1>
<+0>: push rbp
<+1>: mov rbp,rsp
The very first stack modification is push rbp, from main(). rbp is a 64-bit register, so it takes up 4 rows on the stack (each row is word-sized, which is 2 bytes):
< high addresses >
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP and RSP (main)
|------|
|      |
< low addresses >
rbp is used as a helper register for performing stack functions. It is used as a “base” when traversing up or down through the stack. Using rbp in this manner is also a way to set up and use local variables, which we will talk about later on.
A common analogy for this technique is “dropping an anchor”. rsp changes constantly, as the stack grows and shrinks; copying its current value to rbp “anchors” it for however you want to use it, while still allowing you to push and pop.
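The anchor idea can be sketched with a toy model in C. This is only an illustration under stated assumptions: the struct machine type, the push64, pop64 and prologue helpers, and the 64-byte memory are all hypothetical, and rsp and rbp are plain offsets into an array rather than real registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: a small byte array stands in for stack memory,
   and rsp/rbp are offsets into it rather than real addresses. */
enum { STACK_SIZE = 64 };

struct machine {
    uint8_t  mem[STACK_SIZE];
    uint64_t rsp;   /* offset of the current top of the stack */
    uint64_t rbp;   /* the "anchor" for the current frame */
};

/* push: move rsp down 8 bytes, then store the value there. */
static void push64(struct machine *m, uint64_t value) {
    assert(m->rsp >= 8 && m->rsp <= STACK_SIZE);
    m->rsp -= 8;
    memcpy(&m->mem[m->rsp], &value, sizeof value);
}

/* pop: load 8 bytes from rsp, then move rsp back up. */
static uint64_t pop64(struct machine *m) {
    uint64_t value;
    assert(m->rsp + 8 <= STACK_SIZE);
    memcpy(&value, &m->mem[m->rsp], sizeof value);
    m->rsp += 8;
    return value;
}

/* The two-instruction prologue: push rbp; mov rbp,rsp. */
static void prologue(struct machine *m) {
    push64(m, m->rbp);   /* save the caller's anchor */
    m->rbp = m->rsp;     /* drop a new anchor at the current top */
}
```

After prologue() runs, further pushes and pops move rsp but leave rbp where it was, which is the whole point of the anchor.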
Moving on!
main(): line <+4>
<+4>: sub rsp,0x10
This instruction makes space on the stack to store local variables, in our case int a, int b, and int c, each 32 bits. rsp normally moves only as a side effect of push and pop, but adjusting the stack pointer directly is how space for local variables is reserved.
Since a, b, and c are all 32-bit integers, we need a minimum of 96 bits (12 bytes) on the stack. On this line, we can see that 16 bytes have been allocated, by subtracting 0x10 from rsp's current value; the compiler rounds up because the x86-64 calling convention keeps the stack 16-byte aligned at each call. Here is what our stack looks like after this instruction:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
|      | -0x6
|------|
|      | -0x8
|------|
|      | -0xa
|------|
|      | -0xc
|------|
|      | -0xe
|------|
|      | <-- RSP (main)
|------|
|      |
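One plausible way to arrive at 0x10 is to add up the sizes of main()'s three int locals (a, b and c, 12 bytes in total) and round up to the next multiple of 16. The sketch below shows only that arithmetic; the helper names align_up and frame_bytes_for_ints are hypothetical, and a real compiler is free to lay out a frame differently.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align.
   align must be a non-zero power of two. */
static size_t align_up(size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes to subtract from rsp for count int-sized locals,
   assuming (hypothetically) that frames grow in 16-byte steps. */
static size_t frame_bytes_for_ints(size_t count) {
    return align_up(count * sizeof(int), 16);
}
```

With 4-byte ints, frame_bytes_for_ints(3) gives 16, which matches the sub rsp,0x10 in the listing.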
Now that we have this space allocated, rsp is once again “fixed” at its current location; we can use rbp to access the space we just reserved.
As we will see, this is done by subtracting some number of bytes, 0xn, from rbp, such that [rbp-0xn] points to the last byte of the object as drawn (its lowest-addressed byte). How far up to read is determined by another factor, which we will see below.
main(): lines <+8> - <+25>
<+8>: mov DWORD PTR [rbp-0xc],0x0
<+15>: mov DWORD PTR [rbp-0x8],0x2
<+22>: mov edx,DWORD PTR [rbp-0x8]
<+25>: mov eax,DWORD PTR [rbp-0xc]
Within the first two lines, we are initializing our local variables, a and b, to 0 and 2, respectively.
When accessing the stack, the assembler cannot always infer the size of the object we want to retrieve (or set), so the size is spelled out explicitly. DWORD PTR [rbp-0xc] tells the assembler “hey, we want you to look at the value at [rbp-0xc] and read 4 bytes (2 words, specified by DWORD)”.
As mentioned above, [rbp-0xn], for some hex n, points to the bottom of the object in our drawings: its lowest-addressed byte. So for DWORD PTR [rbp-0xc], the CPU goes to [rbp-0xc] on the stack, and then reads “up” 4 bytes, toward higher addresses.
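To make the "start at the lowest byte, then read up" rule concrete, here is a small C sketch. It assumes a byte array standing in for stack memory and an rbp that is just an index into it; store_dword and load_dword are hypothetical helpers that mimic the two forms of mov used in the listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mimics: mov DWORD PTR [rbp-offset], value
   The 4 bytes written start at mem[rbp - offset] and extend toward
   higher indices. The caller must keep the access inside mem. */
static void store_dword(uint8_t *mem, size_t rbp, size_t offset, uint32_t value) {
    assert(offset <= rbp);
    memcpy(&mem[rbp - offset], &value, sizeof value);
}

/* Mimics: mov eax, DWORD PTR [rbp-offset] */
static uint32_t load_dword(const uint8_t *mem, size_t rbp, size_t offset) {
    uint32_t value;
    assert(offset <= rbp);
    memcpy(&value, &mem[rbp - offset], sizeof value);
    return value;
}
```

Storing 0 at offset 0xc and 2 at offset 0x8 fills two adjacent 4-byte slots that do not overlap, the same layout main() uses for a and b.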
With that said, here’s how our stack looks now that some values have been initialized:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
| 0000 | -0x6 \
|------|       | int b = 2;
| 0002 | -0x8 /
|------|
| 0000 | -0xa \
|------|       | int a = 0;
| 0000 | -0xc /
|------|
|      | -0xe
|------|
|      | <-- RSP (main)
|------|
|      |
- Note: for the purposes of this tutorial, we will not take into account endianness
As you can clearly see, we have initialized some local variables on the stack without modifying rsp or rbp.
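Each 32-bit int spans two of the 2-byte rows in the drawing. A minimal sketch of that split follows, assuming a little-endian machine such as x86-64 and ignoring byte order within a row, as the drawings do. The helper name dword_to_rows is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 32-bit value into the two 16-bit rows used in the drawings.
   rows[0] is the row at the lower address (drawn at the bottom),
   rows[1] is the row at the higher address (drawn just above it). */
static void dword_to_rows(uint32_t value, uint16_t rows[2]) {
    assert(rows != NULL);
    rows[0] = (uint16_t)(value & 0xFFFFu);  /* low half, lower address */
    rows[1] = (uint16_t)(value >> 16);      /* high half, higher address */
}
```

For b = 2 this gives 0002 in the lower row and 0000 in the row above it, which is how b is drawn at -0x8 and -0x6.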
The other two lines do not modify the stack, but simply retrieve values from it. The notation is the same as for the stores above, except that the memory operand is now the second (source) operand of the instruction rather than the first (destination).
The important thing to note is that we have copied a into eax and b into edx.
Onward!
main(): lines <+28> - <+32>
<+28>: mov esi,edx
<+30>: mov edi,eax
<+32>: call 0x68f <callMe>
These first two lines are nothing special: we are moving a into edi and b into esi. These two registers are used to pass the first and second arguments to the callMe() function, which is what happens next.
When the call instruction is executed, it pushes the address of the next instruction (the current value of rip) onto the stack; it then copies the address of the called procedure into rip. This modifies our stack:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
| 0000 | -0x6 \
|------|       | int b = 2;
| 0002 | -0x8 /
|------|
| 0000 | -0xa \
|------|       | int a = 0;
| 0000 | -0xc /
|------|
|      | -0xe
|------|
|      | -0x10
|------|
| **** | -0x12   * the return address is pushed
|------|           below the old RSP, so a and b
| **** | -0x14     are left untouched
|------|
| **** | -0x16
|------|
| **** | -0x18 <-- RSP; saved RIP (main)
|------|
|      |
At this point, we are now in callMe().
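The effect of call on the stack can be checked with a toy model: the return address is stored in the 8 bytes just below the current rsp, so anything at or above rsp (including main()'s locals) is left alone. The sketch below assumes a byte array as memory and uses hypothetical helpers (store32, load32, load64, do_call); the addresses are arbitrary offsets chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 128 };

static void store32(uint8_t *mem, uint64_t addr, uint32_t v) {
    memcpy(&mem[addr], &v, sizeof v);
}

static uint32_t load32(const uint8_t *mem, uint64_t addr) {
    uint32_t v;
    memcpy(&v, &mem[addr], sizeof v);
    return v;
}

static uint64_t load64(const uint8_t *mem, uint64_t addr) {
    uint64_t v;
    memcpy(&v, &mem[addr], sizeof v);
    return v;
}

/* Stack side of `call`: move rsp down 8 bytes and store the return
   address there. Returns the new rsp. */
static uint64_t do_call(uint8_t *mem, uint64_t rsp, uint64_t return_address) {
    assert(rsp >= 8 && rsp <= MEM_SIZE);
    rsp -= 8;
    memcpy(&mem[rsp], &return_address, sizeof return_address);
    return rsp;
}
```

Starting from rsp = rbp - 0x10, as in main(), the return address ends up at rbp - 0x18 and the values stored at rbp - 0xc and rbp - 0x8 are unchanged.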
callMe(): lines <+0> and <+1>
<+0>: push rbp
<+1>: mov rbp,rsp
This should look familiar…
We are once again dropping an anchor, this time saving the “old” rbp's value (since we use it again after this function returns). Here’s how our stack looks now:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
|      |
|------|
|      |
|------|
| 0000 |
|------|
| 0002 |
|------|
| 0000 |
|------|
| 0000 |
|------|
|      |
|------|
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP, RSP (callMe)
|------|
|      |
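One important thing to notice here is that, apart from the push itself, callMe() never moves rsp: there is no sub rsp to reserve space for its locals. callMe() calls nothing else, so the compiler can simply use the memory just below rsp (on x86-64 Linux, the calling convention reserves the 128 bytes below rsp, the “red zone”, for exactly this). It also means a single pop rbp is enough to make the stack “look the same” when we leave the function as when we entered. This will be covered in a bit more detail later on.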
callMe(): lines <+4> - <+13>
<+4>: mov DWORD PTR [rbp-0x14],edi
<+7>: mov DWORD PTR [rbp-0x18],esi
<+10>: mov edx,DWORD PTR [rbp-0x14]
<+13>: mov eax,DWORD PTR [rbp-0x18]
These instructions should also look familiar, as they were executed earlier almost verbatim. The important difference is that in the first two lines here, we are initializing int num1 and int num2 with the values inside edi and esi, respectively.
Here is how our stack looks after these instructions:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
|      |
|------|
|      |
|------|
| 0000 |
|------|
| 0002 |
|------|
| 0000 |
|------|
| 0000 |
|------|
|      |
|------|
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP, RSP (callMe)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
|      | -0x6
|------|
|      | -0x8
|------|
|      | -0xa
|------|
|      | -0xc
|------|
|      | -0xe
|------|
|      | -0x10
|------|
| 0000 | -0x12 \
|------|        | int num1 = 0;
| 0000 | -0x14 /
|------|
| 0000 | -0x16 \
|------|        | int num2 = 2;
| 0002 | -0x18 /
|------|
|      |
You will notice that the compiler left far more unused space between rbp and these two values than necessary; I am not sure why that much padding is there, but the best answer I can find is here. Please join in on the discussion in the comments if you can explain this further.
Now to the guts of this function!
callMe(): lines <+16> - <+21>
<+16>: add eax,edx
<+18>: mov DWORD PTR [rbp-0x4],eax
<+21>: mov eax,DWORD PTR [rbp-0x4]
These instructions are pretty straightforward: add a (in edx) to b (in eax) and store that value, ans, on the stack at [rbp-0x4]:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
|      |
|------|
|      |
|------|
| 0000 |
|------|
| 0002 |
|------|
| 0000 |
|------|
| 0000 |
|------|
|      |
|------|
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP, RSP (callMe)
|------|
| 0000 | -0x2 \
|------|       | int ans = 2;
| 0002 | -0x4 /
|------|
|      | -0x6
|------|
|      | -0x8
|------|
|      | -0xa
|------|
|      | -0xc
|------|
|      | -0xe
|------|
|      | -0x10
|------|
| 0000 | -0x12 \
|------|        | int num1 = 0;
| 0000 | -0x14 /
|------|
| 0000 | -0x16 \
|------|        | int num2 = 2;
| 0002 | -0x18 /
|------|
|      |
We then pull that value back into eax and move on.
callMe(): lines <+24> and <+25>
<+24>: pop rbp
<+25>: ret
This first line is what I mentioned earlier: we want the stack to look the same when we leave a function as it did when we entered. Because callMe() never moved rsp after its own push rbp, rsp is still pointing at the saved copy of rbp's old value from when we used it in main(), so a single pop rbp restores it.
The other “cool” thing about not shifting rsp on this go-around has to do with ret. When ret is executed, it pops the top of the stack into rip. After popping rbp, rsp points to the return address into main(), so ret resumes main() at <+37>. This also means that those variables we just stored on the stack within callMe() will get “wiped out”.
Now I put “wiped out” in quotes because what’s actually happening is that rsp changes and we lose access to those locals. They don’t get erased though (until a later push or call overwrites them), which raises some fun vulnerabilities outside the scope of this tutorial.
As you can see, popping rbp and returning significantly changes our stack:
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
| 0000 | -0x6
|------|
| 0002 | -0x8
|------|
| 0000 | -0xa
|------|
| 0000 | -0xc
|------|
|      | -0xe
|------|
|      | <-- RSP (main)
|------|
|      |
rbp has been restored to its original (to main()) value, and rsp is back where it was just before the call, directly above the slot that held the return address (before it was popped).
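The whole round trip (call, prologue, a store below rsp, pop rbp, ret) can be replayed on a toy model to check where rsp and rbp end up. Everything here is a sketch under assumptions: struct cpu, its helpers, the 256-byte memory and the addresses used are hypothetical stand-ins, not how a real CPU is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 256 };

/* Hypothetical toy CPU state: offsets into mem stand in for addresses. */
struct cpu {
    uint8_t  mem[MEM_SIZE];
    uint64_t rsp, rbp, rip;
};

static void push64(struct cpu *c, uint64_t v) {
    assert(c->rsp >= 8 && c->rsp <= MEM_SIZE);
    c->rsp -= 8;
    memcpy(&c->mem[c->rsp], &v, sizeof v);
}

static uint64_t pop64(struct cpu *c) {
    uint64_t v;
    assert(c->rsp + 8 <= MEM_SIZE);
    memcpy(&v, &c->mem[c->rsp], sizeof v);
    c->rsp += 8;
    return v;
}

/* call: push the return address, then jump to the target. */
static void do_call(struct cpu *c, uint64_t target, uint64_t return_address) {
    push64(c, return_address);
    c->rip = target;
}

/* Callee prologue plus one local store: push rbp; mov rbp,rsp;
   mov DWORD PTR [rbp-0x4],ans. Note that rsp is not lowered for the local. */
static void enter_callee(struct cpu *c, uint32_t ans) {
    push64(c, c->rbp);
    c->rbp = c->rsp;
    assert(c->rbp >= 0x4);
    memcpy(&c->mem[c->rbp - 0x4], &ans, sizeof ans);
}

/* Callee epilogue: pop rbp; ret. */
static void leave_callee(struct cpu *c) {
    c->rbp = pop64(c);   /* restore the caller's anchor */
    c->rip = pop64(c);   /* ret: pop the return address into rip */
}
```

Running do_call, enter_callee and leave_callee in sequence leaves rsp and rbp exactly where they started, while the bytes written for ans are still sitting in memory below rsp, which is the "not really wiped out" effect described above.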
Onto the final instructions!
main(): line <+37>
<+37>: mov DWORD PTR [rbp-0x4],eax
callMe() passed our answer (2) back to main() inside eax; this instruction simply stores that value on the stack, as c, at [rbp-0x4]:
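|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
| 0000 | -0x2 \
|------|       | int c = 2;
| 0002 | -0x4 /
|------|
| 0000 | -0x6
|------|
| 0002 | -0x8
|------|
| 0000 | -0xa
|------|
| 0000 | -0xc
|------|
|      | -0xe
|------|
|      | <-- RSP (main)
|------|
|      |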
BOOM!
At this point, we have saved a+b on the stack and, for the purposes of this tutorial, we are finished. The next set of instructions is setting up to printf() our calculated value, but we have covered all the fun stuff now.
Conclusions
Drawing the stack can help you debug, explore vulnerabilities, or kill some time; it assists in giving you an empirical understanding of exactly what a program is doing.
This example was simple (yet still required a long-winded post), but you can do this for anything you want.
Carry on!