The key to writing an exploit is understanding what you actually have to modify to get the program to execute your instructions. This involves working closely with the stack and with the architecture of the system at the assembly level. Knowing assembly alone is not enough, however, and it is extremely helpful to know C as well. The intricacies of exploit development also go beyond the technical aspects: it requires creativity and the ability to analyze software and discover new vulnerabilities. Exploit development is no easy task, and it often requires advanced techniques and knowledge, which makes it hard for a lone attacker to end up with a working exploit.
Overflow Exploitation
Let's start with some fundamentals first: memory.
The principle of exploiting a buffer overflow is to overwrite parts of memory that are not supposed to be overwritten with arbitrary input, and then to make the process execute that input as code. To see how and where an overflow takes place, let's take a look at how memory is organized:
code segment: data in this segment are assembler instructions that the processor executes. Code execution is non-linear: it can skip code, jump, and call functions on certain conditions. Therefore, there is a pointer called EIP, the instruction pointer. The address that EIP points to always contains the code that will be executed next.
data segment: space for variables and dynamic buffers.
stack segment: used to pass data (arguments) to functions and as space for the local variables of functions. The bottom (start) of the stack usually resides at the very end of the virtual memory of a page, and the stack grows down. The assembler instruction PUSHL adds an item to the top of the stack, and POPL removes one item from the top of the stack and puts it in a register. For accessing stack memory directly, there is the stack pointer ESP, which points at the top (lowest memory address) of the stack.
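The PUSHL and POPL behaviour described above can be sketched as a toy model in C. This is only an illustration under stated assumptions: the toy_stack type, its 64-byte size, and the names toy_init, toy_pushl and toy_popl are invented for this example, and esp here is an offset into an array rather than a real address.

```c
#include <stdint.h>
#include <string.h>

/* Toy model of an x86 stack: a small byte array stands in for memory,
   and esp is an offset into it (hypothetical, not a real address). */
#define STACK_BYTES 64u

typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;              /* offset of the current top of stack */
} toy_stack;

/* The stack starts empty, with esp just past the highest offset. */
void toy_init(toy_stack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;
}

/* pushl: move esp down by 4, then store the value there.
   Returns 0 on success, -1 if the toy stack is full. */
int toy_pushl(toy_stack *s, uint32_t value) {
    if (s->esp < 4u) return -1;
    s->esp -= 4u;
    memcpy(&s->mem[s->esp], &value, 4);
    return 0;
}

/* popl: load the value at esp, then move esp up by 4.
   Returns 0 on success, -1 if the toy stack is empty. */
int toy_popl(toy_stack *s, uint32_t *out) {
    if (s->esp > STACK_BYTES - 4u) return -1;
    memcpy(out, &s->mem[s->esp], 4);
    s->esp += 4u;
    return 0;
}
```

After toy_init, every toy_pushl lowers esp by four and every toy_popl raises it again, so the most recently pushed value always sits at the lowest used offset, which is the "top" of the stack in the sense used above.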
Next, let's take a look at an example of a simple function written in assembly. A function is a piece of code in the code segment that is called, performs a task, and then returns to the previous thread of execution. Optionally, arguments can be passed to a function.
memory address code
0x8054321 <main+x> pushl $0x0
0x8054322 call $0x80543a0 <function>
0x8054327 leave
0x8054328 ret
...
0x80543a0 <function> popl %eax
0x80543a1 addl $0xf00,%eax
0x80543a4 ret
What's going on here? The main function calls function(0). The variable is 0; main pushes it onto the stack and calls the function. The function gets the variable from the stack using popl. (The listing is simplified: on real x86 the call instruction has just pushed the return address, so that is what sits on top of the stack, and real code would read the argument at 4(%esp) instead.) After finishing, the function returns to 0x8054327. Commonly, the function would also push the caller's EBP register onto the stack on entry and restore it before returning. This is the frame pointer concept, which lets the function use its own offsets for addressing, and which is mostly uninteresting when dealing with exploits.
We just have to know what the stack looks like. At the top, we have the internal buffers and variables of the function. After these comes the saved EBP register (32 bits, which is 4 bytes), and then the return address, which is again 4 bytes. Further down are the arguments passed to the function, which are uninteresting to us. In this case, our return address is 0x8054327. It is automatically stored on the stack when the function is called. If there is an overflow somewhere in the code, this return address can be overwritten and changed to point to any location in memory.
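To make that layout concrete, here is a minimal sketch that models one stack frame as a plain byte array. Everything in it is an assumption made for illustration: the toy_frame type, the 32-byte buffer size, the offsets, and the function names are hypothetical, and no real stack is touched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of one 32-bit x86 stack frame, from low to high offsets.
   All sizes and offsets are illustrative assumptions. */
#define BUF_OFF    0u   /* local buffer starts here          */
#define BUF_SIZE  32u   /* assumed size of the local buffer  */
#define EBP_OFF   32u   /* saved EBP, 4 bytes                */
#define RET_OFF   36u   /* return address, 4 bytes           */
#define FRAME_END 40u

typedef struct {
    uint8_t bytes[FRAME_END];
} toy_frame;

/* Store a 32-bit value at the given offset in the frame. */
void frame_put32(toy_frame *f, size_t off, uint32_t v) {
    memcpy(&f->bytes[off], &v, 4);
}

/* Read a 32-bit value from the given offset in the frame. */
uint32_t frame_get32(const toy_frame *f, size_t off) {
    uint32_t v;
    memcpy(&v, &f->bytes[off], 4);
    return v;
}

/* Copy n input bytes to the start of the buffer without checking n
   against BUF_SIZE, which is the bug being modelled. The copy is
   clamped to the frame so the model itself stays inside its array. */
void unchecked_copy(toy_frame *f, const uint8_t *src, size_t n) {
    if (n > FRAME_END) n = FRAME_END;
    memcpy(&f->bytes[BUF_OFF], src, n);
}
```

If the frame is first filled in with a saved EBP and the return address 0x8054327, copying 30 bytes of 'x' leaves both values intact, while copying 40 bytes turns both into 0x78787878.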
What do you think is generally safer: a program dynamically linked to its libraries, or one statically linked to them? With that question in mind, consider a small program that contains an overflow:
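#include <stdio.h>

/* Deliberately vulnerable: gets() does not limit how much it reads.
   (gets() was removed from the language in C11.) */
void foo(void) {
    char small[30];
    gets(small);            /* no bounds check on the 30-byte buffer */
    printf("%s\n", small);
}

int main(void) {
    foo();
    return 0;
}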
Overflowing the program
# ./f00
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# ./f00
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault (core dumped)
# gdb f00 core
(gdb) info registers
eax: 0x24 36
ecx: 0x804852f 134513967
edx: 0x1 1
ebx: 0x11a3c8 1156040
esp: 0xbffffdb8 -1073742408
ebp: 0x787878 7895160
^^^^^^
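EBP is 0x787878, which means that we have written more data onto the stack than the input buffer could hold. 0x78 is the hex representation of 'x'. The buffer was declared with 30 bytes, and the process evidently had 32 bytes at most for it on the stack. We wrote more data into memory than was allocated for user input and therefore overwrote the saved EBP, here with 'xxx' and the terminating zero byte. A few more bytes of input would overwrite the return address with 'xxxx' as well, and the process would then try to resume execution at 0x78787878. In this run the process went on to use the corrupted value 0x787878 as an address, which caused it to get a segmentation fault.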
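One way to read the 0x787878 in the register dump, assuming the buffer really occupied 32 bytes on the stack: the 35 typed characters plus the zero byte that gets() appends make 36 bytes, so the last four bytes ('x', 'x', 'x', 0) land on the saved EBP. x86 is little-endian, so those four bytes load as 0x00787878. The helper name load_le32 below is made up for this sketch.

```c
#include <stdint.h>

/* Combine four bytes stored at increasing addresses into the 32-bit
   value a little-endian CPU such as x86 would load from them. */
uint32_t load_le32(const unsigned char b[4]) {
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Passing the bytes 'x', 'x', 'x', 0 to load_le32 gives 0x787878, which is 7895160 in decimal and matches both columns of the ebp line in the dump.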
Heap vs Stack based overflows
Dynamically allocated variables (those allocated by malloc()) are created on the heap. Unlike the stack, the heap grows upwards on most systems; that is, new variables created on the heap are located at higher memory addresses than older ones. In a simple heap-based buffer overflow attack, an attacker overflows a buffer that is lower on the heap, overwriting other dynamic variables, which can have unexpected and unwanted effects.
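The effect of overflowing a lower heap buffer into a higher one can be sketched with a toy allocator. This is a simplified model, not how malloc() works: the toy_heap type, the bump-pointer allocation, the 16-byte chunk size, and all function names are assumptions for illustration, and real allocators keep bookkeeping data between chunks.

```c
#include <stddef.h>
#include <string.h>

/* Toy heap: one byte array with a bump pointer (hypothetical model). */
#define ARENA_BYTES 64u
#define CHUNK_BYTES 16u

typedef struct {
    unsigned char mem[ARENA_BYTES];
    size_t        next;          /* offset of the next free byte */
} toy_heap;

void heap_init(toy_heap *h) {
    memset(h->mem, 0, sizeof h->mem);
    h->next = 0;
}

/* Hand out n bytes at increasing offsets, so later allocations sit at
   higher addresses. Returns NULL when the arena is exhausted. */
unsigned char *heap_alloc(toy_heap *h, size_t n) {
    if (n > ARENA_BYTES - h->next) return NULL;
    unsigned char *p = &h->mem[h->next];
    h->next += n;
    return p;
}

/* Allocate two 16-byte chunks, then write write_len bytes of 'x' into
   the first one without checking its size. Returns 1 if the second
   chunk was changed, 0 if not, -1 if allocation failed. */
int neighbour_clobbered(size_t write_len) {
    toy_heap h;
    heap_init(&h);
    unsigned char *first  = heap_alloc(&h, CHUNK_BYTES);
    unsigned char *second = heap_alloc(&h, CHUNK_BYTES);
    if (first == NULL || second == NULL) return -1;
    if (write_len > 2u * CHUNK_BYTES)      /* keep the model in bounds */
        write_len = 2u * CHUNK_BYTES;
    memset(first, 'x', write_len);
    return second[0] == 'x';
}
```

Writing exactly 16 bytes leaves the second chunk alone, while writing 20 bytes spills into it, because the second chunk starts right where the first one ends.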
In contrast, the stack starts at a high memory address and works its way down to lower memory addresses. Placing values on the stack and removing them again are done by the PUSH and POP instructions, respectively. A PUSH decrements the stack pointer and copies the value into the memory location that the stack pointer then points to. Room for local variables is made in the same way, by decrementing the stack pointer directly as the stack moves down, for example with subl $20,%esp. POP is the reverse of a PUSH.
Stack-based overflows are relatively simple in concept. They typically involve functions such as strcat(), sprintf(), strcpy(), gets(), and so on: anywhere unchecked input is placed into a buffer of fixed length. Buffer overflows can be avoided by using safer alternatives such as snprintf() with an appropriate size parameter, which is what the 'n' denotes. The size parameter limits how much is copied into the buffer. When it is set to the complete buffer size, we never write past the end, so we do not create the unwanted overflow and ultimately execute unwanted arbitrary data.
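As a minimal sketch of the snprintf() approach: the function name greet and the "Hello, " format are made up for this example, while snprintf() itself is the standard C library function, which writes at most the given number of bytes (including the terminating zero) and returns the length the full string would have needed.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a greeting into dst, never writing more than dst_size bytes
   (including the terminating zero byte). Returns 1 if the whole
   string fit, 0 if it was truncated or an error occurred. */
int greet(char *dst, size_t dst_size, const char *name) {
    int needed = snprintf(dst, dst_size, "Hello, %s", name);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Called with an 8-byte buffer and a 30-character name, greet stores only "Hello, " plus the terminator and reports truncation, instead of running past the end of the buffer the way sprintf() would.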
End
That's all for now. I hope you learned something. This article was a quick introduction to the fundamentals of overflow exploitation. The key takeaway is that understanding system architectures and being able to think both fast and slow are essential, and that experience plays a significant role in the process. However, if you have a good grasp of the basics, analyzing and breaking down systems won't be a challenge for you.