Null Terminated Programming 101 - x64

Preface

Everyone, class is in session! Please take your seats, as we are about to start.

Today we’re going to dive deep into a magnificent assembly language - x64.

I chose this excellent language for this class because it is so widely used in personal computers and because most researchers know it.

When we speak a language, we often use synonyms to talk about a subject, which makes the conversation richer and more exciting for everyone involved.

We just saw this in the way I described x64 as magnificent and excellent: two different words that mean the same thing.

When writing shellcode, we sometimes need to do something similar, mainly because the way our exploit delivers the shellcode often imposes some sort of limitation.

Some examples:

1. No null bytes, as with strcpy, which stops copying once it encounters a null byte

2. A size limit on how much shellcode we can put

3. Only alphanumeric characters are copied

These conditions force us to develop different ways of writing our shellcode so that it implements the same logic while still bypassing the limitations imposed by the program.

Today we will tackle case #1 by learning techniques for it. These techniques are often called Null Terminated Programming, as they let us write assembly code whose final shellcode contains no null bytes.
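
To make case #1 concrete, here is a minimal Python sketch of what a strcpy-style copy does to a payload. The helper names (strcpy_like, null_offsets) are made up for illustration, and the sample bytes are simply the encoding of mov rax, 0 followed by syscall.

```python
def strcpy_like(payload: bytes) -> bytes:
    """Return the payload bytes that survive a strcpy-style copy:
    everything before the first null byte."""
    end = payload.find(b"\x00")
    return payload if end == -1 else payload[:end]


def null_offsets(payload: bytes) -> list[int]:
    """Return the offset of every null byte in the payload."""
    return [i for i, b in enumerate(payload) if b == 0]


# mov rax, 0 ; syscall
shellcode = bytes.fromhex("48c7c000000000" "0f05")

print(strcpy_like(shellcode).hex())   # 48c7c0
print(null_offsets(shellcode))        # [3, 4, 5, 6]
```

Only the first three bytes would reach the target buffer, so the instruction is cut in half and the syscall never arrives.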

Please note that some of the techniques described here will probably apply to other assembly languages as well, so I’ll leave it as an exercise for the reader to try them on a different language.

Recommended Prerequisites:

  • x64/x86 assembly knowledge
  • Basic knowledge of building shellcode

Setup

For our setup, I’m using a script I wrote for compiling x86/x64 shellcode.

Next, we can use this repository to debug and run our shellcode nicely.

Quick Refresher on x64 syscalls

On x64 Linux systems, each syscall has a special number that represents it. When we want to invoke a certain syscall, we first store its number in RAX, then pass arguments 1-6 in the registers RDI, RSI, RDX, R10, R8 and R9 respectively. (The fourth argument goes in R10 rather than RCX, which a normal function call would use, because the syscall instruction overwrites RCX.)

Finally, we use the syscall instruction, which performs the syscall itself and stores the return value in RAX.
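
As a quick illustration of that convention, here is a small Python sketch that maps a syscall number and its arguments to the registers they belong in. The function name syscall_registers is hypothetical and the code only builds a dictionary; it does not invoke anything. Note that the kernel interface takes the fourth argument in R10.

```python
# Argument registers for the x86-64 Linux syscall instruction, in order.
SYSCALL_ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def syscall_registers(number, *args):
    """Return a dict describing which register holds which value
    right before the syscall instruction executes."""
    if len(args) > len(SYSCALL_ARG_REGS):
        raise ValueError("a syscall takes at most 6 arguments")
    regs = {"rax": number}
    regs.update(zip(SYSCALL_ARG_REGS, args))
    return regs


# open(path, O_RDONLY): syscall number 2, flags 0
print(syscall_registers(2, "address of path", 0))

# read(fd, buf, 16): syscall number 0
print(syscall_registers(0, "fd", "address of buffer", 16))
```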

The full x64 syscall map table can be found here

Let’s have a look at this simple assembly code I wrote named printf_file.asm:


SYS_READ equ 0
SYS_WRITE equ 1
SYS_OPEN equ 2
SYS_EXIT equ 60
AMOUNT_TO_READ equ 16

global _start

section .text

_start:
  jmp get_file_path

continue:
  ; syscall to open the file
  mov eax, SYS_OPEN
  pop rdi ; pop address of string to rdi
  mov rsi, 0 ; set O_RDONLY flag
  syscall

  ; syscall to read file
  sub sp, 0xff ; reserve a buffer on the stack
  lea rsi, [rsp]
  mov rdi, rax ; use the returned fd
  mov rdx, AMOUNT_TO_READ ; amount to read
  mov rax, SYS_READ
  syscall

  ; syscall write to stdout
  mov rdi, 1 ; set stdout fd = 1
  mov rdx, rax ; write to stdout the amount of bytes read
  mov rax, SYS_WRITE
  syscall

  mov rax, SYS_EXIT
  syscall ; finish execution

; jump here in order to get the address of the string
get_file_path:
  call continue
  file_path: db "/tmp/my_file", 0

This code performs a simple task: it reads 16 bytes from the file located at /tmp/my_file and writes those bytes to stdout.

Notice the cool trick we used to obtain the address of the string that holds the file path. To get that address, we first jump to get_file_path, a label placed right before the string, and from there we immediately call the continue label, which brings us back to the rest of our shellcode.

The call pushes its return address onto the stack, and that address points to the string, because the string is the first “instruction” after the call instruction. We then pop that address into RDI, so that RDI (the first parameter in the x64 syscall convention) points to the path of the file we wish to open.
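
The trick has a side effect that matters for null-free shellcode: a call that goes backwards encodes a negative displacement. The Python sketch below encodes a near call (opcode E8 followed by a 32-bit little-endian displacement measured from the end of the instruction). The offsets used are hypothetical and are not taken from the assembled file.

```python
import struct


def encode_call(call_addr: int, target_addr: int) -> bytes:
    """Encode a near `call rel32` located at call_addr that transfers
    control to target_addr."""
    next_instruction = call_addr + 5   # 1 opcode byte + 4 displacement bytes
    rel32 = target_addr - next_instruction
    return b"\xe8" + struct.pack("<i", rel32)


forward = encode_call(0x00, 0x40)    # call to a label further down
backward = encode_call(0x40, 0x02)   # call back up, like get_file_path -> continue

print(forward.hex())    # e83b000000
print(backward.hex())   # e8bdffffff
```

A short forward call pads its displacement with 00 bytes, while a short backward call pads it with ff bytes, so placing the string at the end and calling backwards keeps this instruction free of nulls.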

You can compile the above assembly code by running


path_to_make_shellcode_/make_shellcode_linux/make_shellcode.sh ./printf_file.asm 64

and run it using


path_to_shellrun/shellrun ./printf_file.bin

Let’s check if the shellcode works properly


echo "this_is_my_data" > /tmp/my_file

So far so good.

But under the surface hides a horrible secret…

It’s full of null bytes!!!

Hexdump just showed us that this shellcode is riddled with null bytes.

Let’s begin curing this code by going over ways to rewrite the instructions that generate null bytes.

I will show the opcodes of each instruction on the left side and the instruction itself on the right side.

Note: I’ll be using this website to show the bytes generated from the instructions we’re about to see. I recommend that you all test your instruction combinations there.

Method 1: Math is awesome

I’ll start off by saying that the mov instruction is often unnecessary when you have the power of math on your side.

Loading 0 to a register

Bad way:

Let’s look at the following instruction:

48 c7 c0 00 00 00 00   mov rax, 0 

It is 7 bytes long and, more importantly, contains 4 null bytes!

We can easily use one of the following alternatives instead.

Good way:

48 31 c0    xor rax, rax

Or load -1 and increment it, which wraps around to 0:

48 c7 c0 ff ff ff ff    mov    rax,0xffffffffffffffff
48 ff c0                inc    rax

If rbx already holds 0, we can execute the following instruction instead
(this also works with any other register that holds 0):


48 f7 e3    mul rbx 

The mul instruction multiplies rax by the contents of rbx and stores the 128-bit result in rdx:rax.

Because rbx is 0 in this case, 0 will be stored in rax (and in rdx as well).
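
If you want to convince yourself that the three alternatives really end in 0, here is a small Python model of the 64-bit arithmetic. The function names are made up for this sketch, and the encodings in the table are the ones listed above, with mov rax, 0 included for contrast.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def zero_by_xor(rax: int) -> int:
    """xor rax, rax: any value XORed with itself is 0."""
    return rax ^ rax


def zero_by_inc() -> int:
    """mov rax, 0xffffffffffffffff ; inc rax: the increment wraps to 0."""
    rax = MASK64
    return (rax + 1) & MASK64


def zero_by_mul(rax: int, rbx: int) -> tuple[int, int]:
    """mul rbx: unsigned rax * rbx, returned as the pair (rdx, rax)."""
    product = rax * rbx
    return (product >> 64) & MASK64, product & MASK64


ENCODINGS = {
    "mov rax, 0": "48c7c000000000",
    "xor rax, rax": "4831c0",
    "mov rax, -1": "48c7c0ffffffff",
    "inc rax": "48ffc0",
    "mul rbx": "48f7e3",
}

stale = 0x1122334455667788  # arbitrary leftover register contents
print(zero_by_xor(stale), zero_by_inc(), zero_by_mul(stale, 0))
for text, hexbytes in ENCODINGS.items():
    print(f"{text:14} null-free: {0 not in bytes.fromhex(hexbytes)}")
```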

Loading large values to registers

What about putting large values in registers? For example, if I wanted to read a big file with my shellcode.

Bad way


48 c7 c2 00 00 01 00    mov    rdx,0x10000

Good way

You can use the shift operations in order to load large numbers

48 31 d2               xor    rdx,rdx
48 83 c2 02            add    rdx,0x2
48 c1 e2 0f            shl    rdx,0xf

This will result in rdx having the value 0x10000 at the end of the shift operation.
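
The same idea works for other targets. Below is a rough Python sketch that models the xor/add/shl sequence and searches for (seed, shift) pairs that produce a given value. It assumes the seed has to fit in a positive 8-bit immediate (1 to 127), as in the add rdx,0x2 encoding above; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def add_then_shift(seed: int, shift: int) -> int:
    """Model xor rdx, rdx ; add rdx, seed ; shl rdx, shift."""
    rdx = 0
    rdx = (rdx + seed) & MASK64
    return (rdx << shift) & MASK64


def seed_shift_pairs(target: int) -> list[tuple[int, int]]:
    """List every (seed, shift) pair that builds target from a small seed."""
    return [(seed, shift)
            for seed in range(1, 128)
            for shift in range(1, 64)
            if add_then_shift(seed, shift) == target]


print(hex(add_then_shift(0x2, 0xf)))   # 0x10000
print(seed_shift_pairs(0x10000))
```

Any pair from the list is fine as long as neither the seed nor the shift count is 0, since both end up as immediate bytes in the shellcode.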

Method 2: Using your lower parts

Before you start thinking dirty: different parts of each register in x64 can be accessed as operands of their own.

These parts are mapped in the following way (shown here for rax):

  • rax: all 64 bits
  • eax: the lower 32 bits
  • ax: the lower 16 bits
  • al: the lower 8 bits

This allows us to use the al operand, for example, instead of the rax operand when we want to read or write the lower 8 bits of the rax register.

When we do so, the encoded instruction is much shorter, which also helps us avoid null bytes. Keep in mind that writing to al or ax leaves the remaining bits of the register untouched, which is why we zero the full register first.

Bad way:

48 c7 c0 02 00 00 00    mov    rax,0x2
48 c7 c3 ff 0f 00 00    mov    rbx,0xfff

Good way

48 31 db              xor    rbx,rbx
48 31 c0              xor    rax,rax
b0 02                 mov    al,0x2
66 bb ff 0f           mov    bx,0xfff
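
The xor instructions in the good way are not decoration. A write to an 8-bit or 16-bit part of a register keeps the upper bits as they were, so without zeroing first we could end up with garbage in rax or rbx. Here is a minimal Python model of that behaviour; write_low is a made-up helper and the stale value is arbitrary.

```python
MASK64 = (1 << 64) - 1


def write_low(reg: int, value: int, bits: int) -> int:
    """Model a write to the low 8 or 16 bits of a 64-bit register.
    The upper bits of the register are preserved."""
    if bits not in (8, 16):
        raise ValueError("only al-style (8) and ax-style (16) writes are modelled")
    low_mask = (1 << bits) - 1
    return (reg & ~low_mask & MASK64) | (value & low_mask)


stale = 0xdeadbeefcafef00d  # whatever was left in the register

print(hex(write_low(stale, 0x2, 8)))    # 0xdeadbeefcafef002 (mov al, 0x2 alone)
print(hex(write_low(0, 0x2, 8)))        # 0x2   (xor rax, rax ; mov al, 0x2)
print(hex(write_low(0, 0xfff, 16)))     # 0xfff (xor rbx, rbx ; mov bx, 0xfff)
```

Writes to the 32-bit part (eax) behave differently: they clear the upper 32 bits, which is why the function above only models the 8-bit and 16-bit cases.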

Field Test

After we learned these two new methods, let’s implement and modify the assembly code we saw at the beginning of the article


SYS_READ equ 0
SYS_WRITE equ 1
SYS_OPEN equ 2
SYS_EXIT equ 60
AMOUNT_TO_READ equ 16

global _start

section .text

_start:
  jmp get_file_path

continue:
  ; syscall to open the file
  xor rax, rax
  add al, SYS_OPEN
  pop rdi ; pop address of string to rdi
  xor rsi, rsi ; set O_RDONLY flag
  syscall

  ; syscall read file
  sub sp, 0xfff ; 0xfff instead of 0xff, whose 16-bit immediate (ff 00) contains a null byte
  lea rsi, [rsp]
  mov rdi, rax ; use the returned fd
  xor rdx, rdx
  add dl, AMOUNT_TO_READ ; amount to read
  xor rax, rax ; SYS_READ is 0
  syscall

  ; syscall write to stdout
  xor rdi, rdi
  add dil, 1 ; set fd to point to stdout
  mov rdx, rax ; write to stdout the amount of bytes read
  xor rax, rax
  add al, SYS_WRITE
  syscall

  ; rax holds the small byte count returned by write, so its upper bits are already 0
  mov al, SYS_EXIT
  syscall ; finish execution

; jump here in order to get the address of the string
get_file_path:
  call continue
  file_path: db "/tmp/my_file", 0

After we compile this code, we can run it and see that it works exactly the same as the previous code.

Let’s see if hexdump finds any null bytes…

Awesome!

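Note: Don’t be confused by the one null byte that hexdump found. That null byte is the terminator of the string in our shellcode, and it sits at the very end of the shellcode.
It doesn’t look like it is at the end because hexdump, by default, prints the bytes as 16-bit little-endian words, so the two bytes of each pair appear swapped. Running hexdump -C shows them in file order.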
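
To see why the trailing null seems to move, here is a small Python sketch that mimics the two-byte grouping of hexdump's default output on a little-endian machine. The helper name is hypothetical, and the zero padding for odd lengths is there only to keep the sketch simple. The sample is the tail of our file path string.

```python
import struct


def hexdump_words(data: bytes) -> str:
    """Format data as 16-bit little-endian words, the way the byte pairs
    appear in hexdump's default output on a little-endian machine."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input for this sketch
    words = struct.unpack(f"<{len(data) // 2}H", data)
    return " ".join(f"{word:04x}" for word in words)


tail = b"_file\x00"             # last bytes of "/tmp/my_file", 0

print(tail.hex(" "))            # 5f 66 69 6c 65 00
print(hexdump_words(tail))      # 665f 6c69 0065
```

In the second line the 00 is printed before the 65, even though it is the last byte of the data.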

Conclusions

Today we learned how to write shellcode that is free of null bytes. Along the way, we saw different ways to achieve the same result using different, and sometimes shorter (opcode-wise), instructions in x64.

Finally, we used this knowledge to transform shellcode that was riddled with null bytes into one that is ready to tackle any strcpy in its path.

I hope you all enjoyed this article and learned more about the x64 instruction set along the way. There are many more methods and techniques yet to learn, and I urge you all to keep learning what you don’t know and teach what you do know.

Spread the good word,

x24whoamix24

Sources

https://filippo.io/linux-syscall-table/

https://defuse.ca/online-x86-assembler.htm#disassembly


Good stuff! Playing around with ASM instructions like that to effectively eliminate NULL bytes while keeping the functionality teaches you to think outside the box.

Also finding different solutions for the same problem is the first step to write a polymorphic code generator :smiley:


Thanks rick :smiley:
Nothing more beautiful than code making kids of its own


Easy and informative! Well written :slight_smile:
