[C] Dynamic Memory (Part 6.0): A Closer Look at [My]Malloc

oaktree · October 13, 2016, 11:59pm

It’s been a while since the last edition (I feel like I say this every time). What you’re looking at is the sixth installment of my [C] Dynamic Memory Series.

I’ll start this paper by enumerating some prerequisites:

Proficiency in C
Understanding of the Stack vs Heap
Understanding of sbrk and brk.
Familiarity with this article’s predecessors (all of them).

As always, the above is merely a suggestion.

One last thing: What follows is more a writeup of my findings and experience than a tutorial.

Adapting the headers

Before I introduce the actual code, it is paramount that we take a look at the headers.

/*
    my-malloc.h
*/
#include <stddef.h>

#ifndef MY_MALLOC_HDR
#define MY_MALLOC_HDR

typedef enum _free_t {
    NOT_FREE, FREE
} _free_t;

typedef struct _chunk {
    size_t  _chunk_sz;

    struct _chunk *next,
                  *prev,
                  *next_free,
                  *prev_free;


    _free_t _free;

} _chunk;

typedef struct _mem_session {
    _chunk *_first_chunk,
           *_last_chunk,
           *_first_free_chunk,
           *_last_free_chunk;

    size_t _chunks_allocated;

} _mem_session;

/* ... */

#endif

I’m using an enum (Google if necessary) to denote whether or not a chunk is free or not.

The main thing that has changed in this code is the way free chunks are handled in memory. In addition to having a linked list for all of the chunks, there will also be a linked list exclusively for free chunks. Both of the above struct declarations have been updated in accordance with this new concept.

The logic of `my_malloc()`

Function prototype:

void* my_malloc(size_t bytes);

My implementation takes a single parameter, bytes, denoting the number of bytes the user has requested. This number bytes is rounded up to the nearest multiple of 4, which is the standard size of an int. At the top of the routine, bytes is modified to be a multiple of four.

The inaugural call to my_malloc does one extra step before proceeding to allocate memory for the user. It calls a special function called _create_session() that sets up a _mem_session. This _mem_session, which we will refer to as “session” from now on, has valuable, useful metadata about the heap. Included in the session data are the linked list pointers (all and free-only) as well as a size_t that keeps track of the number of chunks in the session. This session consumes 40 bytes of memory (four pointers and a size_t use that much on my machine). Not bad, eh?

Once the session is created (and it is only created during the first call to my_malloc(), the function tries to find some memory for the user. The first step in doing so is scanning the freed chunks linked list for any available chunks large enough to hold at least bytes bytes of memory.

If a free chunk is found, my_malloc() will prepare to return this memory to the user. First, however, my code will attempt to cut the chunk into two pieces following this logic: if there is enough space such that _chunk_sz is greater than the sum of bytes, the size of a chunk, and another 4, split the chunk in two.

There is always the possibility that no free chunk large enough to store bytes memory is found. If such is the case, my_malloc uses sbrk() to expand the program break by the sum of the size of a chunk and bytes.

Whether an existing chunk was reused or a new one was made, the end result is the same. my_malloc() returns a pointer offset. The user data begins at a sizeof(_chunk) offset from the pointer to the beginning of the chunk. This is what is returned.

Note: If something does fail during the my_malloc() call, (void*)NULL is returned.

Coding out `my_malloc()`

void *my_malloc(size_t bytes) {

    if (!_session && _create_session() < 0) {
        return NULL;
    }

    /* Round up to nearest multiple of 4. */
    size_t cnt = (((bytes-1)>>2)<<2) + 4;

    _chunk *c = _find_free_chunk( cnt );

    if ( !c ) {

        /*  we will extend the program break if we can't
            find a viable chunk at this time. */

        void *ptr;
        if ( (ptr = sbrk(sizeof(_chunk) + cnt)) == (void*)(-1) ) {
            fprintf(stderr, "Couldn't allocate memory!!\n");
            return NULL;
        }

        c = (_chunk*)ptr;
        c->_chunk_sz = cnt;
        c->_free = NOT_FREE;

        c->prev = _session->_last_chunk;
        c->next = NULL;

        c->prev_free = NULL;
        c->next_free = NULL;

        if ( !_session->_first_chunk )
            _session->_first_chunk = c;

        /*
            The current last chunk should have its
            `next` attr point to c before we make the
            last chunk in the session point to c.
        */
        if ( _session->_last_chunk )
            _session->_last_chunk->next = c;

        _session->_last_chunk = c;

        _session->_chunks_allocated++;

    } else {

        c->_free = NOT_FREE;

        /* patch free chunks */
        if (c->next_free && c->prev_free) {
            /* runs if c is in the middle of the freed chunks list */
            c->next_free->prev_free = c->prev_free;
            c->prev_free->next_free = c->next_free;

        } else if (c->next_free) {
            /* runs if c is the first of the freed chunks list */
            c->next_free->prev_free = NULL;

        } else if (c->prev_free) {
            /* runs if c is the last of the freed chunks list */
            c->prev_free->next_free = NULL;
        }

        if (_session->_first_free_chunk == c)
            _session->_first_free_chunk = c->next_free;

        if (_session->_last_free_chunk == c)
            _session->_last_free_chunk = c->prev_free;

        /* end patching of free chunks */

    /* -------------------------------------------- */

        /* Can we shrink the chunk?
           If so, let's create a chunk to handle the rest
           of the memory, so that we don't take more than
           we truly need. */
        if (c->_chunk_sz >= cnt + (sizeof(int) + sizeof(_chunk) )) {

            _chunk *d = (_chunk*)((char*)c + sizeof(_chunk) + c->_chunk_sz);

            size_t extra = c->_chunk_sz - cnt - sizeof(_chunk);
            c->_chunk_sz = cnt;

            d->_chunk_sz = extra;
            d->_free = FREE;

            if ( !_session->_first_free_chunk )
                _session->_first_free_chunk = d;

            if ( _session->_last_free_chunk )
                _session->_last_free_chunk->next_free = d;

            d->prev_free = _session->_last_free_chunk;
            _session->_last_free_chunk = d;
            d->next_free = NULL;

            d->next = c->next;
            c->next = d;
            d->prev = c;

            _session->_chunks_allocated++;
        }
    }

    /* return `c` offset by the size of a chunk,
       resulting in a pointer pointing to the
       beginning of user data space*/
    return (sizeof(_chunk) + (char*)c);
}

There are a few things to take note of:

_find_free_chunk( cnt ) is a subroutine I wrote out that… finds free chunks. It, as you would think, finds a suitable free chunk if there is one, or returns NULL.

What takes place in the if ( !c ) block is the extending of the program break for a new chunk. It initializes the chunk’s metadata appropriately and links the chunk into the session’s linked lists.

The else block runs when we do find a free chunk. The found chunk is flagged as NOT_FREE. The chunk is removed from the free-only linked list. Then, if possible, the chunk is split up into two smaller chunks to save excess memory for usage elsewhere. The splitting involves some metadata initialization (like setting the size of chunk d and adjusting the size of chunk c). In addition, some linked list logic is done to ensure that the session knows about the new chunk.

As mentioned above, the last thing to happen is the returning of the pointer to the user data.

Conclusion

After reading this article, read it again. Ensure that you understand the code. Post a question if you must.

Then, I invite you to think of the upsides and downsides of this implementation. For example: how can we ensure that the user doesn’t write into the next chunk? do we really need the doubly-linked list stuff to manage the chunks properly? is there another way? should we manage the chunks by size for quicker access? should we pay more attention to allocation speed, rather than space-efficiency? what happens when we run out of memory?

The solutions to some of the problems listed above may already be clear to you. If so, post a suggestion! I have a few ideas in mind, as well, but I’m eager to hear your thoughts.

_py · November 29, 2016, 4:13pm

Sweet stuff.

I was revising a lil’ about heaps so I read through your series once again but this time I used google as well. I discovered the so called fastbins and binlists which seemed interesting to me. Apparently, free()'s algo acts differently according to the size of the malloc’d chunk. There are both doubly and singly linked lists implemented. Looks like the the whole exploit scene comes into play thanks to free()'s implementation (in particular, thanks to *next_free, *prev_free in your _chunk struct definition). Out of curiosity, since there is a bromance between malloc() and free(), will you be covering any of free()'s magic in the future?

oaktree · November 29, 2016, 7:51pm

Yup. Free is next up.

mcus · November 23, 2017, 3:10pm

Great parts even after a year.

I have a question.
Do we need to make our on chunks structures and etc. for the heap ?
I mean aren’t there some function that get us the lists of chunks or something like this.

P.S. Sorry of my lack of knowledge, I am new to this.

oaktree · November 23, 2017, 5:41pm

@mcus,

This series is about implementing a malloc and free… We need the data structures so that we can organize and manipulate the memory of the free store.

Let’s say that we don’t use chunks, or any sort of other data structure arrangement, and we just use the brk or sbrk system calls. The reuse of previously-allocated memory would be near impossible to “safely” implement. It’d be up to the end programmer (the user of our malloc function family) to know what memory to overwrite. That makes life extremely hard.

The goal of the series was to build a malloc from the ground up. Maybe you’re confusing chunks with paging or segmentation?

mcus · November 24, 2017, 8:49am

I was asking this question because I have simple project where I need to get the all chunks of memory on the heap and iterate through them.
In this project I have implemented my malloc etc.

So lets say some allocated memory and on that space he write some cookie let say “0xdeadface” and that can be done a lot of times, so someone else somewhere in the program maybe overwrite that cookie. So I want to get all chunks and see where we have “0xdeadface” and I want to this on the heap.

So I had idea to get the range of the heap but I want to do this on iPhone.

Is there a way to do this? Is there a way to get the memory location that have already something?

oaktree · November 24, 2017, 1:40pm

There are many reversing tools that let you find certain values in memory. Is that what you meant? If this is the case though, why deal with malloc chunks at all?

mcus · November 24, 2017, 3:16pm

Just doing a little tweak on iPhone and for curiosity.

Thank you for reply.

oaktree · November 24, 2017, 7:12pm

I don’t think any of the knowledge in my series will help you with accessing the heap of other running processes on any platform.

oaktree · January 21, 2018, 12:30am

This topic was automatically closed after 30 days. New replies are no longer allowed.

[C] Dynamic Memory (Part 6.0): A Closer Look at [My]Malloc

Adapting the headers

The logic of my_malloc()

Coding out my_malloc()

Conclusion

The logic of `my_malloc()`

Coding out `my_malloc()`