How Are C Structures Laid Out in Memory?

n33ds0n · June 24, 2021, 7:00pm

I understand that an array is a data structure that stores a collection of data of the same type in contiguous memory, and I also understand that a structure is a user-defined data type that groups elements of the same or different kinds into a single entity. My question is: how is a structure handled in memory (structure layout in memory)?

jeff · June 24, 2021, 9:13pm

Just as they appear in your source code, with possible padding for alignment.

That is, if your code looks like this:

struct test{
    int a;
    char b;
    long c;
    char d;
};

Assuming the struct is stored at address X in memory and the code targets a 32-bit machine (meaning sizeof(int) = sizeof(long) = 4, at least on my machine; sizeof(char) is always 1):

a is stored at address X and takes 4 bytes. b is at address X + 4 and takes 1 byte, followed by 3 bytes of padding (sizeof(int) - sizeof(char)) so that c starts on a 4-byte boundary. c is stored at address X + 8 and d at address X + 12. After d there are 3 more bytes of trailing padding, so sizeof(struct test) = 16.

There could be cases where char variables are stored at any byte of the word they’re located in, in which case the offsets and padding would differ slightly, but this topic is already discussed to death in the link above. Cheers.

n33ds0n · June 24, 2021, 11:20pm

Thanks @jeff, I really appreciate it. I have been wondering how; thank you so much ♥️

c0z · June 26, 2021, 6:55am

A continuation on what @jeff was saying.

Before we go into this, it’s worth mentioning the resources already on 0x00sec about memory alignment. Memory alignment means placing data at addresses that are a multiple of its natural alignment (typically its size), inserting padding bytes where needed, so that the CPU can fetch it efficiently. From Wikipedia:

Data alignment is the aligning of elements according to their natural alignment. To ensure natural alignment, it may be necessary to insert some padding between structure elements or after the last element of a structure. For example, on a 32-bit machine, a data structure containing a 16-bit value followed by a 32-bit value could have 16 bits of padding between the 16-bit value and the 32-bit value to align the 32-bit value on a 32-bit boundary. Alternatively, one can pack the structure, omitting the padding, which may lead to slower access, but uses three quarters as much memory.

On a 64 bit kernel but with an application compiled as 32 bit, we see the usual sizes:

#include <stdio.h>

struct __struct_a {
    float x;
    float y;
    char* p;
    int size;
};

int main(void) {
    struct __struct_a a;
    /* sizeof yields a size_t, so the matching format specifier is %zu */
    printf("int: %zu\nfloat: %zu\nchar*: %zu\nstruct_a: %zu\n",sizeof(int),sizeof(float),
            sizeof(char*),sizeof(a));
    return 0;
}

This outputs:

$ ./sizes                            
int: 4
float: 4
char*: 4
struct_a: 16

We can see that the byte sizes of the members add up to the size of the struct: float (4), float (4), char* (4), and int (4), which equals 16 bytes. Now let’s look at a modification:

struct __struct_a {
    float x;
    float y;
    char* p;
    int size;
    char a;
};

This is the output:

int: 4
float: 4
char*: 4
struct_a: 20
char: 1

Wait, what? Adding a 1-byte member at the end of the struct makes the compiler pad it by three bytes. Why? The most strictly aligned members of the struct need 4-byte alignment in this 32-bit binary, so the compiler rounds the struct's size up to a multiple of 4; that way every element in an array of these structs stays aligned. Let’s move the 1-byte char around to see whether the size of our struct decreases.

struct __struct_a {
    float x;
    float y;
    char a;
    int size;
    char* p;
};

The output:

int: 4
float: 4
char*: 4
char: 1
struct_a: 20

Well, that’s kind of expected: the char now has three bytes of padding after it so that the following int stays aligned. But what about saving space? How can we be efficient? We can use __attribute__((packed)) to tell the compiler that we don’t want the members naturally aligned. Let’s check it out.

struct __attribute__((packed)) __struct_a {
    float x;
    float y;
    char a;
    int size;
    char* p;
};

The output:

int: 4
float: 4
char*: 4
char: 1
struct_a: 17

Hmm, okay, so now the struct is the expected 17 bytes. But why would we want packed structs, and are they good or bad? They’re bad in terms of performance and recovering from potential errors. Those 17 bytes throw off the CPU because it expects 4-byte-aligned data. When the CPU gets a misaligned piece of memory, it may need a second memory access to get the rest of the data, since the data might straddle a word, cache line, or page boundary; for example, it might have fetched 16 bytes and now needs to fetch the next 4 bytes to get that one char. This is mentioned here

This is a look at the packed version of the structure of the program.

Now with the unpacked version of the structure:

We see the 20 bytes (0x14) of space that gets pushed onto the stack versus the 17 (0x11), but besides that there’s virtually no difference between the two programs. So it ultimately comes down to performance, which is why GCC’s __attribute__((packed)) is reserved for specific use cases like drivers.

messede · June 26, 2021, 7:54am

This is exactly what this thread needed, thank you.

n33ds0n · June 26, 2021, 5:11pm

Thank you @c0z for the addition. I really love your explanations, guys.
