Tuesday, February 21, 2012

C Language: Struct packing and member alignment.

I know this has been discussed and re-discussed in so many places, so many times, that my writing about it again won't make much of a difference, but I still see so many people confused about it that I can't stop myself from writing about it.

By default, C compilers align each member of a struct on its natural boundary. On typical platforms this means that a 2-byte member such as a short is aligned on a 2-byte boundary, a 4-byte member such as an int is aligned on a 4-byte boundary, and so on. This is done so that the struct members can be accessed efficiently: a misaligned access can be slower, can straddle two cache lines, and on some architectures is not allowed at all. To ensure proper alignment, compilers add padding bytes to structs. Consider for example the following struct:
struct SomeData
{
    char Data1;
    short Data2;
    int Data3;
    char Data4;
};

Now as you can see, we are only using 8 bytes for data (1 + 2 + 4 + 1), but the compiler will add padding bytes to this struct to ensure proper alignment of its members. So on a typical 32-bit machine the compiler lays out a variable of this struct type as if it had been written like this:
struct SomeData
{
    /* 1 byte */
    char Data1; 
    /* 1 byte so the following 'short' can be aligned on a 2 byte boundary*/
    char Padding1[1]; 
    /* 2 bytes */
    short Data2;
    /* 4 bytes - largest struct member */
    int Data3;
    /* 1 byte */
    char Data4;
    /* 3 bytes to make total size of the struct 12 bytes */
    char Padding2[3];
};

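The compiled size of the structure is now 12 bytes. It is important to note that padding is added after the last member so that the total size of the struct is a multiple of the struct's alignment, which is the strictest alignment required by any of its members. For structs made only of basic types this usually equals the size of the largest member. In this case 3 bytes are added after Data4 to pad the struct to a size of 12 bytes (4 bytes of int × 3). This keeps every element properly aligned when such structs are placed in an array.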

As you can see, we are wasting memory. We can of course stop the compiler from doing this by asking it to pack structs tightly (for example with #pragma pack(1), or with __attribute__((packed)) in gcc), but then we lose the performance benefits of proper alignment.

In order to avoid these padding bytes but still have proper alignment, we can rearrange this struct so that larger members are listed before the smaller ones. So the above struct can be rearranged as:
struct SomeData
{
    int Data3;
    short Data2;
    char Data1;
    char Data4;
};
Now this does not require any padding, as every element is already properly aligned and the overall size of the struct is a multiple of the size of the largest member, i.e. 4 × 2 = 8. This is of course just an example, and in reality, even if you arrange your struct members this way, there may still be some padding at the end of the struct to make its total size a multiple of the strictest alignment among its members.
So whenever the order is yours to choose, please arrange the members of your structs in decreasing order of their size.
