Bit Packing
Bit packing is a space-optimization technique that stores multiple data values (like integers or booleans) into a single byte or word by using only the exact number of bits needed, rather than standard data type size.
I think Kevin also told me how he used bit packing at work.
But bit packing can be somewhat inefficient, no? The CPU reads whole words at a time, and the extra bit shifts and masks to extract fields have a cost.
I was going through this mental exercise. Consider the following struct (unpacked):
```cpp
struct Data {
    bool a;
    bool b;
    bool c;
};
```

vs. (packed)

```cpp
struct Data {
    uint8_t data; // a, b, c
};
```

Now, reason about reads and writes and performance.
For reads:
```cpp
Data d; // unpacked
cout << d.a << d.b << d.c << endl;
```

```cpp
Data d; // packed
cout << ((d.data >> 2) & 1)
     << ((d.data >> 1) & 1)
     << (d.data & 1) << endl;
```

(The parentheses are needed: `<<` binds tighter than `&`.)
- Unpacked requires 3 separate reads (`d.a`, `d.b`, `d.c`)
- Packed reads memory only once (`d.data`)

However, those 3 separate reads are generally negligible: all three fields of the unpacked struct sit on the same cache line, so only the first read actually touches memory.
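A minimal compilable sketch of the packed reads (the bit layout, `a` in bit 2, `b` in bit 1, `c` in bit 0, is just the convention from the example above, and the helper names are mine):

```cpp
#include <cstdint>

struct Packed {
    uint8_t data; // bit 2 = a, bit 1 = b, bit 0 = c
};

// Extract one flag each; parentheses matter because << binds tighter than &.
inline bool get_a(Packed d) { return (d.data >> 2) & 1; }
inline bool get_b(Packed d) { return (d.data >> 1) & 1; }
inline bool get_c(Packed d) { return d.data & 1; }
```

For example, `Packed{0b101}` reads back as `a = 1`, `b = 0`, `c = 1`.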
For writes:
```cpp
Data d; // unpacked
d.a = a;
d.b = b;
d.c = c;
```

```cpp
Data d; // packed
d.data = (a << 2) | (b << 1) | c;
```

- Same idea
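The packed write and read can be paired into a round trip; a sketch with hypothetical helper names, using the same bit layout as above:

```cpp
#include <cstdint>

// Pack three flags into one byte: a in bit 2, b in bit 1, c in bit 0.
inline uint8_t pack3(bool a, bool b, bool c) {
    return (uint8_t(a) << 2) | (uint8_t(b) << 1) | uint8_t(c);
}

// Unpack a single bit back out.
inline bool unpack_bit(uint8_t data, int bit) {
    return (data >> bit) & 1;
}
```

So `pack3(true, false, true)` yields `0b101`, and `unpack_bit` recovers each flag.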
The bigger advantage is when we have many of these. We can drastically reduce the memory bandwidth, and the bit shifts we use to extract the data are pretty cheap.
So when is bit packing bad?
When the overhead of the bit manipulation grows larger than the memory-bandwidth savings, for example if you do lots of random writes or frequent updates: each update to a packed field is a read-modify-write of the whole byte or word, rather than a plain store.
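That read-modify-write is visible in code: updating one packed field means reading the old byte, clearing the target bit, OR-ing in the new value, and writing the byte back. A sketch (helper name is mine):

```cpp
#include <cstdint>

// Read-modify-write: clear the target bit, then OR in the new value.
// Compare with the unpacked case, where `d.a = v;` is a single plain store.
inline uint8_t set_bit(uint8_t data, int bit, bool v) {
    data &= ~(uint8_t(1) << bit); // clear the old bit
    data |= uint8_t(v) << bit;    // write the new bit
    return data;
}
```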
It also makes SIMD harder.