Bitmap Indexed Storage

Always been facinated by the ways - data structure’s can enable better storage and faster queries.

BitMapped Indexed Storage - is useful while using Columnar Databases - where per column data is stored together.

Often the number of distinct values in a column is small compared to the total number of rows. In the below example: column sex there are only two distinct value Female or Male in the given table - the cardinality is really low.

In such cases: We can now take a column with n distinct values and turn it into n separate bitmaps: one bitmap for each distinct value, with one bit for each row. The bit is 1 if the row has that value, and 0 if not.

Picture is worth a thousand words.. here we you go..

alt Components

Fantastic Query performance… As the amount of data read from disk on to the memory is less - improved Disk Throughput. As always, solution are for a specific problem. If you wanna do a range query, then this solution might not the best.

If n is very small, those bitmaps can be stored with one bit per row. But if n is bigger, there will be a lot of zeros in most of the bitmaps (sparse). In that case, the bitmaps can additionally be run-length encoded or might not be recommened.

Reference: Designing Data-Intensive Applications, by Martin Kleppmann

Jacob Aloysious
Jacob Aloysious
Software Enthusiast

35yr old coder, father and spouse - my interests include Software Architecture, CI/CD, TDD, Clean Code.

Related