From Terje Marthinussen <>
Subject column bloat
Date Tue, 10 May 2011 13:44:57 GMT

If you make a supercolumn today, what you end up with is:
- short  + "Super Column name"
- int (local deletion time)
- long (delete time)
Byte array of columns, each with:
  - short + "column name"
  - int (TTL)
  - int (local deletion time)
  - long (timestamp)
  - int + "value of column"

That is, metadata and serialization overhead adds up to:
2+4+8 = 14 bytes for the supercolumn
2+4+4+8+4 = 22 bytes for each column the supercolumn has
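
To make the arithmetic concrete, here is a small Python sketch that adds up
the fixed field sizes listed above. The constant and function names are mine,
not Cassandra's, and the sizes are simply the ones from the list, so treat it
as a back-of-the-envelope check rather than a description of the actual
serializer.

```python
# Fixed field sizes in bytes, taken from the layout listed above.
# All names here are illustrative; they are not Cassandra identifiers.
SHORT, INT, LONG = 2, 4, 8

# name length + local deletion time + delete time
SUPERCOLUMN_OVERHEAD = SHORT + INT + LONG
# name length + TTL + local deletion time + timestamp + value length
COLUMN_OVERHEAD = SHORT + INT + INT + LONG + INT


def metadata_bytes(subcolumns: int) -> int:
    """Metadata bytes for one supercolumn, excluding names and values."""
    return SUPERCOLUMN_OVERHEAD + subcolumns * COLUMN_OVERHEAD


if __name__ == "__main__":
    for n in (30, 50):
        print(n, "subcolumns:", metadata_bytes(n), "bytes of metadata")
```

Multiplying the result by the number of supercolumns gives the raw metadata
volume before replication.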

Yes, disk space is cheap and all that, but trying to handle a few billion
supercolumns which each have some 30-50 subcolumns, I am looking at some
1.2-1.5TB of metadata, which makes the metadata by itself some 3-4 times the
original data. That does seem a bit excessive when you also throw in RF=3 and
the requirement for extra disk space to safely survive compactions.

And yes, this is without considering the overhead of column names.

I can see a handful of ways to reduce this quite a bit, for instance by:
- not adding TTL/deletion time if not needed (some compact bitmap structure
to turn on/off fields?)
- inheriting timestamps from the supercolumn
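
A rough sketch of what the two ideas above could look like, in Python. This
is a hypothetical layout, not an existing Cassandra format: the flag values
and the function names encode_column/decode_column are made up for
illustration. One flags byte says which optional fields follow, and a column
without its own timestamp takes the supercolumn's.

```python
import struct

# Hypothetical flag bits; this is not an existing Cassandra format.
HAS_TTL = 0x01
HAS_LOCAL_DELETION = 0x02
HAS_TIMESTAMP = 0x04  # when clear, the timestamp is inherited


def encode_column(name, value, ttl=None, local_deletion=None, timestamp=None):
    """Serialize a column, writing optional fields only when they are set."""
    flags = 0
    optional = b""
    if ttl is not None:
        flags |= HAS_TTL
        optional += struct.pack(">i", ttl)
    if local_deletion is not None:
        flags |= HAS_LOCAL_DELETION
        optional += struct.pack(">i", local_deletion)
    if timestamp is not None:
        flags |= HAS_TIMESTAMP
        optional += struct.pack(">q", timestamp)
    return (
        struct.pack(">H", len(name)) + name
        + struct.pack(">B", flags) + optional
        + struct.pack(">i", len(value)) + value
    )


def decode_column(buf, inherited_timestamp):
    """Parse one column; absent timestamps fall back to the supercolumn's."""
    (name_len,) = struct.unpack_from(">H", buf, 0)
    offset = 2
    name = buf[offset:offset + name_len]
    offset += name_len
    flags = buf[offset]
    offset += 1
    ttl = local_deletion = None
    timestamp = inherited_timestamp
    if flags & HAS_TTL:
        (ttl,) = struct.unpack_from(">i", buf, offset)
        offset += 4
    if flags & HAS_LOCAL_DELETION:
        (local_deletion,) = struct.unpack_from(">i", buf, offset)
        offset += 4
    if flags & HAS_TIMESTAMP:
        (timestamp,) = struct.unpack_from(">q", buf, offset)
        offset += 8
    (value_len,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    value = buf[offset:offset + value_len]
    return {"name": name, "value": value, "ttl": ttl,
            "local_deletion": local_deletion, "timestamp": timestamp}
```

With this layout a plain column carries 2+1+4 = 7 bytes of fixed overhead
instead of 22, at the cost of one extra byte for columns that use every
field.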

There may also be some interesting ways to compress this data, assuming that
the timestamps generally fall in the same time range (shared "prefixes",
for instance), but that gets a bit more complex.
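
One way to exploit timestamps that sit close together, sketched in Python.
Note this is delta encoding against a base value with variable-length
integers, a related technique rather than literal shared prefixes, and all
function names are my own. The base would be stored once per supercolumn;
each column then stores only its zigzag-encoded difference from it.

```python
def zigzag(n):
    """Map a signed 64-bit delta to an unsigned int (small magnitude stays small)."""
    return (n << 1) ^ (n >> 63)


def unzigzag(z):
    """Inverse of zigzag()."""
    return (z >> 1) ^ -(z & 1)


def encode_varint(n):
    """Encode a non-negative int, 7 bits per byte, high bit = more bytes follow."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_timestamps(base, timestamps):
    """Store each timestamp as a varint delta from the shared base."""
    return b"".join(encode_varint(zigzag(ts - base)) for ts in timestamps)


def decode_timestamps(base, buf):
    """Recover the original timestamps from the packed deltas."""
    result, value, shift = [], 0, 0
    for byte in buf:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            result.append(base + unzigzag(value))
            value, shift = 0, 0
    return result
```

Timestamps within a second of the base (in microseconds) fit in three bytes
or fewer here, against a fixed eight bytes each; widely scattered timestamps
gain less, and a delta far from the base can take more than eight bytes.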

Any opinions or plans?
Sorry, I could not find any JIRA issues on the topic, but I would not be
surprised if one exists.

Yes, I could serialize this myself outside of Cassandra, but that would sort
of defeat the purpose of using a more advanced storage system like


