You will find that many of us using Cassandra are already doing what you suggest (a custom serializer/deserializer).

We call it JSON.


I strongly suspect that you're optimizing prematurely.  What evidence
do you have that timestamps are producing unacceptable overhead for
your workload?  

It's possible … this is back-of-the-envelope at the moment, since right now it's a nonstarter.

You do realize that the sparse data model means that
we spend a lot more than 8 bytes storing column names in-line with
each column too, right?

Yeah… this can be mitigated if the column names are your data.
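
To put rough numbers on the column-name point, here is a minimal back-of-the-envelope sketch. It only counts the three things mentioned in this thread (the inline column name, the 8-byte timestamp, and the value) and ignores any other per-column bookkeeping, so treat the result as a lower bound. The function name `column_overhead` and the example column name are made up for illustration.

```python
TIMESTAMP_BYTES = 8  # per-column timestamp size discussed in this thread


def column_overhead(name, value_len):
    """Return (name_bytes, timestamp_bytes, total_bytes) for one column.

    Assumption: only the inline column name, the timestamp and the value
    are counted; other per-column bookkeeping is deliberately ignored.
    """
    name_bytes = len(name.encode("utf-8"))
    total = name_bytes + TIMESTAMP_BYTES + value_len
    return name_bytes, TIMESTAMP_BYTES, total


# Hypothetical column: a descriptive name holding an 8-byte value.
name_bytes, ts_bytes, total = column_overhead("last_login_time", 8)
print(f"name={name_bytes}B timestamp={ts_bytes}B total={total}B")
```

With a descriptive name like that, the name costs more than the timestamp does, which is the point being made above. If the column name is itself the data, that part of the cost goes away.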

If disk space is really the limiting factor for your workload, I would
recommend testing the compression code in trunk.  That will get you a
lot farther than adding extra options for a very niche scenario.

Another thing I've been considering is building a serializer/deserializer in front of Cassandra and running my own protocol to talk to it, one that builds its own encoding per row to avoid using excessive columns.
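
Roughly what I have in mind, as a sketch only: pack all the fields of a row into one blob and store that as a single column value, so the per-column cost is paid once per row instead of once per field. This uses JSON for the encoding (as suggested earlier in the thread) and does not talk to Cassandra at all; `pack_row` and `unpack_row` are hypothetical helper names, not part of any client library.

```python
import json


def pack_row(fields):
    """Encode a dict of field name -> value as one compact byte string."""
    # Compact separators and sorted keys keep the blob small and stable.
    text = json.dumps(fields, separators=(",", ":"), sort_keys=True)
    return text.encode("utf-8")


def unpack_row(blob):
    """Decode a blob produced by pack_row back into a dict."""
    return json.loads(blob.decode("utf-8"))


row = {"user": "alice", "age": 30, "tags": ["a", "b"]}
blob = pack_row(row)           # would be stored as a single column value
assert unpack_row(blob) == row
```

The obvious trade-off is that the whole blob has to be read and rewritten to change one field, and it carries a single timestamp for the entire row rather than one per field.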




