On Sat, Sep 3, 2011 at 8:53 PM, Jonathan Ellis <email@example.com> wrote:

> I strongly suspect that you're optimizing prematurely. What evidence
> do you have that timestamps are producing unacceptable overhead for
> your workload?

It's possible … these are back-of-the-envelope numbers at the moment, since as it stands it's a nonstarter.

> You do realize that the sparse data model means that
> we spend a lot more than 8 bytes storing column names in-line with
> each column too, right?

Yeah … that can be mitigated if the column names are your data.
> If disk space is really the limiting factor for your workload, I would
> recommend testing the compression code in trunk. That will get you a
> lot farther than adding extra options for a very niche scenario.
Another thing I've been considering is building a serializer/deserializer in front of Cassandra and running my own protocol to talk to it, one that builds its own encoding per row to avoid using excessive columns.

Kevin
--
Location: San Francisco, CA
Skype-in: (415) 871-0687
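P.S. For what it's worth, here is a minimal sketch of the per-row encoding idea above. The field names, types, and layout are hypothetical (not from this thread): each logical row is packed into one fixed-width blob and stored as a single column value, so Cassandra keeps one column name and one timestamp per row instead of one per field.

```python
import struct

# Hypothetical field layout for one logical row; names and types are
# illustrative only.
FIELDS = [("user_id", "Q"), ("score", "d"), ("visits", "I")]
FORMAT = ">" + "".join(fmt for _, fmt in FIELDS)  # big-endian, fixed-width

def pack_row(row):
    """Serialize a dict of fields into one blob, to be stored as a
    single column value instead of one column per field."""
    return struct.pack(FORMAT, *(row[name] for name, _ in FIELDS))

def unpack_row(blob):
    """Inverse of pack_row: recover the field dict from the blob."""
    values = struct.unpack(FORMAT, blob)
    return dict(zip((name for name, _ in FIELDS), values))

# Round trip: 8 + 8 + 4 = 20 bytes of payload, with no per-field
# column-name or timestamp overhead inside the blob.
blob = pack_row({"user_id": 42, "score": 0.5, "visits": 7})
assert len(blob) == struct.calcsize(FORMAT)
assert unpack_row(blob) == {"user_id": 42, "score": 0.5, "visits": 7}
```

The trade-off, of course, is losing per-field reads, per-field timestamps, and conflict resolution at the field level — the whole blob becomes the unit of update.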