incubator-cassandra-user mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject Re: Not all data structures need timestamps (and don't require wasted memory).
Date Sun, 04 Sep 2011 05:11:27 GMT
On Sat, Sep 3, 2011 at 8:53 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> I strongly suspect that you're optimizing prematurely.  What evidence
> do you have that timestamps are producing unacceptable overhead for
> your workload?


It's possible … this is only a back-of-the-envelope estimate at the moment,
since right now it's a nonstarter.
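
For illustration, here is a rough sketch of what such a back-of-the-envelope estimate might look like. The per-column layout is an assumption made only for this estimate (2-byte name length, name, 1-byte flags, 8-byte timestamp, 4-byte value length, value), and the helper names are hypothetical. Only the 8-byte timestamp figure comes from this thread.

```python
# Assumed per-column layout, for estimation only (not measured):
#   2-byte name length + name + 1-byte flags
#   + 8-byte timestamp + 4-byte value length + value
TIMESTAMP_BYTES = 8

def column_bytes(name_len, value_len, timestamp_bytes=TIMESTAMP_BYTES):
    """Estimated serialized size of one column under the assumed layout."""
    return 2 + name_len + 1 + timestamp_bytes + 4 + value_len

def timestamp_share(name_len, value_len):
    """Fraction of the estimated column size taken up by the timestamp."""
    return TIMESTAMP_BYTES / column_bytes(name_len, value_len)

if __name__ == "__main__":
    # Small columns: 8-byte name, 8-byte value.
    print(column_bytes(8, 8), round(timestamp_share(8, 8), 3))
    # Larger values dilute the timestamp's share.
    print(column_bytes(8, 1024), round(timestamp_share(8, 1024), 3))
```

Under these assumptions the timestamp matters most when names and values are both tiny, and becomes negligible as values grow. Real numbers would have to come from measuring an actual workload.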


> You do realize that the sparse data model means that
> we spend a lot more than 8 bytes storing column names in-line with
> each column too, right?
>

Yeah… this can be mitigated if the column names are your data.


>
> If disk space is really the limiting factor for your workload, I would
> recommend testing the compression code in trunk.  That will get you a
> lot farther than adding extra options for a very niche scenario.
>
>
Another thing I've been considering is building a serializer/deserializer in
front of Cassandra and running my own protocol to talk to it, one that builds
its own encoding per row to avoid using excessive columns.
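
A minimal sketch of that idea, assuming a fixed per-row schema that is packed into a single binary value on the client side, so one column holds the whole row. The schema, field names, and helper functions here are hypothetical, and the code does not talk to Cassandra at all. It only shows the encoding step.

```python
import struct

# Hypothetical fixed schema for one row: (field name, struct format code).
# q = 8-byte signed int, i = 4-byte signed int, ? = 1-byte bool.
SCHEMA = [("user_id", "q"), ("score", "i"), ("active", "?")]

# ">" selects big-endian with no padding, so the size is the sum of the fields.
ROW_FORMAT = ">" + "".join(code for _, code in SCHEMA)

def encode_row(row):
    """Pack a dict of field values into one bytes blob, in schema order."""
    return struct.pack(ROW_FORMAT, *(row[name] for name, _ in SCHEMA))

def decode_row(blob):
    """Unpack a blob produced by encode_row back into a dict."""
    values = struct.unpack(ROW_FORMAT, blob)
    return {name: value for (name, _), value in zip(SCHEMA, values)}

if __name__ == "__main__":
    row = {"user_id": 42, "score": 7, "active": True}
    blob = encode_row(row)
    print(len(blob), decode_row(blob) == row)
```

The blob would then be stored as the value of a single column, which pays the per-column name and timestamp cost once per row instead of once per field. The tradeoff is that individual fields can no longer be read or updated on their own.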

Kevin

-- 

Founder/CEO Spinn3r.com

Location: *San Francisco, CA*
Skype: *burtonator*

Skype-in: *(415) 871-0687*
