incubator-cassandra-user mailing list archives

From Colin <colpcl...@gmail.com>
Subject Re: Not all data structures need timestamps (and don't require wasted memory).
Date Sun, 04 Sep 2011 15:22:15 GMT
Kevin,

You will find that many of us using Cassandra are already doing what you suggest (custom serializer/deserializer).

We call it JSON.
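
A minimal sketch of that idea, not taken from anyone's production code: pack all of a row's fields into one JSON blob and store it as a single column value, so the column name and timestamp are paid once per row rather than once per field. The function names `pack_row` and `unpack_row` and the sample fields are hypothetical, and the actual write to Cassandra is left out.

```python
import json


def pack_row(fields):
    """Serialize a dict of fields into one compact UTF-8 JSON byte string.

    Hypothetical helper: the result would be stored as the value of a
    single column instead of one column per field.
    """
    # Compact separators drop the default spaces; sort_keys makes the
    # encoding deterministic for identical input.
    return json.dumps(fields, separators=(",", ":"), sort_keys=True).encode("utf-8")


def unpack_row(blob):
    """Inverse of pack_row: decode the stored bytes back into a dict."""
    return json.loads(blob.decode("utf-8"))


# Illustrative data only.
row = {"title": "hello", "lang": "en", "score": 3}
blob = pack_row(row)
restored = unpack_row(blob)
```

The trade-off is that individual fields can no longer be updated or reconciled independently, since the whole blob is written and timestamped as one value.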

--
Colin

*Sent from Star Trek like flat panel device, which although larger than my Star Trek like
communicator device, may have typo's and exhibit improper grammar due to haste and less than
perfect use of the virtual keyboard*
 

On Sep 4, 2011, at 12:11 AM, Kevin Burton <burton@spinn3r.com> wrote:

> 
> 
> On Sat, Sep 3, 2011 at 8:53 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> I strongly suspect that you're optimizing prematurely.  What evidence
> do you have that timestamps are producing unacceptable overhead for
> your workload?  
> 
> It's possible … this is back of the envelope at the moment as right now it's a nonstarter.
 
>  
> You do realize that the sparse data model means that
> we spend a lot more than 8 bytes storing column names in-line with
> each column too, right?
> 
> Yeah… this can be mitigated if the column names are your data.
>  
> 
> If disk space is really the limiting factor for your workload, I would
> recommend testing the compression code in trunk.  That will get you a
> lot farther than adding extra options for a very niche scenario.
> 
> 
> Another thing I've been considering is building a serializer/deserializer in front of
> Cassandra and running my own protocol to talk to it which builds its own encoding per row
> to avoid using excessive columns.
> 
> Kevin 
> 
> -- 
> Founder/CEO Spinn3r.com
> 
> Location: San Francisco, CA
> Skype: burtonator
> Skype-in: (415) 871-0687
> 
