cassandra-user mailing list archives

From Terje Marthinussen <tmarthinus...@gmail.com>
Subject Re: Using 5-6 bytes for cassandra timestamps vs 8…
Date Mon, 29 Aug 2011 08:51:10 GMT
I have a patch for trunk that I just need to find time to test a bit before I submit it.

It is for super columns: it uses the super column's timestamp as the base and stores only
varint-encoded offsets in the underlying columns.

If the timestamp equals that of the SC, it will store nothing (just set a bit in the serialization
flag).
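
As a rough illustration of the idea (a sketch only, not the patch itself): the Python below
stores a column timestamp as a one-byte flag plus a zigzag varint offset from the super
column's timestamp. The flag value SAME_AS_BASE, the zigzag step for negative offsets, and
all function names are assumptions made for this example and do not reflect Cassandra's
actual serialization format.

```python
SAME_AS_BASE = 0x01  # hypothetical flag bit, not Cassandra's real serialization mask


def zigzag(n):
    """Map a signed offset to an unsigned int so small magnitudes stay small."""
    return n * 2 if n >= 0 else -n * 2 - 1


def unzigzag(u):
    """Inverse of zigzag()."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def encode_varint(u):
    """Encode an unsigned int, 7 bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode an unsigned varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_timestamp(base, ts):
    """Return flag byte only if ts equals the base, else flag byte plus varint offset."""
    if ts == base:
        return bytes([SAME_AS_BASE])
    return bytes([0]) + encode_varint(zigzag(ts - base))


def decode_timestamp(base, data):
    """Rebuild the full timestamp from the base and the encoded bytes."""
    if data[0] & SAME_AS_BASE:
        return base
    offset, _ = decode_varint(data, 1)
    return base + unzigzag(offset)
```

With this layout a column whose timestamp matches the super column costs one flag byte, and
a column a few hundred ticks away costs two or three bytes instead of eight. In a real
format the flag would presumably share an existing per-column flag byte rather than add one.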

This could be extended further to sstables or rows, so that there is one base time per sstable
or row and each column stores only a varint-encoded offset from it.

Terje

On Aug 29, 2011, at 3:58 PM, Kevin Burton wrote:

> I keep thinking about the usage of cassandra timestamps and feel that for a lot of applications swallowing a 2-4x additional memory cost might be a nonstarter.
> 
> Has there been any discussion of using alternative date encodings?
> 
> Maybe 1ms resolution is too high... perhaps 10ms resolution? Or even 100ms resolution?
> 
> Using 4 bytes and 100ms resolution you can fit in 13 years of timestamps if you use the time you deploy the cassandra DB (aka 'now') as epoch.
> 
> Even 5 bytes at 1ms resolution is 34 years.  
> 
> That's 37% less memory!  
> 
> In most of our applications, we would NEVER see concurrent writers on the same key because we partition the jobs so that this doesn't happen.
> 
> I'd probably be fine with 100ms resolution.
> 
> Allowing the user to tune this would be interesting as well.
> 
> -- 
> Founder/CEO Spinn3r.com
> 
> Location: San Francisco, CA
> Skype: burtonator
> Skype-in: (415) 871-0687
> 
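The ranges quoted above can be checked with a few lines of Python. This is only a sanity
check of the arithmetic, assuming unsigned counters that start at a custom epoch and a
365.25-day year; the function name years_covered is made up for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average Julian year


def years_covered(num_bytes, resolution_ms):
    """Years an unsigned counter of num_bytes can span at the given tick size."""
    ticks = 2 ** (8 * num_bytes)
    return ticks * resolution_ms / 1000 / SECONDS_PER_YEAR


four_byte_100ms = years_covered(4, 100)  # a bit over 13 years
five_byte_1ms = years_covered(5, 1)      # a bit under 35 years
savings = 1 - 5 / 8                      # fraction saved by 5 bytes instead of 8

print(round(four_byte_100ms, 1), round(five_byte_1ms, 1), savings)
```

So 5 bytes instead of 8 is a 37.5% saving on the timestamp field itself; the effect on total
memory per column depends on the name, value, and object overhead around it.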

