cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: memory overhead of vector clocks vs timestamps and running *without* either to save memory?
Date Sat, 20 Aug 2011 02:38:47 GMT
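The problem with naive last-write-wins is that writes don't always
arrive at each replica in the same order.  Without a timestamp or some
other version on each value, replicas have no way to agree on which
write was last.  So no, that's a non-starter.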
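
To make the ordering problem concrete, here is a minimal sketch (not
Cassandra code). The function names and the (timestamp, value) tuple
layout are assumptions made up for illustration. Two replicas receive
the same two writes in opposite orders; keeping whatever arrived last
leaves them disagreeing, while comparing timestamps lets them converge.

```python
# Illustrative only: hypothetical helpers, not any real Cassandra API.

def apply_naive(writes):
    """Keep whichever write arrived last, ignoring any timestamp."""
    value = None
    for _timestamp, v in writes:
        value = v
    return value


def apply_timestamped(writes):
    """Keep the write with the highest timestamp, whatever the arrival order."""
    current = None
    for timestamp, v in writes:
        # Ties on timestamp are broken by value so the result is deterministic.
        if current is None or (timestamp, v) > current:
            current = (timestamp, v)
    return current[1]


# The same two writes, delivered in opposite orders to two replicas.
writes = [(1, "a"), (2, "b")]
replica_1 = writes
replica_2 = list(reversed(writes))

naive_results = (apply_naive(replica_1), apply_naive(replica_2))
timestamped_results = (apply_timestamped(replica_1), apply_timestamped(replica_2))
```

With the naive rule the two replicas end up holding different values;
with the timestamp comparison both settle on the same one.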

Vector clocks are a series of (client id, clock) entries, and usually
a timestamp so you can prune old entries.  Obviously implementations
can vary, but to pick a specific example, Voldemort [1] uses 2 bytes
per client id, a variable number (at least one) of bytes for the
clock, and 8 bytes for the timestamp.

[1] https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/versioning/VectorClock.java
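
As a rough back-of-the-envelope sketch of what those numbers imply, the
snippet below estimates the size of one vector clock from the figures
quoted above: 2 bytes per client id, at least one byte per clock, and 8
bytes for the timestamp. The function names are hypothetical, and the
estimate deliberately ignores any length or framing bytes a real
serialization might add, so treat it as a lower bound rather than
Voldemort's actual on-disk size.

```python
# Assumed sizes taken from the figures quoted in this message.
CLIENT_ID_BYTES = 2
TIMESTAMP_BYTES = 8


def clock_byte_width(clock):
    """Smallest whole number of bytes that can hold the counter (at least one)."""
    return max(1, (clock.bit_length() + 7) // 8)


def estimated_vector_clock_bytes(entries):
    """Estimate the size of a vector clock given {client_id: clock} entries."""
    per_entry = sum(CLIENT_ID_BYTES + clock_byte_width(c) for c in entries.values())
    return per_entry + TIMESTAMP_BYTES


single_writer = estimated_vector_clock_bytes({1: 5})
three_writers = estimated_vector_clock_bytes({1: 5, 2: 300, 3: 70000})
```

Under these assumptions a clock with a single entry is already larger
than a bare 8-byte timestamp, and the size grows with every additional
client that has written the value.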

On Fri, Aug 19, 2011 at 7:41 PM, Kevin Burton <burton@spinn3r.com> wrote:
> I have a few questions which I can't seem to find answers to...
> I know that the memory overhead of timestamps is 8 bytes per row/column.
>   What is the memory overhead of vector clocks?
> Is it possible (at least in theory) to run without timestamps on your
> values?  I'm fine with last writer wins semantics and the memory overhead
> here seems very high.
> I understand that the timestamps might also be for repartitioning which adds
> complexity.
> Thanks!
>
> --
>
> Founder/CEO Spinn3r.com
>
> Location: San Francisco, CA
> Skype: burtonator
>
> Skype-in: (415) 871-0687
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
