cassandra-user mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject Memory overhead of vector clocks…. how often are they pruned?
Date Wed, 24 Aug 2011 02:58:06 GMT
I had a thread going the other day about vector clock memory usage. It covered
how a vector clock is a series of (client id, clock) entries plus a timestamp,
and how that timestamp makes it possible to prune old entries. What I'm
specifically curious about here is how often old entries are pruned.

If you're storing small columns in Cassandra, say just an integer, the
vector clock overhead could easily take up far more space than the data that
is actually in your database.

However, if they are pruned, then this shouldn't really be a problem.
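
To make the pruning idea concrete, here is a minimal sketch of timestamp-based
pruning. It assumes each entry carries its own timestamp and that anything
older than a fixed age can be dropped. The names (ClockEntry, prune,
max_age_ms) are hypothetical, and this is not the pruning policy of any
particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockEntry:
    client_id: int  # which client incremented this counter
    clock: int      # the counter value for that client
    ts_ms: int      # time of the last update in milliseconds (assumed per entry)


def prune(entries, now_ms, max_age_ms):
    """Keep only entries updated within the last max_age_ms milliseconds."""
    cutoff = now_ms - max_age_ms
    return [e for e in entries if e.ts_ms >= cutoff]


entries = [
    ClockEntry(client_id=1, clock=7, ts_ms=1_000),   # stale entry
    ClockEntry(client_id=2, clock=3, ts_ms=90_000),  # recent entry
]

# With a 60 second window ending at t=100000 ms, only client 2 survives.
kept = prune(entries, now_ms=100_000, max_age_ms=60_000)
print(kept)
```

How aggressive max_age_ms can be is exactly the open question: the shorter
the window, the less memory the clocks take up.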

How much memory is this wasting?

Thoughts?


Jonathan Ellis <jbellis@gmail.com> to user, Aug 19 (4 days ago):
 The problem with naive last write wins is that writes don't always
arrive at each replica in the same order.  So no, that's a
non-starter.
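
A small illustration of the ordering problem, assuming "naive" means that each
replica simply keeps whichever write reached it last. The function name and
the write values are made up for the example.

```python
def apply_in_arrival_order(writes):
    """Naive last write wins: each arriving write overwrites the stored value."""
    value = None
    for write in writes:
        value = write
    return value


# The same two writes reach the two replicas in opposite orders.
replica_a = apply_in_arrival_order(["x=1", "x=2"])
replica_b = apply_in_arrival_order(["x=2", "x=1"])

# The replicas now disagree, and nothing they stored says which value is newer.
print(replica_a, replica_b)
```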

Vector clocks are a series of (client id, clock) entries, and usually
a timestamp so you can prune old entries.  Obviously implementations
can vary, but to pick a specific example, Voldemort [1] uses 2 bytes
per client id, a variable number (at least one) of bytes for the
clock, and 8 bytes for the timestamp.

[1]
https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/versioning/VectorClock.java
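
As a rough back-of-the-envelope check, the byte counts quoted above can be
turned into a size estimate. This sketch assumes the 8-byte timestamp is stored
once per vector clock by default (with a flag for the per-entry case), ignores
any header or length bytes, and uses a hypothetical function name. It is not
Voldemort's actual serialization code.

```python
CLIENT_ID_BYTES = 2   # per entry, from the figures quoted above
TIMESTAMP_BYTES = 8   # from the figures quoted above


def clock_size_bytes(num_entries, clock_bytes=1, per_entry_timestamp=False):
    """Estimate the size of a vector clock with num_entries entries.

    clock_bytes is the width of each counter (at least one byte).
    per_entry_timestamp toggles the assumption about where the timestamp lives.
    """
    if clock_bytes < 1:
        raise ValueError("clock needs at least one byte")
    entry = CLIENT_ID_BYTES + clock_bytes
    if per_entry_timestamp:
        return num_entries * (entry + TIMESTAMP_BYTES)
    return num_entries * entry + TIMESTAMP_BYTES


value_bytes = 4  # a single 32-bit integer column value
shared = clock_size_bytes(3)
per_entry = clock_size_bytes(3, per_entry_timestamp=True)
print(shared, per_entry, shared / value_bytes)
```

Under either assumption, a clock with three entries is several times larger
than a 4-byte integer value, which is the overhead concern raised in this
thread.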


-- 

Founder/CEO Spinn3r.com

Location: *San Francisco, CA*
Skype: *burtonator*

Skype-in: *(415) 871-0687*
