cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-193) Proactive repair
Date Tue, 23 Jun 2009 16:06:08 GMT


Stu Hood commented on CASSANDRA-193:

> The more I think about this the less convinced I am that the partially-invalidated live
tree is going to be worth the overhead of maintaining it (and initializing it on startup).

There is no need to initialize the tree on startup: it can be done lazily when the first tree
exchange requests come in.

> If you instead just create a mini-merkle tree from the first N keys and exchange that
with the replica nodes, then repeat for the next N, you still get a big win on network traffic
(which is the main concern here)...

Yes, network traffic is important, but the whole point of maintaining the tree in memory is
that it prevents us from having to read entire SSTables from disk in order to do repairs (similar
to BloomFilters for random lookups). Any portions of the tree that survive (which should be
large portions, assuming we do invalidations correctly) mean that we can use the SSTable index
to seek() past chunks of the file.
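
As a rough illustration of the seek() argument above, here is a minimal Python sketch. It assumes a
simplified index of sorted (key, file offset) pairs and a list of half-open key ranges whose
hashes are still valid; the function name and both structures are hypothetical, not Cassandra's
actual SSTable index format.

```python
def spans_to_read(index, file_size, valid_ranges):
    """Return the (start, end) byte spans that still have to be read.

    index        -- sorted list of (key, offset) pairs (hypothetical layout)
    file_size    -- total size of the data file in bytes
    valid_ranges -- half-open (low_key, high_key) ranges with valid hashes
    """
    def is_valid(key):
        return any(low <= key < high for low, high in valid_ranges)

    spans = []
    for i, (key, offset) in enumerate(index):
        # A row ends where the next index entry starts, or at end of file.
        end = index[i + 1][1] if i + 1 < len(index) else file_size
        if is_valid(key):
            continue  # this region's hash survived, so seek past it
        if spans and spans[-1][1] == offset:
            spans[-1] = (spans[-1][0], end)  # contiguous: extend last span
        else:
            spans.append((offset, end))
    return spans


index = [("a", 0), ("b", 100), ("c", 250), ("d", 400), ("e", 520)]
print(spans_to_read(index, 600, [("b", "d")]))  # [(0, 100), (400, 600)]
```

The larger the surviving portion of the tree, the fewer and shorter the spans that have to be read.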

> but you have no startup overhead, no complicated extra maintenance to perform on insert,
better performance in the worst case and (probably) in the average case, since you are avoiding
random reads in favor of (a potentially greater number of) streaming reads...

 * No startup overhead is necessary,
 * B+Tree invalidations will only involve marking a leaf node invalid, i.e., doing a lookup and
incrementing a counter,
 * There won't be any random reads... I'm not sure where you read that: in order to validate
regions of the tree we will be iterating over the keys in the CF in sorted order, skipping
regions that are valid.
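
The invalidate-then-validate cycle described in the list above might look roughly like the
following Python sketch. A flat list of leaves located with bisect stands in for the B+Tree, and
SHA-256 stands in for whatever hash would be used; the class names, the boundary keys, and the
row encoding are all assumptions made for illustration.

```python
import bisect
import hashlib


class Leaf:
    def __init__(self):
        self.hash = None        # cached hash of this leaf's key range
        self.invalidations = 0  # writes seen since the hash was computed


class RangeTree:
    """Leaves only: a stand-in for the B+Tree (hypothetical structure)."""

    def __init__(self, boundaries):
        # Each boundary is the lowest key of the leaf to its right.
        self.boundaries = sorted(boundaries)
        self.leaves = [Leaf() for _ in range(len(self.boundaries) + 1)]

    def leaf_index(self, key):
        return bisect.bisect_right(self.boundaries, key)

    def invalidate(self, key):
        # On insert: one lookup plus one counter increment.
        self.leaves[self.leaf_index(key)].invalidations += 1

    def validate(self, sorted_rows):
        """Rehash stale leaves from rows given in sorted key order.

        Returns the number of rows that had to be hashed.
        """
        hashers = {}
        rows_hashed = 0
        for key, value in sorted_rows:
            i = self.leaf_index(key)
            leaf = self.leaves[i]
            if leaf.hash is not None and leaf.invalidations == 0:
                continue  # valid region: a real implementation would seek past it
            rows_hashed += 1
            hashers.setdefault(i, hashlib.sha256()).update(key + b"\x00" + value)
        for i, leaf in enumerate(self.leaves):
            if leaf.hash is None or leaf.invalidations > 0:
                leaf.hash = hashers.get(i, hashlib.sha256()).digest()
                leaf.invalidations = 0
        return rows_hashed


rows = [(b"a", b"1"), (b"h", b"2"), (b"p", b"3"), (b"q", b"4")]
tree = RangeTree([b"g", b"n"])
print(tree.validate(rows))  # 4: nothing is cached yet, so every row is hashed
tree.invalidate(b"h")
print(tree.validate(rows))  # 1: only the leaf covering b"h" is rehashed
```

The first validate() call is where the lazy initialization happens; later calls only touch leaves
whose counters are non-zero.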

> Proactive repair
> ----------------
>                 Key: CASSANDRA-193
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
> Currently cassandra supports "read repair," i.e., lazy repair when a read is done.  This
is better than nothing but is not sufficient for some cases (e.g. catastrophic node failure
where you need to rebuild all of a node's data on a new machine).
> Dynamo uses merkle trees here.  This is harder for Cassandra given the CF data model
but I suppose we could just hash the serialized CF value.
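
For readers unfamiliar with the idea in the issue description, here is a minimal Python sketch of
hashing serialized values into a merkle tree and comparing two replicas' trees. The choice of
SHA-256, the length-prefixed encoding, and all function names are assumptions for illustration;
the issue does not specify any of them.

```python
import hashlib


def leaf_hash(rows):
    """Hash the (key, serialized value) byte pairs covering one key range."""
    h = hashlib.sha256()
    for key, value in rows:
        # Length-prefix each field so adjacent fields cannot be confused.
        h.update(len(key).to_bytes(4, "big") + key)
        h.update(len(value).to_bytes(4, "big") + value)
    return h.digest()


def build_tree(leaves):
    """Return the tree as a list of levels, leaf hashes first, root last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = []
        for i in range(0, len(prev), 2):
            right = prev[i + 1] if i + 1 < len(prev) else b""
            nxt.append(hashlib.sha256(prev[i] + right).digest())
        levels.append(nxt)
    return levels


def differing_leaves(a, b):
    """Compare two same-shaped trees top down; return mismatched leaf indexes."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []  # roots match, so the replicas agree on every range
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        nxt = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    nxt.append(child)
        suspects = nxt
    return suspects


local = [leaf_hash([(b"k%d" % i, b"v%d" % i)]) for i in range(4)]
remote = list(local)
remote[2] = leaf_hash([(b"k2", b"stale")])
print(differing_leaves(build_tree(local), build_tree(remote)))  # [2]
```

Only the key ranges behind the mismatched leaves would then need to be exchanged and repaired.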

