cassandra-commits mailing list archives

From "Peter Schuller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation
Date Wed, 14 Sep 2011 07:38:09 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104324#comment-13104324 ]

Peter Schuller commented on CASSANDRA-3200:
-------------------------------------------

This is definitely an interesting idea. But FWIW, I think it is more important to make repair
more incremental, less bulky and more continuous than to make it efficient in terms of the absolute
amount of data transferred. I wonder to what extent an implementation of this ticket might
be obsoleted by a solution to CASSANDRA-2699 (not that the desire to avoid unnecessary transfers
goes away, but the implementation details might change).

> Repair: compare all trees together (for a given range/cf) instead of by pair in isolation
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3200
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: repair
>             Fix For: 1.0.1
>
>
> Currently, repair compares Merkle trees by pair, in isolation from any other tree. What
> that means concretely is that if I have three nodes A, B and C (RF=3) with A and B in sync,
> but C having some range r inconsistent with both A and B (since those are consistent), we will
> do the following transfers of r: A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know which
> of A or C is more up to date. However, the transfer B -> C is useless provided we do
> A -> C, if A and B are in sync. Not doing that transfer will be a 25% improvement in that
> case (3 transfers instead of 4). With RF=5 and only one node inconsistent with all the others,
> that is almost a 40% improvement (5 transfers instead of 8), etc...
> Given that this situation of one node being out of sync while the others are in sync is probably
> fairly common (one node died, so it is behind), this could be a fair improvement over what is
> transferred. In the case where we use repair to completely rebuild a node, this will be a dramatic
> improvement, because it will keep the rebuilt node from receiving RF times the data it should get.
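
The 25% and "almost 40%" figures quoted above can be checked with a small counting sketch. This is not Cassandra code; the function names are hypothetical, and it assumes exactly one replica is out of sync for range r while the other RF-1 replicas agree with each other.

```python
def pairwise_transfers(rf):
    """Transfers of range r when each differing pair is repaired in isolation.

    Assumption: one stale replica differs from each of the rf - 1 in-sync
    replicas, and every differing pair streams r in both directions.
    """
    return 2 * (rf - 1)


def grouped_transfers(rf):
    """Transfers of range r when all trees are compared together.

    The stale replica still streams to each of the rf - 1 others (we cannot
    know which side is newer), but it receives r from only one of them.
    """
    return (rf - 1) + 1


def saving(rf):
    """Fraction of transfers avoided by comparing all trees together."""
    return 1 - grouped_transfers(rf) / pairwise_transfers(rf)


if __name__ == "__main__":
    for rf in (3, 5):
        print(rf, pairwise_transfers(rf), grouped_transfers(rf), f"{saving(rf):.1%}")
```

Under these assumptions the saving is 1/4 for RF=3 and 3/8 for RF=5, which matches the numbers in the ticket description.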

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
