cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2610) Have the repair of a range repair *all* the replica for that range
Date Wed, 17 Oct 2012 16:48:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478019#comment-13478019
] 

Sylvain Lebresne commented on CASSANDRA-2610:
---------------------------------------------

I think that you may want to look at -snapshot (CASSANDRA-3721). It sequentializes both the
merkle tree computation and the streaming. In other words, if you have 3 replicas A, B and C
for the range, A will compute its merkle tree, then B, then C, and the same goes for the
streaming phase. This exists exactly to avoid the "all replicas for the range are slow because
they are all doing repair stuff" problem.
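
To make the ordering concrete, here is a minimal Python sketch of a sequential repair plan.
It is an illustration only, not Cassandra code: the function name sequential_repair_plan and
the step labels are hypothetical, and handling the streaming phase one replica pair at a time
is an assumption (the comment only says that streaming is sequential as well).

```python
from itertools import combinations


def sequential_repair_plan(replicas):
    """Return the ordered steps of a repair in which only one step runs at a time."""
    # Validation phase: each replica builds its Merkle tree in turn.
    plan = [("validate", (replica,)) for replica in replicas]
    # Streaming phase: replica pairs are handled one after another (assumed ordering).
    plan += [("stream", pair) for pair in combinations(replicas, 2)]
    return plan


for step, nodes in sequential_repair_plan(["A", "B", "C"]):
    print(step, *nodes)
```

Because no two steps overlap, at most one replica is busy with validation at any moment,
which is the property the comment describes.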
                
> Have the repair of a range repair *all* the replica for that range
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-2610
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2610
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8 beta 1
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 1.0.0
>
>         Attachments: 0001-Make-repair-repair-all-hosts.patch, 0001-Make-repair-repair-all-hosts-v2.patch,
0002-Cleanup-log-messages-v2.patch, 0003-cleanup-and-fix-private-reference.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Say you have a range R whose replicas are A, B and C. If you run repair
on node A for that range R, when the repair ends you only know that A is fully repaired. B
and C are not. That is, B and C are up to date with A as it was before the repair, but are
not up to date with one another.
> It makes it a pain to schedule "optimal" cluster repairs, that is, repairing a full cluster
without doing work twice (because you would still have to run a repair on B or C, which
will make A, B and C redo a validation compaction on R, and with more replicas it's even more
annoying).
> However, it is fairly easy during the first repair on A to have it compare all the merkle
trees, i.e. including the one for B against the one for C, and ask B and C to stream to each
other whatever differences they have.
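
The pairwise comparison proposed in the description can be sketched in Python as follows.
This is a simplified illustration, not the attached patch: each merkle tree is modeled as a
flat dict from a sub-range name to a hash, and the name mismatched_ranges and the sample
hashes are hypothetical.

```python
from itertools import combinations


def mismatched_ranges(trees):
    """Map each replica pair to the sorted sub-ranges whose hashes differ."""
    diffs = {}
    # Compare every pair of replicas, not only the pairs that include the coordinator.
    for left, right in combinations(sorted(trees), 2):
        ranges = trees[left].keys() | trees[right].keys()
        differing = sorted(
            r for r in ranges if trees[left].get(r) != trees[right].get(r)
        )
        if differing:
            diffs[(left, right)] = differing
    return diffs


trees = {
    "A": {"r1": "h1", "r2": "h2", "r3": "h3"},
    "B": {"r1": "h1", "r2": "x2", "r3": "h3"},
    "C": {"r1": "h1", "r2": "h2", "r3": "y3"},
}
for pair, ranges in mismatched_ranges(trees).items():
    print(pair, ranges)
```

The ("B", "C") entry is the one a repair coordinated by A would miss if A only compared its
own tree against B's and C's.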

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
