cassandra-commits mailing list archives

From "Charlie Groves (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5351) Avoid repairing already-repaired data by default
Date Fri, 24 May 2013 04:33:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665995#comment-13665995 ]

Charlie Groves commented on CASSANDRA-5351:
-------------------------------------------

bq. So the two options that I see are (1) making ranges repaired, rather than sstables, or
(2) "anti-compacting" repaired parts into new sstables.

I don't believe either of these fixes the problem of getting all repairs to nodes other than
the coordinator. If node A is coordinating a repair for ranges held by nodes A, B, and C,
B and C don't attempt to repair each other's ranges. If those ranges were then marked as
repaired on all nodes, B and C would never repair each other.

Maybe the way to fix that is to make the first node more of an initiator than a coordinator. The
first node initiates the repair of a given range, and every node getting that request performs
essentially the same repair the coordinator is doing now. That way all the ranges go between
all the involved nodes, and they can safely mark the ranges repaired when all involved nodes
finish. They'd only need to build the merkle tree once per initiated request, so it shouldn't
be any extra work.
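
A toy sketch of the difference described above, assuming a simplified model in which a repair
session is just the set of replica pairs that exchange a range. The function names
(coordinator_pairs, initiator_pairs, unsynced_pairs) are hypothetical and do not correspond to
anything in Cassandra's code base; this only illustrates the argument, not the real repair code.

```python
from itertools import combinations


def coordinator_pairs(coordinator, replicas):
    """Pairs synced under the model in this comment: the coordinator
    exchanges the range with each other replica, and nobody else does."""
    return {frozenset((coordinator, r)) for r in replicas if r != coordinator}


def initiator_pairs(replicas):
    """Pairs synced if every replica performs the same repair:
    each pair of replicas exchanges the range."""
    return {frozenset(pair) for pair in combinations(replicas, 2)}


def unsynced_pairs(replicas, synced):
    """Replica pairs that never compared data in this session."""
    return initiator_pairs(replicas) - synced


replicas = ["A", "B", "C"]
coord = coordinator_pairs("A", replicas)
init = initiator_pairs(replicas)

# Under the coordinator model, B and C never compare with each other,
# so marking the range repaired everywhere would hide their differences.
print(sorted(sorted(p) for p in unsynced_pairs(replicas, coord)))
# Under the initiator model, no pair is left out.
print(sorted(sorted(p) for p in unsynced_pairs(replicas, init)))
```

In this model, marking a range repaired is only safe once unsynced_pairs is empty, which is the
condition the initiator approach is meant to guarantee.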
                
> Avoid repairing already-repaired data by default
> ------------------------------------------------
>
>                 Key: CASSANDRA-5351
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>             Fix For: 2.1
>
>
> Repair has always built its merkle tree from all the data in a columnfamily, which is
> guaranteed to work but is inefficient.
> We can improve this by remembering which sstables have already been successfully repaired,
> and only repairing sstables new since the last repair.  (This automatically makes CASSANDRA-3362
> much less of a problem too.)
> The tricky part is, compaction will (if not taught otherwise) mix repaired data together
> with non-repaired.  So we should segregate unrepaired sstables from the repaired ones.
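
The idea in the issue description can be sketched roughly as follows, assuming each sstable
carries a simple boolean repaired flag. The SSTable class and both helper functions are
hypothetical names used only for illustration; they are not Cassandra APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    name: str
    repaired: bool  # assumed flag: True once a repair covering it succeeded


def sstables_to_repair(sstables):
    """Only sstables that are new since the last repair need a merkle tree."""
    return [s for s in sstables if not s.repaired]


def compaction_buckets(sstables):
    """Keep repaired and unrepaired sstables in separate compaction pools,
    so a compaction never mixes the two kinds of data."""
    buckets = {True: [], False: []}
    for s in sstables:
        buckets[s.repaired].append(s)
    return buckets


tables = [
    SSTable("sst-1", repaired=True),
    SSTable("sst-2", repaired=False),
    SSTable("sst-3", repaired=False),
]
print([s.name for s in sstables_to_repair(tables)])
print({k: [s.name for s in v] for k, v in compaction_buckets(tables).items()})
```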
