cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5351) Avoid repairing already-repaired data by default
Date Tue, 21 May 2013 02:18:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662593#comment-13662593 ]

Jonathan Ellis edited comment on CASSANDRA-5351 at 5/21/13 2:17 AM:
--------------------------------------------------------------------

*All* nodes (including the coordinator) will only have portions repaired in the general case,
since (a) the user can request a repair of an arbitrary range, and (b) even without that,
repairing an entire vnode's range will still leave data from other vnodes unrepaired in the
same sstables.

So the two options that I see are (1) making ranges repaired, rather than sstables, or (2)
"anti-compacting" repaired parts into new sstables.
                
      was (Author: jbellis):
    All nodes will only have portions repaired in the general case, since (a) the user can
request a repair of an arbitrary range, and (b) even without that, repairing an entire vnode's
range will still leave data from other vnodes unrepaired in the same sstables.

So the two options that I see are (1) making ranges repaired, rather than sstables, or (2)
"anti-compacting" repaired parts into new sstables.
                  
> Avoid repairing already-repaired data by default
> ------------------------------------------------
>
>                 Key: CASSANDRA-5351
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
>             Fix For: 2.0
>
>
> Repair has always built its merkle tree from all the data in a columnfamily, which is
> guaranteed to work but is inefficient.
> We can improve this by remembering which sstables have already been successfully repaired,
> and only repairing sstables new since the last repair.  (This automatically makes CASSANDRA-3362
> much less of a problem too.)
> The tricky part is, compaction will (if not taught otherwise) mix repaired data together
> with non-repaired.  So we should segregate unrepaired sstables from the repaired ones.
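
For illustration only, here is a minimal Python sketch of option (2) from the comment above, "anti-compacting" the repaired part of an sstable into a new sstable so that repaired and unrepaired data stay segregated. It is not Cassandra code. The SSTable class, anti_compact, and tables_to_repair are hypothetical names, an sstable is modelled as a plain token-to-value dict, and token ranges are simplified to half-open (start, end] intervals with no ring wraparound.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

TokenRange = Tuple[int, int]  # (start, end], assumed not to wrap around the ring


@dataclass
class SSTable:
    """Hypothetical stand-in for an sstable: rows keyed by token, plus a repaired flag."""
    rows: Dict[int, str]
    repaired: bool = False


def in_range(token: int, rng: TokenRange) -> bool:
    """True if token falls inside the half-open range (start, end]."""
    start, end = rng
    return start < token <= end


def anti_compact(table: SSTable, repaired_range: TokenRange) -> List[SSTable]:
    """Split one sstable into a repaired part and a remainder.

    Rows inside repaired_range go to a new sstable marked repaired;
    all other rows keep the original table's repaired flag.
    """
    inside = {t: v for t, v in table.rows.items() if in_range(t, repaired_range)}
    outside = {t: v for t, v in table.rows.items() if not in_range(t, repaired_range)}
    result = []
    if inside:
        result.append(SSTable(inside, repaired=True))
    if outside:
        result.append(SSTable(outside, repaired=table.repaired))
    return result


def tables_to_repair(tables: List[SSTable]) -> List[SSTable]:
    """Only unrepaired sstables need to feed the next repair's merkle tree."""
    return [t for t in tables if not t.repaired]


if __name__ == "__main__":
    table = SSTable({10: "a", 50: "b", 90: "c"})
    parts = anti_compact(table, (0, 60))   # a repair covered tokens in (0, 60]
    for part in parts:
        print(part)
    print(tables_to_repair(parts))
```

In this sketch a partial-range repair no longer leaves a mixed sstable behind: the repaired rows move to their own sstable and can be skipped next time, at the cost of rewriting the data once. Option (1), tracking repaired ranges rather than sstables, would avoid the rewrite but needs per-range bookkeeping instead of a per-sstable flag.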

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
