cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5351) Avoid repairing already-repaired data by default
Date Fri, 20 Sep 2013 16:47:55 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773150#comment-13773150 ]

Jonathan Ellis edited comment on CASSANDRA-5351 at 9/20/13 4:46 PM:
--------------------------------------------------------------------

bq. The more often you repair the less big a full "separate set of levels for unrepaired data"
would be. So maybe that's the way to go.

Which is to say, we'd be kicking repairs off as automatically as we currently kick off compaction.

I still don't have any better ideas.  [~krummas]?
                
      was (Author: jbellis):
    bq. I think it would be simpler to anticompact after repair

This is straightforward for STCS (bucket repaired/non-repaired separately) but less so for
LCS.

Now that we're already doing STCS in L0, I suggest extending that here: reserve the levels
for repaired data, and STCS until we can repair.

This implies making repair as automatic as compaction, which is a big change for us.  I think
it's a lot more user friendly, but I'm not 100% confident the performance impact will be negligible.
Any better ideas?
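
To make the "bucket repaired/non-repaired separately" idea for STCS concrete, here is a minimal sketch in Python. It is an illustration only, not Cassandra's code: the `SSTable` record, the `repaired` flag, and the 0.5/1.5 size bounds are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SSTable:
    # Hypothetical stand-in for an sstable; not Cassandra's actual class.
    name: str
    size_bytes: int
    repaired: bool


def size_tiered_buckets(tables: List[SSTable],
                        low: float = 0.5,
                        high: float = 1.5) -> List[List[SSTable]]:
    """Group tables of similar size.

    A table joins the first bucket whose current average size `avg`
    satisfies low * avg <= size <= high * avg; otherwise it starts a
    new bucket. The bounds are illustrative assumptions.
    """
    buckets: List[List[SSTable]] = []
    for table in sorted(tables, key=lambda t: t.size_bytes):
        for bucket in buckets:
            avg = sum(t.size_bytes for t in bucket) / len(bucket)
            if low * avg <= table.size_bytes <= high * avg:
                bucket.append(table)
                break
        else:
            # No existing bucket was a fit for this table.
            buckets.append([table])
    return buckets


def segregated_buckets(tables: List[SSTable]) -> Dict[str, List[List[SSTable]]]:
    """Bucket repaired and unrepaired tables independently,
    so that no compaction bucket ever mixes the two kinds."""
    repaired = [t for t in tables if t.repaired]
    unrepaired = [t for t in tables if not t.repaired]
    return {
        "repaired": size_tiered_buckets(repaired),
        "unrepaired": size_tiered_buckets(unrepaired),
    }
```

Because the split happens before bucketing, two sstables of nearly identical size still land in different buckets when only one of them has been repaired. The harder LCS case discussed above is not covered by this sketch.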
                  
> Avoid repairing already-repaired data by default
> ------------------------------------------------
>
>                 Key: CASSANDRA-5351
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5351
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Lyuben Todorov
>              Labels: repair
>             Fix For: 2.1
>
>
> Repair has always built its merkle tree from all the data in a columnfamily, which is
> guaranteed to work but is inefficient.
> We can improve this by remembering which sstables have already been successfully repaired,
> and only repairing sstables new since the last repair.  (This automatically makes CASSANDRA-3362
> much less of a problem too.)
> The tricky part is, compaction will (if not taught otherwise) mix repaired data together
> with non-repaired.  So we should segregate unrepaired sstables from the repaired ones.
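
The bookkeeping described in the issue (remember which sstables were repaired, then repair only the new ones) can be sketched as follows. This is a hedged, in-memory illustration: `RepairTracker` and its methods are hypothetical names, and the ticket does not say how Cassandra would actually persist this state.

```python
from typing import Iterable, List, Set


class RepairTracker:
    """Hypothetical tracker for sstables that have been successfully repaired."""

    def __init__(self) -> None:
        self._repaired: Set[str] = set()

    def unrepaired(self, live_sstables: Iterable[str]) -> List[str]:
        """Return the live sstables that still need repair, in input order."""
        return [name for name in live_sstables if name not in self._repaired]

    def mark_repaired(self, sstables: Iterable[str]) -> None:
        """Record sstables as repaired; call only after a repair succeeds."""
        self._repaired.update(sstables)
```

On a first repair every live sstable is a candidate. After `mark_repaired` is called with that set, a later repair only has to consider sstables flushed since then, which is the saving the issue is after.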

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
