cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9143) Improving consistency of repairAt field across replicas
Date Fri, 26 Aug 2016 22:30:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440059#comment-15440059 ]

Paulo Motta commented on CASSANDRA-9143:
----------------------------------------

bq. Both are really manifestations of the same root problem: incremental repair behaves unpredictably
because data being repaired isn't kept separate from unrepaired data during repair. Maybe
we should expand the problem description, and close CASSANDRA-8858 as a dupe?

Thanks for clarifying. We should definitely update the title and description, since a more
general problem is being tackled here than the one originally stated on the ticket. I agree
we should close CASSANDRA-8858, since it will be superseded by this one.

bq. We’d have to be optimistic and anti-compact all the tables and ranges we’re going
to be repairing prior to validation. Obviously, failed ranges would have to be re-anticompacted
back into unrepaired. The cost of this would have to be compared to the higher network io
caused by the current state of things, and the frequency of failed ranges.

I think that's a good idea, and it could also help mitigate the repair impact on vnodes caused
by the multiple flushes needed to run validations for every vnode range (CASSANDRA-9491,
CASSANDRA-10862), since we would only validate the sstables anti-compacted at the beginning
of the parent repair session. On the other hand, we should think carefully about how sstables
in the pending repair bucket will be handled, since withholding these sstables from compaction
for a long time could lead to poor read performance and extra compaction I/O after repair.
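
To make the pending repair bucket idea concrete, here is a rough sketch of holding
anti-compacted sstables per parent repair session so that validation only touches those
sstables. All names here (PendingRepairBucket, SSTable, etc.) are made up for illustration
and are not Cassandra's actual internals:

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not real Cassandra code.
public class PendingRepairBucket {
    interface SSTable {} // placeholder for the real sstable reader type

    // sstables pinned per parent repair session, excluded from normal compaction
    private final Map<UUID, Set<SSTable>> pending = new ConcurrentHashMap<>();

    public void add(UUID parentSessionId, Set<SSTable> sstables) {
        pending.computeIfAbsent(parentSessionId, id -> ConcurrentHashMap.<SSTable>newKeySet())
               .addAll(sstables);
    }

    // Validation reads only the sstables pinned to this session, instead of
    // flushing and scanning the whole unrepaired set for every vnode range.
    public Set<SSTable> sstablesForValidation(UUID parentSessionId) {
        return pending.getOrDefault(parentSessionId, Collections.emptySet());
    }

    // On success these sstables move to the repaired set; on failure (or
    // timeout) they are released back to unrepaired.
    public Set<SSTable> release(UUID parentSessionId) {
        Set<SSTable> released = pending.remove(parentSessionId);
        return released != null ? released : Collections.emptySet();
    }
}
{code}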

For frequently run incremental repairs this shouldn't be a problem, since repairs should be
fast, but if many unrepaired sstables pile up (or in the case of full repairs), then this
could become an issue. One approach would be to skip the upfront anti-compaction if the
unrepaired set is above some size threshold (or for full repairs) and fall back to
anti-compaction at the end, as done now. There should probably also be some safety mechanism
(a timeout, etc.) that releases sstables from the pending repair bucket if they sit there for
a long time, as Marcus suggested on CASSANDRA-5351.
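
As a rough sketch of that fallback heuristic (the class name, threshold, and timeout are
invented for illustration, not an actual Cassandra API):

{code:java}
// Hypothetical sketch: skip upfront anti-compaction for large unrepaired
// sets or full repairs, and time out sstables stuck in the pending bucket.
public class AnticompactionPolicy {
    private final long maxUpfrontBytes;      // size threshold for the unrepaired set
    private final long pendingTimeoutMillis; // safety valve for the pending bucket

    public AnticompactionPolicy(long maxUpfrontBytes, long pendingTimeoutMillis) {
        this.maxUpfrontBytes = maxUpfrontBytes;
        this.pendingTimeoutMillis = pendingTimeoutMillis;
    }

    // Anti-compact before validation only for incremental repairs over a
    // reasonably small unrepaired set; otherwise fall back to anti-compacting
    // at the end of the repair, as done today.
    public boolean anticompactUpfront(boolean fullRepair, long unrepairedBytes) {
        return !fullRepair && unrepairedBytes <= maxUpfrontBytes;
    }

    // Release sstables pinned to a session for too long back to unrepaired,
    // so a hung session can't hold compaction indefinitely.
    public boolean shouldRelease(long pinnedAtMillis, long nowMillis) {
        return nowMillis - pinnedAtMillis > pendingTimeoutMillis;
    }
}
{code}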

> Improving consistency of repairAt field across replicas 
> --------------------------------------------------------
>
>                 Key: CASSANDRA-9143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9143
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Blake Eggleston
>            Priority: Minor
>
> We currently send an anticompaction request to all replicas. During this, a node will split
> sstables and mark the appropriate ones repaired.
> The problem is that this could fail on some replicas for many reasons, leading to problems
> in the next repair.
> This is what I am suggesting to improve it:
> 1) Send an anticompaction request to all replicas. This can be done at the session level.
> 2) During anticompaction, sstables are split but not marked repaired.
> 3) When we get a positive ack from all replicas, the coordinator will send another message
> called markRepaired.
> 4) On getting this message, replicas will mark the appropriate sstables as repaired.
> This will reduce the window of failure. We can also think of "hinting" the markRepaired
> message if required.
> Also, the sstables being streamed can be marked as repaired, as is done now.
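
For illustration, a minimal sketch of the two-phase exchange described in steps 1-4 above
might look like the following; the Replica interface and method names are hypothetical, not
Cassandra's actual messaging API:

{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of the coordinator side of the protocol.
public class RepairCoordinator {
    interface Replica {
        CompletableFuture<Void> requestAnticompaction(long repairedAt); // split only, don't mark
        CompletableFuture<Void> markRepaired(long repairedAt);          // flip the repaired flag
    }

    public CompletableFuture<Void> finishSession(List<Replica> replicas, long repairedAt) {
        // Phase 1: ask every replica to split sstables without marking them repaired.
        CompletableFuture<?>[] splitAcks = replicas.stream()
                .map(r -> r.requestAnticompaction(repairedAt))
                .toArray(CompletableFuture[]::new);

        // Phase 2: only after positive acks from *all* replicas, send markRepaired.
        return CompletableFuture.allOf(splitAcks)
                .thenCompose(ignored -> CompletableFuture.allOf(
                        replicas.stream()
                                .map(r -> r.markRepaired(repairedAt))
                                .toArray(CompletableFuture[]::new)));
    }
}
{code}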



