cassandra-commits mailing list archives

From "Nick Bailey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11461) Failed incremental repairs never cleared from pending list
Date Wed, 30 Mar 2016 23:13:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219041#comment-15219041 ]

Nick Bailey commented on CASSANDRA-11461:
-----------------------------------------

Yeah. OpsCenter lets you configure some tables for incremental repair and others for normal
subrange repair, which is what was happening in this case. So OpsCenter does the following:

* Break up the ring into small chunks for subrange repair
* Visit a node and repair a small range for all tables that are using subrange repair
* If any tables are configured for incremental repair, run an incremental repair on those
tables
** By default this would do a full incremental repair on those tables, which is what was in
use when this bug was hit
* Jump across the ring to a different node and repeat the above process.
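
As a rough illustration of the steps listed above, here is a minimal Python sketch of that
visiting order. It is not OpsCenter's actual code: split_ring, jump_order, plan_cycle, the
table names, and the assumption that node i owns the i-th equal slice of the ring are all
hypothetical. Only the Murmur3Partitioner token bounds are taken from Cassandra itself.

```python
from typing import List, Optional, Tuple

# Token bounds of Cassandra's Murmur3Partitioner.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1

TokenRange = Tuple[int, int]
Step = Tuple[str, str, List[str], Optional[TokenRange]]


def split_ring(chunks: int) -> List[TokenRange]:
    """Split the full token ring into `chunks` contiguous (start, end) ranges."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // chunks for i in range(chunks)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))


def jump_order(n: int) -> List[int]:
    """Order node indexes so consecutive visits land far apart on the ring."""
    half = (n + 1) // 2
    order: List[int] = []
    for i in range(half):
        order.append(i)
        if i + half < n:
            order.append(i + half)
    return order


def plan_cycle(nodes: List[str],
               subrange_tables: List[str],
               incremental_tables: List[str],
               chunks_per_node: int = 2) -> List[Step]:
    """Return (node, kind, tables, token_range) steps for one repair cycle."""
    n = len(nodes)
    ranges = split_ring(n * chunks_per_node)
    # Hypothetical ownership: node i owns the i-th contiguous slice of the ring.
    per_node = [ranges[i * chunks_per_node:(i + 1) * chunks_per_node]
                for i in range(n)]
    steps: List[Step] = []
    for chunk in range(chunks_per_node):
        for idx in jump_order(n):
            node = nodes[idx]
            if subrange_tables:
                # Repair one small range for every subrange-repair table.
                steps.append((node, "subrange", list(subrange_tables),
                              per_node[idx][chunk]))
            if incremental_tables:
                # No token range: a full incremental repair of those tables.
                steps.append((node, "incremental", list(incremental_tables), None))
    return steps
```

Calling plan_cycle(["n1", "n2", "n3", "n4"], ["ks.a"], ["ks.b"]) would visit n1, then n3,
then n2, then n4. Note that every visit also schedules an incremental repair, so a cycle
issues many incremental repairs against the same tables in quick succession.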

It does all this in a single datacenter, since OpsCenter does cross-DC repair.

That's the very high-level overview, at least.

> Failed incremental repairs never cleared from pending list
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-11461
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11461
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Adam Hattrell
>
> Set up a test cluster with 2 DCs, heavy use of LCS (not sure if that's relevant).
> Kick off cassandra-stress against it.
> Kick off an automated incremental repair cycle.
> After a bit a node starts flapping, which causes a few repairs to fail.  These are never
> cleared out of pending repairs - given the keyspace is replicated to all nodes it means they
> all have pending repairs that will never complete.  Repairs are basically blocked at this
> point.
> Given we're using Incremental repairs you're now spammed with:
> "Cannot start multiple repair sessions over the same sstables"
> Cluster and logs are still available for review - message me for details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
