cassandra-commits mailing list archives

From "Anuj Wadehra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
Date Wed, 20 Jan 2016 02:57:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107876#comment-15107876 ]

Anuj Wadehra commented on CASSANDRA-10446:
------------------------------------------

I think this is an issue with the way we handle the "downed replica" scenario in repairs. We
should increase the priority and change the type from Improvement to Bug.

Consider the following scenario and flow of events, which demonstrates the importance of this issue:
Scenario: I have a 20 node cluster, RF=5, reads and writes at QUORUM, gc grace period = 20 days. My cluster
is fault tolerant and can afford 2 node failures.

Suddenly, one node goes down due to a hardware issue. The failed node prevents repair
on many nodes in the cluster because it holds approximately 5/20 of the total data: 1/20 that
it owns as primary replica and 4/20 that it stores as a replica of data owned by other nodes
(see the sketch after this paragraph). Now it is 10 days since the node went down, most of the
nodes have not been repaired, and it is decision time. I am not sure how soon the issue will be
fixed, maybe in the next 2 days, i.e. 8 days before gc grace expires, so I should not remove the
node early and add it back, as that would cause significant and unnecessary streaming due to
token re-arrangement. At the same time, if I do not remove the failed node now, i.e. at 10 days
(well before gc grace), the health of my entire system is in question and it becomes a panic
situation, because most of the data has not been repaired in the last 10 days and gc grace is
approaching. I need sufficient time to repair all nodes.
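
As a back-of-the-envelope check of the 5/20 figure above, here is a minimal Python sketch; it
assumes uniform token ownership and SimpleStrategy-style placement and ignores vnodes (these
assumptions are mine, not from the ticket):

    # Rough share of the cluster's data stored on a single node.
    nodes = 20   # cluster size from the scenario
    rf = 5       # replication factor from the scenario

    owned = 1 / nodes               # ranges the node owns as primary replica -> 1/20
    replicated = (rf - 1) / nodes   # ranges it stores for other nodes        -> 4/20
    total = owned + replicated      # data whose repair the down node blocks  -> 5/20

    print(f"primary share: {owned:.0%}")       # 5%
    print(f"replica share: {replicated:.0%}")  # 20%
    print(f"total on node: {total:.0%}")       # 25%, i.e. the ~5/20 above
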
What looked like a fault tolerant Cassandra cluster that can easily afford 2 node failures
required urgent attention and manual decision making when a single node went down. If some
replicas are down, we should allow repair to proceed with the remaining replicas. If the failed
node comes up before gc grace expires, we run repair to fix inconsistencies; otherwise we
discard its data and bootstrap it again (a sketch of this decision follows below). I think that
would be a really robust, fault tolerant system.
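
A minimal sketch of that recovery decision (plain Python, not Cassandra code; the helper name
and the downtime values are illustrative assumptions):

    # Illustrative decision rule for a replica that has been down.
    # gc grace is taken from the scenario above (20 days).
    GC_GRACE_SECONDS = 20 * 24 * 3600

    def reintegrate_replica(downtime_seconds: int) -> str:
        """Return the proposed action for a node that was down for downtime_seconds."""
        if downtime_seconds < GC_GRACE_SECONDS:
            # Tombstones are still retained cluster-wide, so repair can safely
            # reconcile the stale replica without resurrecting deleted data.
            return "bring the node up and run repair"
        # Past gc grace the stale replica could resurrect deletes, so rebuild it.
        return "discard the node's data and bootstrap it again"

    print(reintegrate_replica(10 * 24 * 3600))  # down 10 days -> repair
    print(reintegrate_replica(25 * 24 * 3600))  # down 25 days -> rebootstrap
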



> Run repair with down replicas
> -----------------------------
>
>                 Key: CASSANDRA-10446
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10446
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Priority: Minor
>             Fix For: 3.x
>
>
> We should have an option of running repair when replicas are down. We can call it -force.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
