cassandra-commits mailing list archives

From "Kurt Greaves (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12510) Disallow decommission when number of replicas will drop below configured RF
Date Tue, 13 Dec 2016 16:02:59 GMT


Kurt Greaves commented on CASSANDRA-12510:

This is true... and why coding at 4am is not a good idea. I've now patched the other tests
as well (except for the materialised views test, which appears unrelated). I didn't get a pass
myself on resumable_decommission_test, but it got past the decommission stage and only failed
at byteman (probably due to some misconfiguration on my end), so I think it should work.

> Disallow decommission when number of replicas will drop below configured RF
> ---------------------------------------------------------------------------
>                 Key: CASSANDRA-12510
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>         Environment: C* version 3.3
>            Reporter: Atin Sood
>            Assignee: Kurt Greaves
>            Priority: Minor
>              Labels: lhf
>         Attachments: 12510-3.x.patch
> Steps to replicate:
> - Create a 3-node cluster in DC1 and create a keyspace test_keyspace with table test_table
> using replication strategy NetworkTopologyStrategy, DC1=3. Populate some data into this table.
> - Add 5 more nodes to this cluster, but in DC2. Do not alter the keyspace to add the new
> DC2 to replication (this is intentional and the reason the bug shows up), so DESCRIBE KEYSPACE
> should still list NetworkTopologyStrategy with DC1=3 as the RF.
> - As expected, this is now an 8-node cluster with 3 nodes in DC1 and 5 in DC2.
> - Now start decommissioning the nodes in DC1. Note that the decommission runs fine on
> all 3 nodes, but since the new nodes are in DC2 and the keyspace's RF is restricted to DC1,
> the 5 new nodes won't receive any data.
> - You end up with a 5-node cluster that holds none of the data from the 3 decommissioned
> nodes, i.e. data loss.
> I do understand that this problem could have been avoided by performing an ALTER statement
> to add DC2 replication before adding the 5 nodes. But the fact that decommission ran fine
> on the 3 DC1 nodes without complaining that there were no nodes to stream their data to is
> a little disconcerting.
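The guard this ticket asks for boils down to a per-datacenter count check before streaming begins: if removing the leaving node would leave a DC with fewer live nodes than that DC's configured replication factor, refuse the decommission. A minimal sketch of that check in Python (not Cassandra's actual Java implementation; function and parameter names here are hypothetical):

```python
# Hypothetical sketch of the pre-check proposed in CASSANDRA-12510:
# refuse to decommission a node when doing so would leave a datacenter
# with fewer live nodes than that DC's configured replication factor.

def check_decommission_allowed(leaving_dc, live_nodes_per_dc, rf_per_dc):
    """Return True if decommissioning one node from `leaving_dc` still
    leaves every DC named in the keyspace's replication settings with at
    least RF live nodes."""
    remaining = dict(live_nodes_per_dc)
    remaining[leaving_dc] = remaining.get(leaving_dc, 0) - 1
    # Only DCs that actually hold replicas for the keyspace matter here.
    for dc, rf in rf_per_dc.items():
        if remaining.get(dc, 0) < rf:
            return False
    return True

# Scenario from the report: NetworkTopologyStrategy with DC1=3,
# and DC2 never added to the keyspace's replication.
rf = {"DC1": 3}
nodes = {"DC1": 3, "DC2": 5}

# Decommissioning any DC1 node drops DC1 to 2 < RF 3 -> must be refused.
print(check_decommission_allowed("DC1", nodes, rf))  # False

# A DC2 node is safe to remove: DC2 holds no replicas for this keyspace.
print(check_decommission_allowed("DC2", nodes, rf))  # True
```

With such a check in place, the decommissions in the steps above would fail fast instead of silently discarding the only replicas.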

This message was sent by Atlassian JIRA
