From: "Paulo Motta (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Mon, 12 Dec 2016 19:44:58 +0000 (UTC)
Subject: [jira] [Commented] (CASSANDRA-12510) Disallow decommission when number of replicas will drop below configured RF

    [ https://issues.apache.org/jira/browse/CASSANDRA-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742938#comment-15742938 ]

Paulo Motta commented on CASSANDRA-12510:
-----------------------------------------

It seems you only updated {{stop_decommission_too_few_replicas_multi_dc_test}}, but a few other tests are broken by this change ([simple_decommission_test|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.X-12510-dtest/lastCompletedBuild/testReport/topology_test/TestTopology/simple_decommission_test/], [add_and_remove_node_test|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.X-12510-dtest/lastCompletedBuild/testReport/pushed_notifications_test/TestPushedNotifications/add_and_remove_node_test/], etc.) ([full list|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.X-12510-dtest/lastCompletedBuild/testReport/]).

> Disallow decommission when number of replicas will drop below configured RF
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12510
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12510
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>         Environment: C* version 3.3
>            Reporter: Atin Sood
>            Assignee: Kurt Greaves
>           Priority: Minor
>              Labels: lhf
>         Attachments: 12510-3.x.patch
>
>
> Steps to replicate:
> - Create a 3-node cluster in DC1 and create a keyspace test_keyspace with a table test_table, using replication strategy NetworkTopologyStrategy with DC1=3. Populate some data into this table.
> - Add 5 more nodes to this cluster, but in DC2. Do not alter the keyspace to add the new DC2 to its replication settings (this is intentional, and it is why the bug shows up).
>   The output of desc keyspace should therefore still list NetworkTopologyStrategy with DC1=3 as the RF.
> - As expected, this is now an 8-node cluster with 3 nodes in DC1 and 5 in DC2.
> - Now start decommissioning the nodes in DC1. The decommission runs fine on all 3 nodes, but since the new nodes are in DC2 and the keyspace's RF is restricted to DC1, the 5 new nodes never receive any of the data.
> - You end up with a 5-node cluster that holds none of the data from the 3 decommissioned nodes, i.e. data loss.
> I do understand that this problem could have been avoided by altering the keyspace to add DC2 replication before adding the 5 nodes (a CQL sketch of those statements follows below). But the fact that decommission ran fine on the 3 DC1 nodes, without complaining that there were no nodes to stream their data to, is a little disconcerting.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
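
For reference, a minimal CQL sketch of the keyspace settings involved in the reproduction steps above. The statements themselves are not from the ticket; the keyspace and table names come from the steps, and the DC2 replication factor of 3 in the ALTER is an assumption for illustration:

{code}
-- Setup described in the reproduction steps: replication is defined for DC1 only,
-- so nodes later added in DC2 never own replicas of this keyspace.
CREATE KEYSPACE test_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

-- The step the reporter notes would have avoided the data loss: extend replication
-- to DC2 *before* decommissioning the DC1 nodes (the DC2 RF of 3 is illustrative).
ALTER KEYSPACE test_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};
{code}

After such an ALTER, the usual procedure is to run {{nodetool rebuild DC1}} on each DC2 node so the existing data is streamed over, and only then run {{nodetool decommission}} on the DC1 nodes.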