Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Fri, 20 Nov 2015 21:32:11 +0000 (UTC)
From: "Paulo Motta (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12861117.1441140585000.142075.1448055131157@Atlassian.JIRA>
In-Reply-To: <JIRA.12861117.1441140585000@Atlassian.JIRA>
References: <JIRA.12861117.1441140585000@Atlassian.JIRA>
 <JIRA.12861117.1441140585593@arcas>
Subject: [jira] [Commented] (CASSANDRA-10243) Warn or fail when changing
 cluster topology live
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018849#comment-15018849 ] 

Paulo Motta commented on CASSANDRA-10243:
-----------------------------------------

Finished the second part of the review and don't have much to add beyond my previous comments. The dtest and unit test suite is very nice and comprehensive, congratulations!

I wasn't very familiar with PropertyFileSnitch and YamlPropertyFileSnitch, so those took a bit longer to review, especially the default rack/dc handling. I don't see much point in keeping PropertyFileSnitch around (and having to maintain it), given that you can achieve the same, and even more, in a much simpler way with GossipingPropertyFileSnitch, so I created CASSANDRA-10745 to deprecate PropertyFileSnitch.

While this is very well tested, and CASSANDRA-10242 and CASSANDRA-9474 don't make much sense without this patch, it is quite a bit of code to add at the end of 2.1. I'll leave it to the committer to decide whether this should go into 2.1, but I guess it should be OK.

Addressing your previous comments:

bq. my preference would be to leave existing code unchanged, especially if this goes to 2.1, but I am not opposed to simplifying the new liveliness check for the snitch to what you suggested

+1

bq. I don't see why wait for up to 60 seconds before reloading a config file, 5 seconds is a pretty long time and it should not have any adverse impact.

This file is rarely changed, and now even less so, so 60 seconds is more than enough, but if lowering it makes testing easier I guess it should be fine.

bq. maybe we should never allow changing dc/rack for GPFS, or remove the config reload altogether as suggested in 

+1, we should keep GPFS as simple as possible, and I don't see much sense in reloading only prefer_local. You could probably just reuse the [patch|https://issues.apache.org/jira/secure/attachment/12738530/cassandra-2.1-9474.patch] from CASSANDRA-9474, which is ready.

bq. Should we add a JVM property to override the liveliness checks, just as a safety measure in case someone has a legitimate reason to change rack/dc of a live node?

I don't see a legitimate reason to change the rack/dc of a live node, and restarting the node in this case shouldn't be a big deal, so it's better to avoid adding new properties IMO.
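
To make the rule concrete, here is a minimal sketch of a liveness check with no override switch. It is only an illustration, not the code from the patch: the names Location, TopologyChangeError and validate_topology_change are hypothetical, and Python is used just to keep it short.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Datacenter and rack of a node (hypothetical type, for illustration)."""
    dc: str
    rack: str


class TopologyChangeError(Exception):
    """Raised when the dc/rack of a live node would change."""


def validate_topology_change(current: Location, proposed: Location,
                             is_alive: bool) -> Location:
    """Return the location to use, refusing dc/rack changes for live nodes."""
    if proposed == current:
        # Nothing changed, so there is nothing to check.
        return current
    if is_alive:
        # Deliberately no property to bypass this: a restart is required.
        raise TopologyChangeError(
            f"refusing to move live node from {current.dc}/{current.rack} "
            f"to {proposed.dc}/{proposed.rack}; a restart is required"
        )
    return proposed
```

With this shape, an unchanged location always passes, a change for a node that is down is accepted, and a change for a live node fails loudly instead of being applied silently.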

Good job!

> Warn or fail when changing cluster topology live
> ------------------------------------------------
>
>                 Key: CASSANDRA-10243
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10243
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 2.1.x
>
>
> Moving a node from one rack to another in the snitch, while it is alive, is almost always the wrong thing to do.

