Date: Tue, 29 Apr 2014 20:55:15 +0000 (UTC)
From: "Ryan McGuire (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-7109) Create replace_address dtest

    [ https://issues.apache.org/jira/browse/CASSANDRA-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984772#comment-13984772 ]

Ryan McGuire commented on CASSANDRA-7109:
-----------------------------------------

Shawn,

The replace_address feature of Cassandra is a JVM parameter that tells a node it is taking over for a node that is no longer part of the cluster.
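For reference, the parameter is passed as a JVM system property. Below is a minimal sketch of how the flag could be assembled before starting the replacement node. It assumes the property name `cassandra.replace_address`; the helper `replace_address_jvm_arg` is hypothetical and is not part of Cassandra, ccm, or the dtest suite.

```python
import ipaddress


def replace_address_jvm_arg(dead_node_ip: str) -> str:
    """Build the JVM argument naming the node being replaced.

    Hypothetical helper for illustration only. The property name
    cassandra.replace_address is an assumption stated in the lead-in.
    """
    # ip_address() raises ValueError on malformed input, so a typo in the
    # address is caught before a node is ever started with it.
    ip = ipaddress.ip_address(dead_node_ip)
    return f"-Dcassandra.replace_address={ip}"


if __name__ == "__main__":
    # node3 in the ccm walkthrough listens on 127.0.0.3
    print(replace_address_jvm_arg("127.0.0.3"))
```

Validating the address up front is a design choice for the sketch: it only catches malformed input, and says nothing about whether the address belongs to a dead, live, or nonexistent node, which are the cases the dtest needs to cover.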
Here's a quick set of tests I ran on the command line with ccm:

{code}
# Create a 3 node cluster:
ccm create -v git:cassandra-2.1 -n 3 -s test

# Create some data on it, with a replication factor of 3:
ccm node1 stress -- write n=10000 -schema replication\(factor=3\)

# Kill -9 node 3
ccm node3 stop --not-gently

# Connect to node1 to test the data:
ccm node1 cqlsh -- --cqlversion=3.1.6 localhost 9042
cqlsh> CONSISTENCY one;
cqlsh> select * from "Keyspace1"."Standard1" LIMIT 1;
cqlsh> CONSISTENCY two;
cqlsh> select * from "Keyspace1"."Standard1" LIMIT 1;
cqlsh> CONSISTENCY all;
cqlsh> select * from "Keyspace1"."Standard1" LIMIT 1;
# The last query there fails because it cannot get all replicas.

# Add a new node to the cluster:
ccm add node4 -i 127.0.0.4 -j 7400 -b

# Tell it that it is replacing node3:
ccm node4 start --replace-address=127.0.0.3
ccm node1 cqlsh -- --cqlversion=3.1.6 localhost 9042
cqlsh> CONSISTENCY all;
cqlsh> select * from "Keyspace1"."Standard1" LIMIT 1;
# Query works again.
{code}

As Tyler Hobbs mentioned in the chat snippet in the issue description, we need to test replacing a node that's down, replacing a node that is still active (which hopefully shouldn't work), and replacing a node that doesn't exist.

> Create replace_address dtest
> ----------------------------
>
>                 Key: CASSANDRA-7109
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7109
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Ryan McGuire
>            Assignee: Shawn Kumar
>
> {noformat}
> 16:03 < driftx> well, this just bothers me because either it's been broken for almost ever, or something
> broke in cassandra.
> 16:43 < thobbs> driftx: I'm testing your patch on #6622, but I'm seeing a bit of a weird error:
> 16:43 < CassBotJr> https://issues.apache.org/jira/browse/CASSANDRA-6622 (Unresolved; 1.2.16, 2.0.6):
> "Streaming session failures during node replace of same address"
> 16:43 < thobbs> java.lang.UnsupportedOperationException: Cannot replace token -1017822742317066613 which does
> not exist!
> 16:44 < thobbs> this is on 2.0 with the patch applied
> 16:44 < driftx> O_o
> 16:44 < thobbs> I'm just stopping a ccm node, clearing it, then starting with replace_address (auto_bootstrap
> = true, not a seed, initial_tokens is null)
> 16:45 < driftx> oh, I'm stupid, hang on
> 16:47 < rcoli> is the sum of that that replace_* is still broken in 2.0 ?
> 16:47 < rcoli> err, 1.2?
> 16:48 < driftx> thobbs: updated the patch
> 16:48 < thobbs> rcoli: only for replacing the same address
> 16:48 < rcoli> is there another case?
> 16:49 < driftx> replacing with a different address.
> 16:49 < rcoli> oh, right, _address_
> 16:49 < rcoli> I'm still modeling this as replace _token_
> 16:49 < rcoli> in my brain
> 16:49 < driftx> same address never broke for me though, so you can probably just retry
> 16:55 < thobbs> can we add a dtest for replace_address coverage? It's kind of annoying to test manually and
> we've managed to break it a few times
> 16:56 < thobbs> I have a PR against ccm open to add replace_address support:
> https://github.com/pcmanus/ccm/pull/85
> 16:57 < driftx> I could have sworn we had one
> 16:58 < driftx> we do but it's using replace_token so probably not even running now
> 16:58 < thobbs> yeah
> 16:58 < thobbs> it would be nice to cover replacing the same address, another address, and expected failures
> like replacing a still-live node
> 16:59 < driftx> +1
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)