Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Fri, 24 Jan 2014 19:26:42 +0000 (UTC)
From: "Brandon Williams (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12689045.1389806199922.8637.1390591602112@arcas>
In-Reply-To: <JIRA.12689045.1389806199922@arcas>
References: <JIRA.12689045.1389806199922@arcas>
Subject: [jira] [Commented] (CASSANDRA-6590) Gossip does not heal after a
 temporary partition at startup
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881339#comment-13881339 ] 

Brandon Williams commented on CASSANDRA-6590:
---------------------------------------------

Never mind, the conflict was trivial.  The patch works, but it causes a flap during initial startup and then repeats the UP message when the partition heals:

{noformat}
 INFO 19:21:10,451 Node cassandra-3/10.179.111.137 state jump to normal
 INFO 19:21:10,472 Startup completed! Now serving reads.
 INFO 19:21:10,475 waiting for gossip to settle before accepting client requests...
 INFO 19:21:10,660 Handshaking version with cassandra-1/10.179.65.102
 INFO 19:21:11,633 Node /10.179.64.227 is now part of the cluster
 INFO 19:21:11,635 InetAddress /10.179.64.227 is now DOWN
 INFO 19:21:11,706 Node /10.179.65.102 is now part of the cluster
 INFO 19:21:11,707 Handshaking version with cassandra-1/10.179.65.102
 INFO 19:21:11,743 InetAddress /10.179.65.102 is now UP
 INFO 19:21:12,639 InetAddress /10.179.65.102 is now DOWN
 INFO 19:21:12,644 Handshaking version with cassandra-1/10.179.65.102
 INFO 19:21:12,648 InetAddress /10.179.65.102 is now UP
 INFO 19:21:18,476 gossip settled after 0 extra polls; proceeding
 INFO 19:21:18,589 Starting listening for CQL clients on cassandra-3/10.179.111.137:9042...
 INFO 19:21:18,657 Using TFramedTransport with a max frame size of 15728640 bytes.
 INFO 19:21:18,660 Binding thrift service to cassandra-3/10.179.111.137:9160
 INFO 19:21:18,672 Using synchronous/threadpool thrift server on cassandra-3 : 9160
 INFO 19:21:18,673 Listening for thrift clients...
 INFO 19:22:02,853 Handshaking version with /10.179.64.227
 INFO 19:22:03,844 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,845 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,846 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,844 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,859 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,860 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,860 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,859 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,859 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,861 InetAddress /10.179.64.227 is now UP
 INFO 19:22:03,860 InetAddress /10.179.64.227 is now UP
{noformat}


> Gossip does not heal after a temporary partition at startup
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-6590
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Vijay
>             Fix For: 2.0.5
>
>         Attachments: 0001-CASSANDRA-6590.patch, 6590_disable_echo.txt
>
>
> See CASSANDRA-6571 for background.  If a node is partitioned on startup when the echo command is sent, but then the partition heals, the halves of the partition will never mark each other up despite being able to communicate.  This stems from CASSANDRA-3533.
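
The failure mode described above can be illustrated with a toy model. This is only a sketch of one reading of the description: it is not Cassandra's gossip code, and it does not represent the attached patch. `Network`, `Node`, `gossip_round`, `simulate` and the `retry_echo` flag are all hypothetical names invented for the illustration. It assumes that a peer is marked up only after an echo round trip succeeds, and it compares a node that sends the echo once with a node that sends it again on later rounds.

```python
from dataclasses import dataclass


@dataclass
class Network:
    """Toy network: delivers a message only while it is not partitioned."""
    partitioned: bool = False

    def deliver(self) -> bool:
        return not self.partitioned


@dataclass
class Node:
    """Toy node that marks its peer up only after an echo round trip succeeds."""
    retry_echo: bool          # hypothetical switch: resend the echo on later rounds
    peer_alive: bool = False
    echo_sent: bool = False

    def gossip_round(self, net: Network) -> None:
        if self.peer_alive:
            return
        # One-shot variant: once an echo has gone out, never send another.
        if self.echo_sent and not self.retry_echo:
            return
        self.echo_sent = True
        if net.deliver():
            self.peer_alive = True


def simulate(retry_echo: bool, partitioned_rounds: int = 3, healed_rounds: int = 3) -> bool:
    """Start partitioned, heal the partition, and report whether the peer is marked up."""
    net = Network(partitioned=True)
    node = Node(retry_echo=retry_echo)
    for _ in range(partitioned_rounds):
        node.gossip_round(net)
    net.partitioned = False  # the partition heals
    for _ in range(healed_rounds):
        node.gossip_round(net)
    return node.peer_alive
```

In this model `simulate(retry_echo=False)` returns `False`, because the only echo was lost during the partition, while `simulate(retry_echo=True)` returns `True` once the partition heals. The sketch does not attempt to model the startup flap or the repeated UP messages shown in the log above.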


--
This message was sent by Atlassian JIRA
(v6.1.5#6160)