Date: Thu, 20 Jun 2013 19:14:21 +0000 (UTC)
From: "Jason Brown (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-5669) Connection thrashing during multi-region ec2 during upgrade, due to messaging version

     [ https://issues.apache.org/jira/browse/CASSANDRA-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Brown updated CASSANDRA-5669:
-----------------------------------
    Labels: ec2 ec2multiregionsnitch gossip  (was: gossip)

> Connection thrashing during multi-region ec2 during upgrade, due to messaging version
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5669
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.5
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: ec2, ec2multiregionsnitch, gossip
>             Fix For: 1.2.6, 2.0 beta 1
>
>         Attachments: 5669-v1.diff, 5669-v2.diff
>
>
> While debugging the upgrade scenario described in CASSANDRA-5660, I discovered that ITC.close() resets the message protocol version of a peer node that disconnects. CASSANDRA-5660 has a full description of the upgrade path, but basically the Ec2MultiRegionSnitch closes connections on the public IP address in order to reconnect on the private IP, and this causes ITC to drop the message protocol version of previously known nodes. I think we want to hang onto that version so that when the newer node (re-)connects to the older node, it passes the correct protocol version. Otherwise it passes its current version, which is too high for the older node, the connection attempt gets dropped, and we go through the dance again.
> To clarify, the 'thrashing' is at a rather low volume, from what I observed. Anecdotally, perhaps one connection per second gets turned over.
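
The idea of retaining a peer's protocol version across a connection close can be illustrated with a minimal sketch. This is not the code from the attached diffs or from Cassandra itself: the class `PeerVersionCache`, its method names, and the version numbers are all hypothetical, chosen only to contrast the two behaviours described in the ticket (forgetting the version on close versus hanging onto it).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical illustration only; none of these names come from Cassandra.
 * Tracks the messaging protocol version last seen from each peer.
 */
final class PeerVersionCache {
    /** Assumed version spoken by the local (newer) node; the value is illustrative. */
    static final int CURRENT_VERSION = 6;

    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    /** Remember the version a peer announced when it connected. */
    void recordVersion(String peer, int version) {
        versions.put(peer, version);
    }

    /** Version to use when connecting to a peer; falls back to our own if unknown. */
    int versionFor(String peer) {
        return versions.getOrDefault(peer, CURRENT_VERSION);
    }

    /**
     * Behaviour the ticket describes as problematic: closing a connection
     * forgets the peer's version, so the next connection attempt uses
     * CURRENT_VERSION, which an older peer rejects.
     */
    void onCloseResetting(String peer) {
        versions.remove(peer);
    }

    /**
     * Behaviour the ticket proposes: closing a connection leaves the known
     * version in place, so a reconnect (for example on another address of
     * the same peer) uses the version that peer understands.
     */
    void onCloseRetaining(String peer) {
        // Intentionally a no-op: the recorded version is kept.
    }
}
```

Under these assumptions, a peer recorded at version 5 is still contacted at version 5 after `onCloseRetaining`, whereas after `onCloseResetting` the lookup falls back to `CURRENT_VERSION` and the reconnect would be rejected by the older node, which is the loop the ticket calls thrashing.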