cassandra-commits mailing list archives

From "Minh Do (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6702) Upgrading node uses the wrong port in gossiping
Date Fri, 20 Jun 2014 02:05:24 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038266#comment-14038266 ]

Minh Do commented on CASSANDRA-6702:
------------------------------------

If I recall correctly, this happened on C* 1.2 nodes while the cluster was still in mixed
mode and the target nodes were seed nodes (C* 1.1.x).  After a while, gossip seemed to settle
down correctly on the right IPs and ports.  However, this took significant time depending
on the size of the cluster.

> Upgrading node uses the wrong port in gossiping
> -----------------------------------------------
>
>                 Key: CASSANDRA-6702
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6702
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 1.1.7, AWS, Ec2MultiRegionSnitch
>            Reporter: Minh Do
>            Priority: Minor
>             Fix For: 1.2.17
>
>
> When upgrading a node in a 1.1.7 (or 1.1.11) cluster to 1.2.15 and inspecting the gossip
information on port/IP, I could see that the upgrading node (1.2 version) communicates with
one other node in the same region using the public IP and the non-encrypted port.
> For the rest, the upgrading node uses the correct ports and IPs to communicate in this
manner:
>    Same region: private IP and non-encrypted port 
>    and
>    Different region: public IP and encrypted port
> Because there is one node like this (or 2 out of a 12-node cluster in which nodes are
split equally across 2 AWS regions), we have to modify the Security Group to allow the new traffic.
> Without modifying the SG, the 95th and 99th percentile latencies for both reads and writes in the
cluster are very bad (due to RPC timeouts).  Inspecting closer, that upgraded node (the 1.2 node)
is contributing all of the high latencies whenever it acts as a coordinator node.
>  
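The expected routing rule described above (same region: private IP and non-encrypted port; different region: public IP and encrypted port) can be sketched as follows. This is a hypothetical illustration of how Ec2MultiRegionSnitch-style endpoint selection is expected to behave, not Cassandra's actual implementation; the function and parameter names are invented for this sketch, and the port numbers are Cassandra's defaults.

```python
# Hypothetical sketch of the endpoint-selection rule described in the report.
# Not Cassandra's actual code; names are illustrative.

STORAGE_PORT = 7000      # default non-encrypted storage port
SSL_STORAGE_PORT = 7001  # default encrypted (SSL) storage port

def select_endpoint(local_region, peer_region, peer_private_ip, peer_public_ip):
    """Return the (ip, port) a node should use to reach a given peer."""
    if peer_region == local_region:
        # Same region: private IP, non-encrypted port
        return (peer_private_ip, STORAGE_PORT)
    # Different region: public IP, encrypted port
    return (peer_public_ip, SSL_STORAGE_PORT)

# The bug describes an upgraded 1.2 node instead contacting a same-region
# peer over the public IP and the non-encrypted port, which matches
# neither branch above and hence is blocked by the Security Group.
```

Under this sketch, a Security Group only needs to permit private-IP traffic on 7000 within a region and public-IP traffic on 7001 across regions; the misrouted combination (public IP, port 7000) falls outside both rules, which is why the reporter had to widen the SG.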



--
This message was sent by Atlassian JIRA
(v6.2#6252)
