Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Tue, 9 Jul 2013 15:15:49 +0000 (UTC)
From: "Jason Brown (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12628167.1358495027992.18675.1373382949364@arcas>
In-Reply-To: <JIRA.12628167.1358495027992@arcas>
References: <JIRA.12628167.1358495027992@arcas>
Subject: [jira] [Comment Edited] (CASSANDRA-5171) Save EC2Snitch topology
 information in system table
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703203#comment-13703203 ] 

Jason Brown edited comment on CASSANDRA-5171 at 7/9/13 3:14 PM:
----------------------------------------------------------------

While this patch (v1, actually) was reverted in CASSANDRA-5432, it was never satisfactorily explained why the patch failed to work as expected. I'm adding details here so we can get this ticket done right :).

First it's helpful to explore how a node can start gossip in EC2MRS with inter-DC (inter-region) enabled (and a Priam-type setup).

# ec2 instance is started, Priam comes up first and adds publicIP/sslPort to the security group's ingress privileges (so this node can accept connections on its publicIP/sslPort from anywhere).
# c* starts, and gets seed node public hostnames from Priam.
# gossip to one of the seeds - the public hostname will resolve to that seed's public IP addr.
# When OTC goes to write the first message to the seed, it gets a socket from OTCP.newSocket(). newSocket() calls isEncryptedChannel() to determine if we need to encrypt the data on the wire. As we don't know anything yet about the seed node (remember we haven't started gossip yet with anyone), isEncryptedChannel() will always return true when the following are true:
## internode_encryption != none
## we don't know the DC or RACK info for the remote node (which is the case when using the EC2MRS). This step is a little funky as OTCP calls the snitch for the seed's DC/RACK, to which EC2MRS will return UNKNOWN-DC/UNKNOWN-RACK, which will just happen to not match a value like "us-east-1" (the current node's DC).
# create the socket using the remote node's publicIP addr on the SSL port.
# create the connection and send messages successfully, assuming you've opened the SSL port for public addresses on the security group (which Priam handles).

Thus, if we are connecting to a node in the same EC2 region, we connect on the publicIP (as expected) but use the SSL port.

After we learn, via gossip, about a remote node's DC/RACK/localIP, we can choose to reconnect to nodes in the same region on the localIP/nonSSLPort.
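
The port decision walked through above can be modeled in a few lines. This is only a simplified sketch, not the Cassandra source: the class, the method names other than isEncryptedChannel(), and the string-keyed map standing in for the snitch are all hypothetical, the "rack" encryption mode is left out, and 7000/7001 are assumed to be the configured storage_port/ssl_storage_port (Cassandra's defaults).

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the port decision described above.
// It is not the Cassandra source; names and structure are illustrative only.
final class EncryptionDecisionSketch
{
    // Only the modes needed for the walkthrough; "rack" is omitted for brevity.
    enum InternodeEncryption { NONE, ALL, DC }

    static final String UNKNOWN_DC = "UNKNOWN-DC";
    static final int STORAGE_PORT = 7000;     // assumed storage_port (Cassandra default)
    static final int SSL_STORAGE_PORT = 7001; // assumed ssl_storage_port (Cassandra default)

    private final Map<String, String> dcByEndpoint = new HashMap<>();
    private final String localDc;
    private final InternodeEncryption mode;

    EncryptionDecisionSketch(String localDc, InternodeEncryption mode)
    {
        this.localDc = localDc;
        this.mode = mode;
    }

    // Called when gossip (or a saved system table) tells us a peer's DC.
    void learnDatacenter(String endpoint, String dc)
    {
        dcByEndpoint.put(endpoint, dc);
    }

    // Stand-in for the snitch: unknown peers map to a placeholder DC.
    String datacenterOf(String endpoint)
    {
        return dcByEndpoint.getOrDefault(endpoint, UNKNOWN_DC);
    }

    boolean isEncryptedChannel(String endpoint)
    {
        switch (mode)
        {
            case NONE:
                return false;
            case DC:
                // The placeholder never equals a real DC name, so unknown peers are encrypted.
                return !datacenterOf(endpoint).equals(localDc);
            default:
                return true; // ALL
        }
    }

    int portFor(String endpoint)
    {
        return isEncryptedChannel(endpoint) ? SSL_STORAGE_PORT : STORAGE_PORT;
    }

    public static void main(String[] args)
    {
        EncryptionDecisionSketch node = new EncryptionDecisionSketch("us-east-1", InternodeEncryption.DC);
        String seed = "203.0.113.10"; // documentation-range address standing in for a seed's public IP
        System.out.println("before gossip: " + node.portFor(seed));
        node.learnDatacenter(seed, "us-east-1");
        System.out.println("after gossip:  " + node.portFor(seed));
    }
}
```

In this model the seed is reached on the SSL port before anything is known about it, and on the non-SSL port once its DC is known to match the local one.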

Vijay's patch had problems here because, on restart, we would already know the DC/RACK from the previous execution of c* on this node, so the check in OTCP.isEncryptedChannel() returns false (do not use encryption) and we choose the non-SSL port when creating a connection to the publicIP. The connection attempt ultimately fails because the non-SSL port is not opened for traffic on the security group for the public IP (nor should it be). EDIT: The other part of the problem is that we start the connection on the publicIP rather than the localIP (INTERNAL_IP) even if we already have the localIP.
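
To make the failure mode concrete, here is a tiny model of which (address, port) combinations get through the security group. It is a sketch under assumptions: the class and method are hypothetical, the ports are Cassandra's defaults, and the rule that same-region traffic on the private IP may use either port is inferred from the description rather than stated outright.

```java
// Hypothetical model of the ingress rules described in this comment.
final class RestartFailureSketch
{
    static final int STORAGE_PORT = 7000;     // assumed non-SSL storage_port
    static final int SSL_STORAGE_PORT = 7001; // assumed ssl_storage_port

    // Returns true if a connection to the given kind of address and port is allowed in.
    static boolean reachable(boolean viaPublicIp, int port)
    {
        if (viaPublicIp)
        {
            // Only the SSL port is opened on the public IP.
            return port == SSL_STORAGE_PORT;
        }
        // Assumption: same-region traffic on the private IP may use either port.
        return port == STORAGE_PORT || port == SSL_STORAGE_PORT;
    }

    public static void main(String[] args)
    {
        // First boot: DC unknown, so encryption is chosen: public IP + SSL port.
        System.out.println("first boot:   " + reachable(true, SSL_STORAGE_PORT));
        // Restart with saved topology: same DC is known, so no encryption: public IP + non-SSL port.
        System.out.println("restart (v1): " + reachable(true, STORAGE_PORT));
        // Connecting on the private IP instead makes the non-SSL port usable.
        System.out.println("private IP:   " + reachable(false, STORAGE_PORT));
    }
}
```

Only the middle case is rejected, which is the combination the reverted patch ended up choosing on restart.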

To make this patch work, then, I think getting the localIP address in OTCP's ctor would work best. Code would look something like this:

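{code}
    OutboundTcpConnectionPool(InetAddress remoteEp) throws UnknownHostException
    {
        // Prefer the gossiped private address when the peer is in our own DC.
        EndpointState epState = Gossiper.instance.getEndpointStateForEndpoint(remoteEp);
        VersionedValue internalIp = epState == null ? null : epState.getApplicationState(ApplicationState.INTERNAL_IP);
        VersionedValue remoteDc = epState == null ? null : epState.getApplicationState(ApplicationState.DC);
        String localDc = snitch.getDatacenter(FBUtilities.getBroadcastAddress());

        if (internalIp != null && remoteDc != null && remoteDc.value.equals(localDc))
        {
            // getApplicationState() returns a VersionedValue, so parse its string value.
            id = InetAddress.getByName(internalIp.value);
        }
        else
        {
            // Nothing usable from gossip yet: fall back to the address we were given.
            id = remoteEp;
        }

        cmdCon = new OutboundTcpConnection(this);
        cmdCon.start();
        ackCon = new OutboundTcpConnection(this);
        ackCon.start();

        metrics = new ConnectionMetrics(id, this);
    }
{code}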

Then you would connect on the localIP addr with the correct port (SSL or non-SSL).
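
As a standalone illustration of that last point, the address and port choice could be paired as below. This is a hypothetical sketch, not part of any patch: addresses are plain strings, internode_encryption is assumed to be "dc", the ports are Cassandra's defaults, and keeping the SSL port whenever we fall back to the public IP is an extra assumption of the sketch that the proposal above does not spell out.

```java
// Hypothetical sketch of pairing the address choice with the port choice.
final class TargetSelectionSketch
{
    static final int STORAGE_PORT = 7000;     // assumed non-SSL storage_port
    static final int SSL_STORAGE_PORT = 7001; // assumed ssl_storage_port

    // internalIp and remoteDc may be null when nothing is known about the peer yet.
    static String chooseTarget(String publicIp, String internalIp, String remoteDc, String localDc)
    {
        boolean sameDc = remoteDc != null && remoteDc.equals(localDc);
        boolean useInternal = sameDc && internalIp != null;

        String address = useInternal ? internalIp : publicIp;
        // Sketch assumption: only drop to the non-SSL port when using the private address,
        // since the public IP accepts the SSL port only.
        int port = useInternal ? STORAGE_PORT : SSL_STORAGE_PORT;
        return address + ":" + port;
    }

    public static void main(String[] args)
    {
        // Same DC and private address known: private IP on the non-SSL port.
        System.out.println(chooseTarget("203.0.113.10", "10.0.0.5", "us-east-1", "us-east-1"));
        // Same DC but no private address yet: stay on the public IP with SSL.
        System.out.println(chooseTarget("203.0.113.10", null, "us-east-1", "us-east-1"));
        // Different DC: public IP with SSL.
        System.out.println(chooseTarget("203.0.113.10", "10.0.0.5", "eu-west-1", "us-east-1"));
    }
}
```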
                
> Save EC2Snitch topology information in system table
> ---------------------------------------------------
>
>                 Key: CASSANDRA-5171
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>         Environment: EC2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Critical
>             Fix For: 1.2.7
>
>         Attachments: 0001-CASSANDRA-5171.patch, 0001-CASSANDRA-5171-v2.patch
>
>
> EC2Snitch currently waits for gossip to learn the cluster topology every time we restart. It would be nice to use the already available system table info, similar to GPFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira