cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5171) Save EC2Snitch topology information in system table
Date Tue, 09 Jul 2013 15:15:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703203#comment-13703203
] 

Jason Brown edited comment on CASSANDRA-5171 at 7/9/13 3:14 PM:
----------------------------------------------------------------

While this patch (v1 actually) was reverted in CASSANDRA-5432, it was never satisfactorily
explained why the patch failed to work as expected. I'm adding details here so we can get this
ticket done right :).

First it's helpful to explore how a node can start gossip in EC2MRS with inter-DC (inter-region)
enabled (and a Priam-type setup).

# ec2 instance is started, Priam comes up first and adds publicIP/sslPort to the security
group's ingress privileges (so this node can accept connections on its publicIP/sslPort from
anywhere). 
# c* starts, and gets seed node public hostnames from Priam
# gossip to one of the seeds - the public hostname will resolve to the node's public IP addr.
# When OTC goes to write the first message to the seed, it gets a socket from OTCP.newSocket().
newSocket() calls isEncryptedChannel() to determine if we need to encrypt the data on the
wire. As we don't know anything yet about the seed node (remember we haven't started gossip
with anyone yet), isEncryptedChannel() will always return true when the following are true:
## internode_encryption != none
## we don't know the DC or RACK info for the remote node (which is the case when using the
EC2MRS). This step is a little funky as OTCP calls the snitch for the seed's DC/RACK, to which
EC2MRS will return UNKNOWN-DC/UNKNOWN-RACK, which just happens to not match a value like
"us-east-1" (the current node's DC). 
# create the socket using remote node's publicIP addr on the SSL port.
# create the connection and send messages successfully, assuming you've opened the SSL
port for public addresses on the security group (which Priam handles).

Thus, if we are connecting to a node in the same EC2 region, we connect on the publicIP (as
expected) but use the SSL port.
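
The decision described in the steps above can be modelled roughly as follows. This is a standalone sketch, not the Cassandra source: the class name, the string endpoint keys and the `known` map (a stand-in for what the snitch has learned via gossip) are all made up for illustration. Only the UNKNOWN-DC/UNKNOWN-RACK placeholders and the internode_encryption setting come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

class EncryptionDecisionSketch
{
    // Models the internode_encryption setting.
    enum InternodeEncryption { NONE, ALL, DC, RACK }

    static final String UNKNOWN_DC = "UNKNOWN-DC";
    static final String UNKNOWN_RACK = "UNKNOWN-RACK";

    // Hypothetical stand-in for the snitch: endpoint -> {dc, rack},
    // filled in as gossip learns about nodes.
    static final Map<String, String[]> known = new HashMap<>();

    static String datacenter(String endpoint)
    {
        String[] location = known.get(endpoint);
        return location == null ? UNKNOWN_DC : location[0];
    }

    static String rack(String endpoint)
    {
        String[] location = known.get(endpoint);
        return location == null ? UNKNOWN_RACK : location[1];
    }

    // Returns true when traffic to the remote endpoint should use the SSL port.
    static boolean isEncryptedChannel(InternodeEncryption mode, String local, String remote)
    {
        switch (mode)
        {
            case NONE:
                return false;
            case DC:
                // Same DC: no encryption needed.
                if (datacenter(remote).equals(datacenter(local)))
                    return false;
                break;
            case RACK:
                // Same DC and same rack: no encryption needed.
                if (datacenter(remote).equals(datacenter(local)) && rack(remote).equals(rack(local)))
                    return false;
                break;
            default:
                break; // ALL: always encrypt
        }
        return true;
    }

    public static void main(String[] args)
    {
        known.put("local", new String[]{ "us-east-1", "1a" });
        // Seed not yet known: UNKNOWN-DC does not match us-east-1, so we encrypt.
        System.out.println(isEncryptedChannel(InternodeEncryption.DC, "local", "seed")); // true
        // After gossip tells us the seed is in the same DC, encryption is skipped.
        known.put("seed", new String[]{ "us-east-1", "1b" });
        System.out.println(isEncryptedChannel(InternodeEncryption.DC, "local", "seed")); // false
    }
}
```

The point of the sketch is that the "encrypt" answer on first contact is a side effect of the placeholder DC not matching, rather than an explicit rule.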

After we learn, via gossip, about a remote node's DC/RACK/localIP, we can choose to reconnect
to nodes in the same region on the localIP/nonSSLPort.

Vijay's patch had problems here because on restart, we would already know
the DC/RACK from the previous execution of c* on this node, and the check in OTCP.isEncryptedChannel()
returns false (do not use encryption), so we choose to use the non-SSL port when creating
a connection to the publicIP. Thus the connection creation ultimately fails because the non-SSL
port is not opened for traffic on the security group for the public IP (nor should it be).
EDIT: The other part of the problem is that we start the connection on the publicIP rather
than localIP (INTERNAL_IP) even if we already have the localIP.
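
To make the failure concrete, here is a small standalone model of the three cases (first start, restart with saved topology, restart connecting on the localIP). Everything in it is an assumption for illustration: the class and method names are hypothetical, the ports are the stock storage_port/ssl_storage_port defaults (7000/7001), and the security group is reduced to "public addresses accept only the SSL port, private addresses accept either".

```java
class RestartConnectionSketch
{
    // Default storage_port and ssl_storage_port values; a real cluster may differ.
    static final int STORAGE_PORT = 7000;
    static final int SSL_STORAGE_PORT = 7001;

    // Port selection follows the encryption decision.
    static int portFor(boolean encrypted)
    {
        return encrypted ? SSL_STORAGE_PORT : STORAGE_PORT;
    }

    // Hypothetical security-group model: only the SSL port is opened on
    // public addresses; private (same-region) addresses accept both ports.
    static boolean reachable(boolean publicAddress, int port)
    {
        return !publicAddress || port == SSL_STORAGE_PORT;
    }

    public static void main(String[] args)
    {
        // First start: DC unknown, so encrypted, publicIP on the SSL port.
        System.out.println(reachable(true, portFor(true)));   // true

        // Restart with saved DC/RACK: not encrypted, but still the publicIP.
        System.out.println(reachable(true, portFor(false)));  // false

        // Restart, connecting on the localIP instead: non-SSL port works.
        System.out.println(reachable(false, portFor(false))); // true
    }
}
```

The middle case is the one the reverted patch hit: the saved topology changes the port choice, but not the address.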

To make this patch work then, I think getting the localIP address in the OTCP's ctor would
work the best. Code would look something like this:

{code}
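    OutboundTcpConnectionPool(InetAddress remoteEp)
    {
        // Default to the address we were given (the publicIP under EC2MRS).
        InetAddress target = remoteEp;
        EndpointState epState = Gossiper.instance.getEndpointStateForEndpoint(remoteEp);
        if (epState != null
            && epState.getApplicationState(ApplicationState.INTERNAL_IP) != null
            && epState.getApplicationState(ApplicationState.DC) != null
            && epState.getApplicationState(ApplicationState.DC).value.equals(snitch.getDatacenter(FBUtilities.getBroadcastAddress())))
        {
            // Same DC and the localIP is already known: prefer it.
            try
            {
                // Application states hold VersionedValues; parse the string into an address.
                target = InetAddress.getByName(epState.getApplicationState(ApplicationState.INTERNAL_IP).value);
            }
            catch (UnknownHostException e) // needs java.net.UnknownHostException imported
            {
                // Could not parse the stored localIP; keep the address we were given.
            }
        }
        id = target;

        cmdCon = new OutboundTcpConnection(this);
        cmdCon.start();
        ackCon = new OutboundTcpConnection(this);
        ackCon.start();

        metrics = new ConnectionMetrics(id, this);
    }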
{code}

Then you would connect on the localIP addr with the correct port (SSL or non-SSL).
                
> Save EC2Snitch topology information in system table
> ---------------------------------------------------
>
>                 Key: CASSANDRA-5171
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.1
>         Environment: EC2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Critical
>             Fix For: 1.2.7
>
>         Attachments: 0001-CASSANDRA-5171.patch, 0001-CASSANDRA-5171-v2.patch
>
>
> EC2Snitch currently waits for the Gossip information to understand the cluster information
> every time we restart. It will be nice to use already available system table info similar
> to GPFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
