curator-dev mailing list archives

From "Alex Rankin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-392) Zookeeper Ensemble Get Incorrect Address
Date Thu, 09 Mar 2017 19:50:38 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903735#comment-15903735 ]

Alex Rankin commented on CURATOR-392:
-------------------------------------

[~randgalt] - Ah, sorry. I must have got confused with the ZooKeeper board. I'll open a pull request then.

> Zookeeper Ensemble Get Incorrect Address
> ----------------------------------------
>
>                 Key: CURATOR-392
>                 URL: https://issues.apache.org/jira/browse/CURATOR-392
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 3.2.1
>         Environment: ZooKeeper 3.5.1-alpha
>            Reporter: Alex Rankin
>         Attachments: CURATOR-392-01.patch, CURATOR-392-02.patch
>
>
> I've noticed an issue with Curator 3.2.1 that relates to the fix for CURATOR-345 (also reported by me).
> When our services reconnected after losing the connection to ZooKeeper (due to network issues), they would always use the wrong connection string and never manage to reconnect to the ZooKeeper cluster. Assume that 10.1.2.3 is our ZooKeeper server. In two scenarios (with different zoo.cfg files), we saw the following results when a reconnection was attempted:
> {quote}
> *Scenario 1:* ClientCnxn - Opening socket connection to server 0.0.0.0/0.0.0.0:2181.
> *Scenario 2:* ClientCnxn - Opening socket connection to server 10.1.2.3/10.1.2.3:2888.
> {quote}
> Both of these connection strings are wrong. The issue arises in EnsembleTracker.processConfigData() when we reconnect to ZooKeeper. The config coming from ZooKeeper is in [the format|https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html#sc_reconfig_clientport]:
> {quote}
> server.<positive id> = <address1>:<port1>:<port2>\[:role\];\[<client port address>:\]<client port>
> {quote}
> As we can see, both \[:role\] and \[<client port address>:\] are optional. Hence, the following string is perfectly valid:
> {quote}
> server.1=10.1.2.3:2888:3888:participant;2181
> {quote}
> When ZooKeeper sends this, it defaults the client address to 0.0.0.0, so we retrieve the following value in EnsembleTracker:
> {quote}
> server.1=10.1.2.3:2888:3888:participant;0.0.0.0:2181
> {quote}
> The resulting connection string, therefore, turns into 0.0.0.0:2181 instead of 10.1.2.3:2181, and Curator creates a new ZooKeeper instance to connect to that IP, which obviously never works.
> In the second scenario, our connection string looks a bit different. It does not follow the documented format, but it is accepted as valid:
> {quote}
> server.1=10.1.2.3:2888:3888:participant
> {quote}
> Now, this is missing the client port and address. That means that the resulting string from the EnsembleTracker is 10.1.2.3:2888, which isn't desired. Including the client port would just lead to Scenario 1 above.
> From what I can see, the EnsembleTracker.configToConnectionString() method is the issue here:
> {code}
> InetSocketAddress address = Objects.firstNonNull(server.clientAddr, server.addr);
> sb.append(address.getAddress().getHostAddress()).append(":").append(address.getPort());
> {code}
> In the above cases, neither server.addr nor server.clientAddr gives the right result on its own. We also prefer the value of clientAddr for some reason, which doesn't look right to me (given that it can be 0.0.0.0 or 127.0.0.1).
> It seems to me that Curator should use server.addr.getAddress().getHostAddress() with server.clientAddr.getPort(). When the clientAddr is missing, however, I'm not sure what should be done.
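
The suggestion in the issue description (take the host from the quorum address and the port from the client address) could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions, not the attached patch and not Curator's actual code: the class ConnectionStringSketch and the method toHostPort are hypothetical names, the two parameters stand in for the server.addr and server.clientAddr fields, and returning an empty Optional when the client address is missing is just a placeholder, since the report leaves that case open.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.Optional;

// Hypothetical helper, not part of Curator or ZooKeeper.
class ConnectionStringSketch {

    /**
     * Builds a "host:port" entry from the quorum address (addr) and the
     * client address (clientAddr) of one server in the ensemble config.
     * The host always comes from addr; the port always comes from clientAddr.
     */
    static Optional<String> toHostPort(InetSocketAddress addr, InetSocketAddress clientAddr) {
        if (clientAddr == null) {
            // Scenario 2: the config has no client port at all.
            // Assumption: report "unknown" instead of guessing a port.
            return Optional.empty();
        }
        InetAddress ip = addr.getAddress();
        // Fall back to the literal host string if the address is unresolved.
        String host = (ip != null) ? ip.getHostAddress() : addr.getHostString();
        return Optional.of(host + ":" + clientAddr.getPort());
    }
}
```

For Scenario 1 (quorum address 10.1.2.3:2888, client address 0.0.0.0:2181), this sketch yields 10.1.2.3:2181 rather than 0.0.0.0:2181. A caller would still need to decide what to do with the empty result in Scenario 2, for example keeping the previously configured connection string.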



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
