zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address
Date Fri, 17 Feb 2017 21:05:41 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872521#comment-15872521
] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
-------------------------------------------

Github user eribeiro commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/173#discussion_r101845264
  
    --- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java ---
    @@ -181,6 +197,33 @@ public void recreateSocketAddresses() {
                 }
             }
     
    +        /**
    +         * Resolve the hostname to IP addresses, and find one reachable address.
    +         *
    +         * @param hostname the name of the host
    +         * @param timeout the time, in millseconds, before {@link InetAddress#isReachable}
    +         *                aborts
    +         * @return a reachable IP address. If no such IP address can be found,
    +         *         just return the first IP address of the hostname.
    +         *
    +         * @exception UnknownHostException
    +         */
    +        public InetAddress getReachableAddress(String hostname, int timeout) 
    +                throws UnknownHostException {
    +            InetAddress[] addresses = InetAddress.getAllByName(hostname);
    +            for (InetAddress a : addresses) {
    +                try {
    +                    if (a.isReachable(timeout)) {
    --- End diff --
    
    My main problem with this PR is that call to `isReachable(timeout)` for two reasons: 
    
    1) the most important one: `isReachable(timeout)` seems unreliable so there are plenty
cases where it returns false even tough the node is reachable or vice-versa! https://bugs.openjdk.java.net/browse/JDK-8159410
(google "InetAddress.isReachable not working" or "InetAddress.isReachable unreliable" to see
further cases).
    
    2) This timeout can add an arbitrary delay until a reachable node can be tested.
    
    IDK what a good compromise would be for both points above (leaving as it is today could
work, so no problem, even tough I am a bit concerned), but maybe we could use a solution similar
to ZOOKEEPER-2184 and return the next address in the array (using `next = ++next % addresses.length`
to prevent out of bound exceptions).
    



> recreateSocketAddresses may recreate the unreachable IP address
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2691
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2691
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.8
>         Environment: Centos6.5
> Java8
> ZooKeeper3.4.8
>            Reporter: JiangJiafu
>            Priority: Minor
>
> The QuorumPeer$QuorumServer.recreateSocketAddress()  is used to resolved the hostname
to a new IP address(InetAddress) when any exception happens to the socket. It will be very
useful when a hostname can be resolved to more than one IP address.
> But the problem is Java API InetAddress.getByName(String hostname) will always return
the first IP address when the hostname can be resolved to more than one IP address, and the
first IP address may be unreachable forever. For example, if a machine has two network interfaces:
eth0, eth1, say eth0 has ip1, eth1 has ip2, the relationship between hostname and the IP addresses
is set in /etc/hosts. When I "close" the eth0 by command "ifdown eth0", the InetAddress.getByName(String
hostname)  will still return ip1, which is unreachable forever.
> So I think it will be better to check the IP address by InetAddress.isReachable(long)
and choose the reachable IP address. 
> I have modified the ZooKeeper source code, and test the new code in my own environment,
and it can work very well when I turn down some network interfaces using "ifdown" command.
> The original code is:
> {code:title=QuorumPeer.java|borderStyle=solid}
>         public void recreateSocketAddresses() {
>             InetAddress address = null;
>             try {
>                 address = InetAddress.getByName(this.hostname);
>                 LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>                 this.addr = new InetSocketAddress(address, this.port);
>                 if (this.electionPort > 0){
>                     this.electionAddr = new InetSocketAddress(address, this.electionPort);
>                 }
>             } catch (UnknownHostException ex) {
>                 LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>                 // Have we succeeded in the past?
>                 if (this.addr != null) {
>                     // Yes, previously the lookup succeeded. Leave things as they are
>                     return;
>                 }
>                 // The hostname has never resolved. Create our InetSocketAddress(es)
as unresolved
>                 this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>                 if (this.electionPort > 0){
>                     this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                            this.electionPort);
>                 }
>             }
>         }
> {code}
> After my modification:
> {code:title=QuorumPeer.java|borderStyle=solid}
>         public void recreateSocketAddresses() {
>             InetAddress address = null;
>             try {
>                 address = getReachableAddress(this.hostname);
>                 LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>                 this.addr = new InetSocketAddress(address, this.port);
>                 if (this.electionPort > 0){
>                     this.electionAddr = new InetSocketAddress(address, this.electionPort);
>                 }
>             } catch (UnknownHostException ex) {
>                 LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>                 // Have we succeeded in the past?
>                 if (this.addr != null) {
>                     // Yes, previously the lookup succeeded. Leave things as they are
>                     return;
>                 }
>                 // The hostname has never resolved. Create our InetSocketAddress(es)
as unresolved
>                 this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>                 if (this.electionPort > 0){
>                     this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                            this.electionPort);
>                 }
>             }
>         }
>         public InetAddress getReachableAddress(String hostname) throws UnknownHostException
{
>             InetAddress[] addresses = InetAddress.getAllByName(hostname);
>             for (InetAddress a : addresses) {
>                 try {
>                     if (a.isReachable(5000)) {
>                         return a;
>                     } 
>                 } catch (IOException e) {
>                     LOG.warn("IP address {} is unreachable", a);
>                 }
>             }
>             // All the IP address is unreachable, just return the first one.
>             return addresses[0];
>         }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message