From dev-return-67256-archive-asf-public=cust-asf.ponee.io@zookeeper.apache.org Fri Feb 2 15:55:30 2018 Return-Path: X-Original-To: archive-asf-public@eu.ponee.io Delivered-To: archive-asf-public@eu.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id EC96D180608 for ; Fri, 2 Feb 2018 15:55:29 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id DB7A7160C49; Fri, 2 Feb 2018 14:55:29 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 0077A160C41 for ; Fri, 2 Feb 2018 15:55:28 +0100 (CET) Received: (qmail 34197 invoked by uid 500); 2 Feb 2018 14:55:28 -0000 Mailing-List: contact dev-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@zookeeper.apache.org Delivered-To: mailing list dev@zookeeper.apache.org Received: (qmail 34186 invoked by uid 99); 2 Feb 2018 14:55:27 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 02 Feb 2018 14:55:27 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id C1E9DE9434; Fri, 2 Feb 2018 14:55:26 +0000 (UTC) From: anmolnar To: dev@zookeeper.apache.org Reply-To: dev@zookeeper.apache.org References: In-Reply-To: Subject: [GitHub] zookeeper pull request #451: ZOOKEEPER-2184: Zookeeper Client should re-reso... Content-Type: text/plain Message-Id: <20180202145526.C1E9DE9434@git1-us-west.apache.org> Date: Fri, 2 Feb 2018 14:55:26 +0000 (UTC) Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/451#discussion_r165665505 --- Diff: src/java/main/org/apache/zookeeper/client/StaticHostProvider.java --- @@ -91,15 +79,106 @@ public StaticHostProvider(Collection serverAddresses) { Collections.shuffle(this.serverAddresses); } + /** + * Evaluate to a hostname if one is available and otherwise it returns the + * string representation of the IP address. + * + * In Java 7, we have a method getHostString, but earlier versions do not support it. + * This method is to provide a replacement for InetSocketAddress.getHostString(). + * + * @param addr + * @return Hostname string of address parameter + */ + private String getHostString(InetSocketAddress addr) { + String hostString = ""; + + if (addr == null) { + return hostString; + } + if (!addr.isUnresolved()) { + InetAddress ia = addr.getAddress(); + + // If the string starts with '/', then it has no hostname + // and we want to avoid the reverse lookup, so we return + // the string representation of the address. + if (ia.toString().startsWith("/")) { + hostString = ia.getHostAddress(); + } else { + hostString = addr.getHostName(); + } + } else { + // According to the Java 6 documentation, if the hostname is + // unresolved, then the string before the colon is the hostname. + String addrString = addr.toString(); + hostString = addrString.substring(0, addrString.lastIndexOf(':')); + } + + return hostString; + } + public int size() { return serverAddresses.size(); } + // Counts the number of addresses added and removed during + // the last call to next. Used mainly for test purposes. + // See StasticHostProviderTest. + private int nextAdded = 0; + private int nextRemoved = 0; + + int getNextAdded() { + return nextAdded; + } + + int getNextRemoved() { + return nextRemoved; + } + public InetSocketAddress next(long spinDelay) { - ++currentIndex; - if (currentIndex == serverAddresses.size()) { - currentIndex = 0; + // Handle possible connection error by re-resolving hostname if possible + if (!connectedSinceNext) { --- End diff -- It should try to re-resolve whenever the client is unable to connect to a server (connectedSinceNext == false). @fpj gives a good explanation in the original Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2184?focusedCommentId=15873730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15873730 > I haven't had much time to work on this issue, but here is my current assessment. > This issue seemed easy to fix at first, but it is fairly fundamental with respect to how we resolve host names. Currently, we resolve host names when we start a client and never resolve it again. This is the cause of the problem reported in the issue because in the scenario described, the zookeeper container is re-started and changes addresses, which prevents the client from connecting to the zookeeper server. > The proposed patch here tries to re-resolve the hostname every time the client fails to connect to the resolved address. It kind of works, but it makes StaticHostProvider a bit messy because the expectation with the current wiring is that we won't have to resolve again. > The ideal situation for the problematic scenario is that we resolve the host name every time we try to connect to a server, but that would be a fairly fundamental change to how we resolve addresses in ZooKeeper. > I was also looking at the C client and it might get a bit messy too there because I don't think we currently keep the association between the host name and the resolved address, so we don't really know what to resolve again. It might be possible to do it via the canonical name in getaddrinfo, but I'm not sure how that works with windows. > One specific proposal to avoid having clients never finding a server ever again without deep changes to the current wiring is to resolve again everything in the case the client tries all and none succeeds. That would be a fairly straightforward change to both Java and C client, but it would not resolve addresses again in the case the a strict subset has changed addresses and at least one server is reachable. ---