Date: Sun, 1 Apr 2012 00:04:28 +0000 (UTC)
From: "Todd Lipcon (Commented) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <749944756.2463.1333238668043.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1800114248.15268.1332710547862.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243614#comment-13243614 ]

Todd Lipcon commented on HDFS-3150:
-----------------------------------

Mostly looks good, just some nits:

{code}
+ LOG.info("Opened streaming server at " + tmpPort);
{code}
This isn't the terminology used elsewhere. "Data transfer server" or "data transceiver server" is better.

----
{code}
  // Connect to backup machine
+ final String dnName = targets[0].getName(connectToDnViaHostname);
{code}
I think it's better to call this {{mirrorName}} or {{mirrorAddrString}}.

----
{code}
+ final String dnName = proxySource.getName(connectToDnViaHostname);
+ InetSocketAddress proxyAddr = NetUtils.createSocketAddr(dnName);
{code}
Similar here -- {{proxyDnName}} or {{proxyAddrString}}.

> Add option for clients to contact DNs via hostname in branch-1
> --------------------------------------------------------------
>
> Key: HDFS-3150
> URL: https://issues.apache.org/jira/browse/HDFS-3150
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: data-node, hdfs client
> Reporter: Eli Collins
> Assignee: Eli Collins
> Attachments: hdfs-3150-b1.txt
>
>
> Per the document attached to HADOOP-8198, this is just for branch-1, and unbreaks DN multihoming. The datanode can be configured to listen on a bond, or on all interfaces by specifying the wildcard in the dfs.datanode.*.address configuration options; however, per HADOOP-6867, only the source address of the registration is exposed to clients. HADOOP-985 made clients access datanodes by IP, primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming. To fix it, let's add back the option for Datanodes to be accessed by hostname.
> This can be done by:
> # Modifying the primary field of the Datanode descriptor to be the hostname, or
> # Modifying Client/Datanode <-> Datanode access to use the hostname field instead of the IP
>
> I'd like to go with approach #2 as it does not require making an incompatible change to the client protocol, and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers.
>
> New client and Datanode configuration options are introduced:
> - {{dfs.client.use.datanode.hostname}} indicates whether all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
> - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer
>
> If the configuration options are not used, there is no change in the current behavior.
>
> I'm doing something similar to #1 btw in trunk in HDFS-3144 - refactoring the use of DatanodeID to use the right field (IP, IP:xferPort, hostname, etc.) based on the context the ID is being used in, vs always using the IP:xferPort as the Datanode's name, and using the name everywhere.
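
For readers who want to see the shape of approach #2, here is a minimal, self-contained sketch of choosing between the IP and the hostname at connection time. It is not the patch itself: {{NodeId}}, {{toSocketAddr}}, and the example address values are hypothetical stand-ins for illustration and are not the Hadoop {{DatanodeID}} or {{NetUtils}} classes. Only the {{getName(boolean)}} call shape and the {{connectToDnViaHostname}} flag name are taken from the patch excerpts quoted in the review.

```java
import java.net.InetSocketAddress;

public class DatanodeAddrSketch {

    /** Hypothetical stand-in for a Datanode descriptor (not Hadoop's DatanodeID). */
    static final class NodeId {
        final String ipAddr;
        final String hostName;
        final int xferPort;

        NodeId(String ipAddr, String hostName, int xferPort) {
            this.ipAddr = ipAddr;
            this.hostName = hostName;
            this.xferPort = xferPort;
        }

        /** Returns "host:port", using the hostname if useHostname is set, else the IP. */
        String getName(boolean useHostname) {
            return (useHostname ? hostName : ipAddr) + ":" + xferPort;
        }
    }

    /**
     * Hypothetical helper that splits "host:port" into a socket address.
     * The address is left unresolved so the sketch performs no DNS lookup.
     */
    static InetSocketAddress toSocketAddr(String name) {
        int colon = name.lastIndexOf(':');
        String host = name.substring(0, colon);
        int port = Integer.parseInt(name.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        NodeId mirror = new NodeId("10.0.0.5", "dn1.example.com", 50010);

        // In the real patch this flag would come from configuration;
        // here it is hard-coded.
        boolean connectToDnViaHostname = true;

        String mirrorAddrString = mirror.getName(connectToDnViaHostname);
        InetSocketAddress mirrorAddr = toSocketAddr(mirrorAddrString);

        System.out.println(mirror.getName(false));  // prints 10.0.0.5:50010
        System.out.println(mirrorAddr.getHostString() + ":" + mirrorAddr.getPort());
        // prints dn1.example.com:50010
    }
}
```

With the flag off, the address string is built from the IP, which matches the description's statement that behavior is unchanged when the options are not used.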