Subject: svn commit: r1612695 - in /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs: CHANGES.txt src/site/apt/HdfsMultihoming.apt.vm
Date: Tue, 22 Jul 2014 20:28:57 -0000
To: hdfs-commits@hadoop.apache.org
From: arp@apache.org
Message-Id: <20140722202857.DD4472388831@eris.apache.org>

Author: arp
Date: Tue Jul 22 20:28:57 2014
New Revision: 1612695

URL: http://svn.apache.org/r1612695
Log:
HDFS-6712. Document HDFS Multihoming Settings.
(Contributed by Arpit Agarwal)

Added:
    hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm
Modified:
    hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

Modified: hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt?rev=1612695&r1=1612694&r2=1612695&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt (original)
+++ hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Tue Jul 22 20:28:57 2014
@@ -602,6 +602,8 @@ Release 2.5.0 - UNRELEASED
     HDFS-6680. BlockPlacementPolicyDefault does not choose favored nodes
     correctly. (szetszwo)
 
+    HDFS-6712. Document HDFS Multihoming Settings. (Arpit Agarwal)
+
   OPTIMIZATIONS
 
     HDFS-6214. Webhdfs has poor throughput for files >2GB (daryn)

Added: hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm?rev=1612695&view=auto
==============================================================================
--- hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm (added)
+++ hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm Tue Jul 22 20:28:57 2014
@@ -0,0 +1,145 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~   http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+  ---
+  Hadoop Distributed File System-${project.version} - Support for Multi-Homed Networks
+  ---
+  ---
+  ${maven.build.timestamp}
+
+HDFS Support for Multihomed Networks
+
+  This document is targeted at cluster administrators deploying <<<HDFS>>> in
+  multihomed networks. Similar support for <<<YARN>>>/<<<MapReduce>>> is
+  work in progress and will be documented when available.
+
+%{toc|section=1|fromDepth=0}
+
+* Multihoming Background
+
+  In multihomed networks the cluster nodes are connected to more than one
+  network interface. There could be multiple reasons for doing so.
+
+  [[1]] <<Security>>: Security requirements may dictate that intra-cluster
+  traffic be confined to a different network than the network used to
+  transfer data in and out of the cluster.
+
+  [[2]] <<Performance>>: Intra-cluster traffic may use one or more high bandwidth
+  interconnects like Fiber Channel, Infiniband or 10GbE.
+
+  [[3]] <<Failover/Redundancy>>: The nodes may have multiple network adapters
+  connected to a single network to handle network adapter failure.
+
+
+  Note that NIC Bonding (also known as NIC Teaming or Link
+  Aggregation) is a related but separate topic. The following settings
+  are usually not applicable to a NIC bonding configuration, which handles
+  multiplexing and failover transparently while presenting a single 'logical
+  network' to applications.
+
+* Fixing Hadoop Issues In Multihomed Environments
+
+** Ensuring HDFS Daemons Bind All Interfaces
+
+  By default <<<HDFS>>> endpoints are specified as either hostnames or IP addresses.
+  In either case <<<HDFS>>> daemons will bind to a single IP address, making
+  the daemons unreachable from other networks.
+
+  The solution is to have separate settings for server endpoints that force binding
+  to the wildcard IP address <<<INADDR_ANY>>> i.e. <<<0.0.0.0>>>. Do NOT supply a port
+  number with any of these settings.
+
+----
+<property>
+  <name>dfs.namenode.rpc-bind-host</name>
+  <value>0.0.0.0</value>
+  <description>
+    The actual address the RPC server will bind to. If this optional address is
+    set, it overrides only the hostname portion of dfs.namenode.rpc-address.
+    It can also be specified per name node or name service for HA/Federation.
+    This is useful for making the name node listen on all interfaces by
+    setting it to 0.0.0.0.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.servicerpc-bind-host</name>
+  <value>0.0.0.0</value>
+  <description>
+    The actual address the service RPC server will bind to. If this optional address is
+    set, it overrides only the hostname portion of dfs.namenode.servicerpc-address.
+    It can also be specified per name node or name service for HA/Federation.
+    This is useful for making the name node listen on all interfaces by
+    setting it to 0.0.0.0.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.http-bind-host</name>
+  <value>0.0.0.0</value>
+  <description>
+    The actual address the HTTP server will bind to. If this optional address
+    is set, it overrides only the hostname portion of dfs.namenode.http-address.
+    It can also be specified per name node or name service for HA/Federation.
+    This is useful for making the name node HTTP server listen on all
+    interfaces by setting it to 0.0.0.0.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.https-bind-host</name>
+  <value>0.0.0.0</value>
+  <description>
+    The actual address the HTTPS server will bind to. If this optional address
+    is set, it overrides only the hostname portion of dfs.namenode.https-address.
+    It can also be specified per name node or name service for HA/Federation.
+    This is useful for making the name node HTTPS server listen on all
+    interfaces by setting it to 0.0.0.0.
+  </description>
+</property>
+----
+
+** Clients use Hostnames when connecting to DataNodes
+
+  By default <<<HDFS>>> clients connect to DataNodes using the IP address
+  provided by the NameNode. Depending on the network configuration this
+  IP address may be unreachable by the clients. The fix is letting clients perform
+  their own DNS resolution of the DataNode hostname. The following setting
+  enables this behavior.
+
+----
+<property>
+  <name>dfs.client.use.datanode.hostname</name>
+  <value>true</value>
+  <description>Whether clients should use datanode hostnames when
+    connecting to datanodes.
+  </description>
+</property>
+----
+
+** DataNodes use HostNames when connecting to other DataNodes
+
+  Rarely, the NameNode-resolved IP address for a DataNode may be unreachable
+  from other DataNodes. The fix is to force DataNodes to perform their own
+  DNS resolution for inter-DataNode connections. The following setting enables
+  this behavior.
+
+----
+<property>
+  <name>dfs.datanode.use.datanode.hostname</name>
+  <value>true</value>
+  <description>Whether datanodes should use datanode hostnames when
+    connecting to other datanodes for data transfer.
+  </description>
+</property>
+----
+
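
The bind-host descriptions in the diff above say that each setting "overrides only the hostname portion" of the matching address and must not carry a port. The Java sketch below illustrates that rule. It is not Hadoop code: the class BindHostSketch, the method effectiveBindAddress, and the example endpoint nn1.example.com:8020 are hypothetical names chosen for this illustration, and it assumes a plain host:port string with no IPv6 literal.

```java
// Hypothetical illustration only; this is not Hadoop's implementation.
final class BindHostSketch {

    /**
     * Combines a configured "host:port" endpoint (for example the value of
     * dfs.namenode.rpc-address) with an optional bind host (for example the
     * value of dfs.namenode.rpc-bind-host). When the bind host is set it
     * replaces only the host part; the port always comes from the endpoint.
     */
    static String effectiveBindAddress(String configuredAddress, String bindHost) {
        int colon = configuredAddress.lastIndexOf(':');
        if (colon <= 0 || colon == configuredAddress.length() - 1) {
            throw new IllegalArgumentException(
                "expected host:port, got: " + configuredAddress);
        }
        String host = configuredAddress.substring(0, colon);
        String port = configuredAddress.substring(colon + 1);

        // No bind host configured: the server binds to the configured host.
        if (bindHost == null || bindHost.trim().isEmpty()) {
            return host + ":" + port;
        }
        // The bind host is a host only; a port here is a configuration error.
        if (bindHost.indexOf(':') >= 0) {
            throw new IllegalArgumentException(
                "bind host must not include a port: " + bindHost);
        }
        return bindHost.trim() + ":" + port;
    }

    public static void main(String[] args) {
        // Bind host set to the wildcard address: listen on all interfaces.
        System.out.println(effectiveBindAddress("nn1.example.com:8020", "0.0.0.0"));
        // Bind host unset: fall back to the configured endpoint.
        System.out.println(effectiveBindAddress("nn1.example.com:8020", null));
    }
}
```

In this sketch the advertised endpoint stays nn1.example.com:8020 for clients, while the listening socket would use the wildcard address with the same port, which is why the bind-host value carries no port of its own.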