Date: Thu, 22 Feb 2018 15:12:00 +0000 (UTC)
From: "Axton Grams (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

    [ https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372896#comment-16372896 ]

Axton Grams edited comment on HADOOP-15250 at 2/22/18 3:11 PM:
---------------------------------------------------------------

I work with Greg on the same clusters. To add some color to the DNS/split-view configuration:
* DNS is configured with 2 views:
** Internal: used by cluster machines to resolve Hadoop nodes to the cluster network segment IP
** External: used by non-cluster machines to resolve Hadoop nodes to the routable network segment IP
* All nodes with a presence on the cluster network resolve machines to the cluster (non-routable) IP address
* All nodes without a presence on the cluster network resolve machines to the routable IP address
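To make the split-view behavior concrete, here is a minimal sketch (the hostname is a placeholder, not a node from this cluster) that resolves a node name and checks whether the answer is configured on a local interface, which is essentially the test Hadoop's NetUtils.getLocalInetAddress performs. On a cluster host the internal view should hand back the non-routable cluster IP and the check succeeds; an outside client gets the routable IP, which is not local to it.

    import java.net.InetAddress;
    import java.net.NetworkInterface;

    public class SplitViewCheck {
      public static void main(String[] args) throws Exception {
        // Hypothetical node name; substitute a real Hadoop hostname.
        String host = args.length > 0 ? args[0] : "datanode1.example.com";

        // Which address does the resolver (internal or external view) return?
        InetAddress resolved = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + resolved.getHostAddress());

        // On a cluster node the internal view answers with the cluster-segment
        // IP, which is bound to a local interface; elsewhere it is not local.
        NetworkInterface nic = NetworkInterface.getByInetAddress(resolved);
        System.out.println(nic != null
            ? "address is local (interface " + nic.getName() + ")"
            : "address is not configured on this host");
      }
    }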
We implemented this pattern for the following reasons:
* We can allow unfettered access (iptables/firewalld) between cluster nodes
* We use jumbo frames on the cluster network to ease network load

You have to understand that the interface the service binds to is conditional on the origin of the traffic, not on how the server knows itself according to DNS or Kerberos. Different nodes know the server by the same name with different IP addresses, depending on whether they have a presence on the cluster network segment. All Hadoop nodes know themselves by the cluster IP address, which is non-routable.

This design is compatible with the Linux network stack, DNS view practices, multi-homing practices, and all other related technology domains, just not Hadoop.

We operate with the following assumptions:
* The network stack provided by the OS knows how to properly route traffic
* The information in DNS is properly managed and accurate
* The hostname matches the Kerberos principal name, but the IP answer is different for different clients
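The point that the outgoing interface is a function of where the traffic is headed, not of the name the server knows itself by, can be checked with a plain JDK socket: left unbound, the kernel consults its routing table per destination. A minimal sketch with placeholder peer names (one assumed to sit on the cluster segment, one reachable only over the routable network), assuming both listen on port 8020:

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class RouteChosenSource {
      public static void main(String[] args) throws Exception {
        // Placeholder peers, not hosts from this cluster.
        String[] peers = { "worker1.cluster.example.com", "nn1.remote.example.com" };
        for (String peer : peers) {
          try (Socket s = new Socket()) {
            // No explicit bind: the kernel chooses the source address its
            // routing table would use to reach this particular destination.
            s.connect(new InetSocketAddress(peer, 8020), 5000);
            System.out.println(peer + " reached from "
                + s.getLocalAddress().getHostAddress());
          }
        }
      }
    }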
The second netw= ork is the cluster network on the second network interface this uses Jumbo = frames and is open no restrictions and allows all cluster traffic to flow b= etween nodes.=C2=A0 > =C2=A0 > To resolve DNS within the Hadoop Cluster we use DNS Views via BIND so if = the traffic is originating from nodes with cluster networks we return the i= nternal DNS record for the nodes. This all works fine with all the multi-ho= ming features added to Hadoop 2.x > =C2=A0Some logic around views: > a. The internal view is used by cluster machines when performing lookups.= So hosts on the cluster network should get answers from the internal view = in DNS > b. The external view is used by non-local-cluster machines when performin= g lookups. So hosts not on the cluster network should get answers from the = external view in DNS > =C2=A0 > So this brings me to our problem. We created some firewall rules to allow= inbound traffic from each clusters server network to allow distcp to occur= . But we noticed a problem almost immediately that when YARN attempted to t= alk to the Remote Cluster it was binding outgoing traffic to the cluster ne= twork interface which IS NOT routable. So after researching the code we not= iced the following in NetUtils.java and Client.java=C2=A0 > Basically in Client.java it looks as if it takes whatever the hostname is= and attempts to bind to whatever the hostname is resolved to. This is not = valid in a multi-homed network with one routable interface and one non rout= able interface. After reading through the java.net.Socket documentation it = is valid to perform socket.bind(null) which will allow the OS routing table= and DNS to send the traffic to the correct interface. I will also attach t= he nework traces and a test patch for 2.7.x and 3.x code base. I have this = test fix below in my Hadoop Test Cluster. 
> Client.java:
>
>     /*
>      * Bind the socket to the host specified in the principal name of the
>      * client, to ensure Server matching address of the client connection
>      * to host name in principal passed.
>      */
>     InetSocketAddress bindAddr = null;
>     if (ticket != null && ticket.hasKerberosCredentials()) {
>       KerberosInfo krbInfo =
>         remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>       if (krbInfo != null) {
>         String principal = ticket.getUserName();
>         String host = SecurityUtil.getHostFromPrincipal(principal);
>         // If host name is a valid local address then bind socket to it
>         InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>         if (localAddr != null) {
>           this.socket.setReuseAddress(true);
>           if (LOG.isDebugEnabled()) {
>             LOG.debug("Binding " + principal + " to " + localAddr);
>           }
>           bindAddr = new InetSocketAddress(localAddr, 0);
>         }
>       }
>     }
>
> So in my Hadoop 2.7.x cluster I made the following changes, and traffic flows correctly out the correct interfaces:
>
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
>    public static final String  IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_KEY = "ipc.client.fallback-to-simple-auth-allowed";
>    public static final boolean IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_DEFAULT = false;
>  
> +  public static final String  IPC_CLIENT_NO_BIND_LOCAL_ADDR_KEY = "ipc.client.nobind.local.addr";
> +  public static final boolean IPC_CLIENT_NO_BIND_LOCAL_ADDR_DEFAULT = false;
> +
>    public static final String IPC_CLIENT_CONNECT_MAX_RETRIES_ON_SASL_KEY =
>      "ipc.client.connect.max.retries.on.sasl";
>    public static final int    IPC_CLIENT_CONNECT_MAX_RETRIES_ON_SASL_DEFAULT = 5;
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
> index a6f4eb6..7bfddb7 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
> @@ -129,7 +129,9 @@ public static void setCallIdAndRetryCount(int cid, int rc) {
>  
>    private final int connectionTimeout;
>  
> +
>    private final boolean fallbackAllowed;
> +  private final boolean noBindLocalAddr;
>    private final byte[] clientId;
>  
>    final static int CONNECTION_CONTEXT_CALL_ID = -3;
> @@ -642,7 +644,11 @@ private synchronized void setupConnection() throws IOException {
>            InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>            if (localAddr != null) {
>              this.socket.setReuseAddress(true);
> -            this.socket.bind(new InetSocketAddress(localAddr, 0));
> +            if (noBindLocalAddr) {
> +              this.socket.bind(null);
> +            } else {
> +              this.socket.bind(new InetSocketAddress(localAddr, 0));
> +            }
>            }
>          }
>        }
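The quoted hunks define the ipc.client.nobind.local.addr key and the noBindLocalAddr field but do not show where the field is initialized; presumably the attached HADOOP-15250.patch reads it from the Configuration. A hedged sketch of how a client on the multi-homed server network might opt in: the property name comes from the diff above, the wiring into Client is an assumption, and the remote URI is a placeholder.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class NoBindLocalAddrExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Key defined in the CommonConfigurationKeys hunk above. Assumption:
        // the full patch reads it into Client.noBindLocalAddr, so IPC sockets
        // take the socket.bind(null) path shown in the Client.java hunk.
        conf.setBoolean("ipc.client.nobind.local.addr", true);

        // Placeholder remote-cluster URI, e.g. the distcp target's NameNode.
        FileSystem remote = FileSystem.get(
            URI.create("hdfs://nn1.remote.example.com:8020"), conf);
        System.out.println("remote root exists: " + remote.exists(new Path("/")));
      }
    }

The same effect could be applied cluster-wide by setting the property in core-site.xml rather than programmatically.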
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)