Date: Tue, 23 Oct 2018 16:52:00 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-15864) Job submitter / executor fail when SBN domain name can not resolved

    [ https://issues.apache.org/jira/browse/HADOOP-15864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660958#comment-16660958 ]

Hadoop QA commented on HADOOP-15864:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 26s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 18m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 3s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 30s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 59s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}232m 36s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HADOOP-15864 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945216/HADOOP-15864.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 76696949e6d0 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b618463 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/15411/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/15411/testReport/ |
| Max. process+thread count | 3014 (vs. ulimit of 10000) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15411/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Job submitter / executor fail when SBN domain name can not resolved
> -------------------------------------------------------------------
>
>                 Key: HADOOP-15864
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15864
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Critical
>         Attachments: HADOOP-15864-branch.2.7.001.patch, HADOOP-15864-branch.2.7.002.patch, HADOOP-15864.003.patch
>
>
> Job submission and task execution fail when the Standby NameNode (SBN) domain name cannot be resolved on an HDFS HA cluster with the DelegationToken feature enabled.
> The issue is triggered when a {{ConfiguredFailoverProxyProvider}} instance is created, because in HA mode with security enabled it invokes {{HAUtil.cloneDelegationTokenForLogicalUri}}. In HDFS HA mode the UGI needs to include a separate token for each NameNode in order to deal with an Active-Standby switch; the content of the two tokens is of course the same.
> However, when #setTokenService is called in {{HAUtil.cloneDelegationTokenForLogicalUri}}, it checks whether the address of the NameNode has been resolved. If not, it throws #IllegalArgumentException, and the job submitter / task executor fails.
> HDFS-8068 and HADOOP-12125 tried to fix this, but I don't think the two tickets resolve it completely.
> Another question many people raise is why a NameNode domain name would fail to resolve. I think there are many scenarios, for instance replacing a node after a fault, or an occasional DNS refresh. In any case, a Standby NameNode failure should not impact Hadoop cluster stability, in my opinion.
> a. code ref: org.apache.hadoop.security.SecurityUtil line 373-386
> {code:java}
>   public static Text buildTokenService(InetSocketAddress addr) {
>     String host = null;
>     if (useIpForTokenService) {
>       if (addr.isUnresolved()) { // host has no ip address
>         throw new IllegalArgumentException(
>             new UnknownHostException(addr.getHostName())
>         );
>       }
>       host = addr.getAddress().getHostAddress();
>     } else {
>       host = StringUtils.toLowerCase(addr.getHostName());
>     }
>     return new Text(host + ":" + addr.getPort());
>   }
> {code}
> b. exception log ref:
> {code:xml}
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
> at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:515)
> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:170)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:761)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:691)
> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:150)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2713)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2747)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2729)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
> at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.<init>(ChRootedFileSystem.java:106)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:178)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:172)
> at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:303)
> at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:377)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:172)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:172)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2713)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2747)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2729)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:176)
> at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:665)
> ... 35 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:498)
> ... 58 more
> Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: standbynamenode
> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:390)
> at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:369)
> at org.apache.hadoop.hdfs.HAUtil.cloneDelegationTokenForLogicalUri(HAUtil.java:317)
> at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.<init>(ConfiguredFailoverProxyProvider.java:132)
> at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.<init>(ConfiguredFailoverProxyProvider.java:84)
> ... 62 more
> Caused by: java.net.UnknownHostException: standbynamenode
> ... 67 more
> {code}
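
The behaviour described in the issue can be reproduced without a cluster. What follows is a minimal stand-alone sketch, not Hadoop code and not the content of any of the attached patches. The class name TokenServiceSketch, the useIp parameter (standing in for the useIpForTokenService field) and the buildTokenServiceOrNull helper are hypothetical names introduced here for illustration, and String is used where SecurityUtil returns Text. It relies only on java.net.InetSocketAddress.createUnresolved, which never performs a DNS lookup, to stand in for a standby NameNode whose name does not resolve.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.Locale;

/** Stand-alone illustration; all names here are hypothetical, not Hadoop APIs. */
class TokenServiceSketch {

  /**
   * Mirrors the logic of the quoted method: when IP addresses are used for
   * the token service, an unresolved address is a hard failure.
   */
  static String buildTokenService(InetSocketAddress addr, boolean useIp) {
    String host;
    if (useIp) {
      if (addr.isUnresolved()) { // host has no IP address
        throw new IllegalArgumentException(
            new UnknownHostException(addr.getHostName()));
      }
      host = addr.getAddress().getHostAddress();
    } else {
      host = addr.getHostName().toLowerCase(Locale.ENGLISH);
    }
    return host + ":" + addr.getPort();
  }

  /**
   * Assumption, for illustration only: one possible tolerant wrapper that
   * reports an unresolvable NameNode as null, so a caller could skip that
   * NameNode instead of failing the whole client.
   */
  static String buildTokenServiceOrNull(InetSocketAddress addr, boolean useIp) {
    try {
      return buildTokenService(addr, useIp);
    } catch (IllegalArgumentException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // An address that is unresolved by construction (no DNS lookup happens).
    InetSocketAddress standby =
        InetSocketAddress.createUnresolved("standbynamenode", 8020);
    System.out.println("tolerant result: "
        + buildTokenServiceOrNull(standby, true));
    // The strict variant throws, as in the stack trace quoted above.
    buildTokenService(standby, true);
  }
}
```

Whether skipping the unresolved NameNode is the right behaviour for the token-cloning path is a design question for the ticket; the sketch only shows that the exception originates from the isUnresolved() check and that the host-name branch does not depend on DNS.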