hadoop-common-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15864) Job submitter / executor fail when SBN domain name can not resolved
Date Fri, 19 Oct 2018 11:09:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656660#comment-16656660
] 

Hadoop QA commented on HADOOP-15864:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 58s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 58s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 27s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 26s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HADOOP-15864 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12944675/HADOOP-15864-branch.2.7.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b50a3af5d3fe 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 285d2c0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/15394/testReport/ |
| Max. process+thread count | 1442 (vs. ulimit of 10000) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15394/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Job submitter / executor fail when SBN domain name can not resolved
> -------------------------------------------------------------------
>
>                 Key: HADOOP-15864
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15864
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Critical
>         Attachments: HADOOP-15864-branch.2.7.001.patch
>
>
> Job submission and task execution fail if the Standby NameNode domain name cannot be resolved on HDFS HA with the DelegationToken feature.
> This issue is triggered when creating a {{ConfiguredFailoverProxyProvider}} instance, which invokes {{HAUtil.cloneDelegationTokenForLogicalUri}} in HA mode with security enabled. In HDFS HA mode the UGI needs to hold a separate token for each NameNode in order to handle Active-Standby switches; the two tokens carry the same content, of course.
> However, when {{HAUtil.cloneDelegationTokenForLogicalUri}} calls #setTokenService, it checks whether the NameNode address has been resolved; if not, an #IllegalArgumentException is thrown and the job submitter / task executor fails.
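> For illustration, a minimal client-side sketch of how this path is entered (the {{mycluster}} nameservice name is hypothetical; it assumes a secured HA client whose UGI already holds an HDFS delegation token, so this is only a sketch of the trigger, not part of the attached patch):
> {code:java}
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
>
> public class HaClientSketch {
>   public static void main(String[] args) throws Exception {
>     // Loads core-site.xml / hdfs-site.xml from the classpath; the HA nameservice
>     // "mycluster" and its two NameNode addresses are assumed to be configured there.
>     Configuration conf = new Configuration();
>
>     // FileSystem.get() constructs a DFSClient, which creates the
>     // ConfiguredFailoverProxyProvider; with security on and a delegation token in
>     // the UGI, that constructor calls HAUtil.cloneDelegationTokenForLogicalUri and
>     // fails if the Standby NameNode host name cannot be resolved.
>     FileSystem fs = FileSystem.get(new URI("hdfs://mycluster"), conf);
>     System.out.println("Connected: " + fs.getUri());
>   }
> }
> {code}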
> HDFS-8068 and HADOOP-12125 tried to fix it, but I don't think those two tickets resolve it completely.
> Another question many people raise is why a NameNode domain name cannot be resolved in the first place. There are many scenarios, for instance replacing a node after a fault, or occasional DNS refreshes. In any case, a Standby NameNode failure should not impact Hadoop cluster stability, in my opinion.
> a. Code ref: org.apache.hadoop.security.SecurityUtil, lines 373-386
> {code:java}
>   public static Text buildTokenService(InetSocketAddress addr) {
>     String host = null;
>     if (useIpForTokenService) {
>       if (addr.isUnresolved()) { // host has no ip address
>         throw new IllegalArgumentException(
>             new UnknownHostException(addr.getHostName())
>         );
>       }
>       host = addr.getAddress().getHostAddress();
>     } else {
>       host = StringUtils.toLowerCase(addr.getHostName());
>     }
>     return new Text(host + ":" + addr.getPort());
>   }
> {code}
> b. Exception log ref:
> {code:xml}
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
> at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:515)
> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:170)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:761)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:691)
> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:150)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2713)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2747)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2729)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
> at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.<init>(ChRootedFileSystem.java:106)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:178)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:172)
> at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:303)
> at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:377)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:172)
> at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:172)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2713)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2747)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2729)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:176)
> at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:665)
> ... 35 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:498)
> ... 58 more
> Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: standbynamenode
> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:390)
> at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:369)
> at org.apache.hadoop.hdfs.HAUtil.cloneDelegationTokenForLogicalUri(HAUtil.java:317)
> at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.<init>(ConfiguredFailoverProxyProvider.java:132)
> at org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.<init>(ConfiguredFailoverProxyProvider.java:84)
> ... 62 more
> Caused by: java.net.UnknownHostException: standbynamenode
> ... 67 more
> {code}
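> For completeness, a minimal repro sketch against {{SecurityUtil#buildTokenService}} itself (a hypothetical standalone class, not part of the attached patch; it assumes the default {{hadoop.security.token.service.use_ip=true}}):
> {code:java}
> import java.net.InetSocketAddress;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.security.SecurityUtil;
>
> public class UnresolvedTokenServiceRepro {
>   public static void main(String[] args) {
>     // createUnresolved() skips the DNS lookup, mimicking a Standby NameNode
>     // whose domain name cannot be resolved (port 8020 is just an example).
>     InetSocketAddress addr =
>         InetSocketAddress.createUnresolved("standbynamenode", 8020);
>     try {
>       Text service = SecurityUtil.buildTokenService(addr);
>       System.out.println("token service: " + service);
>     } catch (IllegalArgumentException e) {
>       // Expected with use_ip=true: wraps java.net.UnknownHostException:
>       // standbynamenode, matching the stack trace above.
>       System.out.println("buildTokenService failed: " + e.getCause());
>     }
>   }
> }
> {code}
> Note that with {{hadoop.security.token.service.use_ip=false}} the hostname branch is taken and no exception is thrown here, but that only sidesteps this particular check rather than fixing the unresolved-Standby handling.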



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

