hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5098) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token
Date Tue, 17 May 2016 14:19:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286696#comment-15286696 ]

Jason Lowe commented on YARN-5098:

The original description of this JIRA showed that the HDFS token lifespan was configured to
a maximum of half a day, so it's not surprising that the nodemanagers cannot use those tokens
to successfully aggregate the application logs 3 days later.  Either the HDFS tokens need
a longer max lifespan for apps like this or the nodemanager needs to be updated with a new
HDFS token to use for log aggregation.  YARN-2704 added the ability to do the latter, but
I believe it requires a few things to work properly:
* the RM needs to be configured as a proxy user so it can request tokens on behalf of other users
* the app needs to be submitted _without_ an HDFS token so the RM will acquire and manage
it directly on the app's behalf

Are these conditions true for this scenario?  cc: [~jianhe]
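
To make the timing argument concrete, here is a minimal sketch of the expiry arithmetic. It assumes a token's hard expiry is simply its issue time plus the configured max lifetime (in HDFS this is governed by `dfs.namenode.delegation.token.max-lifetime`, in milliseconds) and ignores the renew interval, since renewal cannot push a token past that limit. The function name and the timestamps are hypothetical; only the half-day lifespan and the 3-day gap come from the discussion above.

```python
from datetime import datetime, timedelta


def token_usable_at(issued: datetime, max_lifetime: timedelta, when: datetime) -> bool:
    """Return True if a token issued at `issued` is still inside its max lifetime at `when`."""
    return when < issued + max_lifetime


# Hypothetical timeline: the app is submitted, then finishes 3 days later,
# which is when the nodemanager tries to aggregate its logs.
issued = datetime(2016, 5, 13, 18, 0, 0)   # assumed submission time
finished = issued + timedelta(days=3)

# Max lifetime of half a day, as in the original description: token is long gone.
print(token_usable_at(issued, timedelta(hours=12), finished))  # False

# A 7-day max lifetime would still cover log aggregation at the 3-day mark.
print(token_usable_at(issued, timedelta(days=7), finished))    # True
```

For the second option, the settings that would likely need checking are `yarn.resourcemanager.proxy-user-privileges.enabled` on the RM and the matching `hadoop.proxyuser.<rm-user>.hosts` / `hadoop.proxyuser.<rm-user>.groups` entries seen by the NameNode; treat that list as an assumption to verify against the cluster's Hadoop version.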

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token
> -------------------------------------------------------------------------------------------
>                 Key: YARN-5098
>                 URL: https://issues.apache.org/jira/browse/YARN-5098
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Yesha Vora
> Environment : HA cluster
> YARN application logs for a long-running application could not be gathered because the NodeManager failed to talk to HDFS, with the error below.
> {code}
> 2016-05-16 18:18:28,533 INFO  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(555)) - Application just finished : application_1463170334122_0002
> 2016-05-16 18:18:28,545 WARN  ipc.Client (Client.java:run(705)) - Exception encountered while connecting to the server :
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 171 for hrt_qa) can't be found in cache
>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:583)
>         at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:398)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:752)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:748)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:747)
>         at org.apache.hadoop.ipc.Client$Connection.access$3100(Client.java:398)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1597)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1439)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1386)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:240)
>         at com.sun.proxy.$Proxy83.getServerDefaults(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:282)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy84.getServerDefaults(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1018)
>         at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:156)
>         at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:550)
>         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:687)
> {code}
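
When triaging nodemanager logs for this failure, it can help to pull the token kind, sequence number, and owner out of the `InvalidToken` message so they can be matched against NameNode-side records. The following is a minimal sketch; the regular expression is written against the exact message text shown in the log above, and the helper name is hypothetical. Other Hadoop versions may word the message differently.

```python
import re

# Matches the InvalidToken message text as it appears in the log excerpt above.
INVALID_TOKEN_RE = re.compile(
    r"token \((?P<kind>\w+) token (?P<seq>\d+) for (?P<user>[^)\s]+)\) "
    r"can't be found in cache"
)


def parse_invalid_token(line: str):
    """Return (kind, sequence_number, user) if the line reports a missing token, else None."""
    match = INVALID_TOKEN_RE.search(line)
    if match is None:
        return None
    return match.group("kind"), int(match.group("seq")), match.group("user")


sample = (
    "org.apache.hadoop.ipc.RemoteException("
    "org.apache.hadoop.security.token.SecretManager$InvalidToken): "
    "token (HDFS_DELEGATION_TOKEN token 171 for hrt_qa) can't be found in cache"
)
print(parse_invalid_token(sample))  # ('HDFS_DELEGATION_TOKEN', 171, 'hrt_qa')
```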

