hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5098) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token
Date Fri, 27 May 2016 20:09:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304685#comment-15304685 ]

Vinod Kumar Vavilapalli commented on YARN-5098:
-----------------------------------------------

Fix checkstyle and unit test issues.

The patch looks good overall; a few comments:
 - Overall, I think we can simplify this code if we always manage our own tokens for
localization and log-aggregation for long-running applications / services. Today it is too
complicated: on the first day we use the user's token T; on the second day we get a new token T'
but share it across all the apps that originally shared T; after an RM restart we use a new token T'',
which is different for each of the apps that originally shared T. We can simplify this by always
managing the tokens ourselves, and managing them per user!
 - There are a few unused imports.
 - Unrelated to the patch, but let's rename requestNewHdfsDelegationToken() -> requestNewHdfsDelegationTokenAsProxyUser()
 - Why this change?
{code}
-    LOG.info("Renewed delegation-token= [" + dttr + "], for "
-        + dttr.referringAppIds);
+    LOG.info("Renewed delegation-token= [" + dttr + "]");
{code}
 - Testcase
    -- Should the test use the same user for both tokens?
    -- Add a comment saying that rm2 simulates an RM restart
    -- Can we rewrite the following? It is a little confusing:
{code}
            if (dttr.token.equals(expectedToken)) {
              secondRenewInvoked = true;
              super.renewToken(dttr);
            } else {
              firstRenewInvoked = true;
              throw new InvalidToken("Failed to renew");
            }
{code}
to
{code}
            if (dttr.token.equals(updatedtoken)) {
              super.renewToken(dttr);
            } else if (dttr.token.equals(originalToken)) {
              throw new InvalidToken("Failed to renew");
            } else {
              throw new IOException("Unexpected");
            }
{code}
and assert that firstRenewInvoked and secondRenewInvoked are set?
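
The suggested rewrite above drops the two flags that the closing question still wants asserted. A minimal, self-contained sketch of one way to reconcile the two is below. It is not the test code from the patch: `RecordingRenewer`, `InvalidTokenException`, and the plain `String` tokens are hypothetical stand-ins for the test's renewer subclass, `InvalidToken`, and `dttr.token`, so that the branch logic can be run in isolation.

```java
import java.io.IOException;

public class RenewBranchSketch {

  /** Hypothetical stand-in for the InvalidToken exception thrown in the test. */
  static class InvalidTokenException extends IOException {
    InvalidTokenException(String message) {
      super(message);
    }
  }

  /** Hypothetical renewer that records which of the expected branches ran. */
  static class RecordingRenewer {
    private final String originalToken;
    private final String updatedToken;
    boolean firstRenewInvoked = false;
    boolean secondRenewInvoked = false;

    RecordingRenewer(String originalToken, String updatedToken) {
      this.originalToken = originalToken;
      this.updatedToken = updatedToken;
    }

    void renewToken(String token) throws IOException {
      if (token.equals(updatedToken)) {
        // The real test would delegate to super.renewToken(dttr) here.
        secondRenewInvoked = true;
      } else if (token.equals(originalToken)) {
        // The original token is expected to be rejected.
        firstRenewInvoked = true;
        throw new InvalidTokenException("Failed to renew");
      } else {
        // Any other token means the test setup itself is wrong.
        throw new IOException("Unexpected token: " + token);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    RecordingRenewer renewer = new RecordingRenewer("T", "T'");
    try {
      renewer.renewToken("T");
    } catch (InvalidTokenException expected) {
      // Rejection of the original token is the expected first step.
    }
    renewer.renewToken("T'");
    if (!renewer.firstRenewInvoked || !renewer.secondRenewInvoked) {
      throw new AssertionError("both renew paths should have been exercised");
    }
    System.out.println("both renew paths exercised");
  }
}
```

Keeping the flags inside the explicit branches means the final assertion checks that both expected renewals happened, while the last `else` fails fast on any token the test did not anticipate.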

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-5098
>                 URL: https://issues.apache.org/jira/browse/YARN-5098
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Yesha Vora
>            Assignee: Jian He
>         Attachments: YARN-5098.1.patch
>
>
> Environment : HA cluster
> YARN application logs for a long-running application could not be gathered because the NodeManager failed to talk to HDFS, with the error below.
> {code}
> 2016-05-16 18:18:28,533 INFO  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(555)) - Application just finished : application_1463170334122_0002
> 2016-05-16 18:18:28,545 WARN  ipc.Client (Client.java:run(705)) - Exception encountered while connecting to the server :
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 171 for hrt_qa) can't be found in cache
>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:583)
>         at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:398)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:752)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:748)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:747)
>         at org.apache.hadoop.ipc.Client$Connection.access$3100(Client.java:398)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1597)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1439)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1386)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:240)
>         at com.sun.proxy.$Proxy83.getServerDefaults(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:282)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy84.getServerDefaults(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1018)
>         at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:156)
>         at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:550)
>         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:687)
> {code}




