hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5322) HDFS delegation token not found in cache errors seen on secure HA clusters
Date Mon, 14 Oct 2013 17:19:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794288#comment-13794288 ]

Jing Zhao commented on HDFS-5322:
---------------------------------

bq. Isn't it wrong for the NN to claim it's active/writable when it's not? It seems like another
state is needed to indicate a transition is in progress - and that state indicates the namespace
isn't writable.

Agreed. I think the current code tries to achieve this through the FSNamesystem read/write lock:
the startActiveService method holds the write lock and blocks other methods. However, since we
remove the FSNamesystem lock in retrievePassword, the original mechanism no longer works for the
delegation token part. We should file a separate jira to track this.
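
To illustrate the point above, here is a minimal sketch of how a write lock held during the transition can hide half-initialized state from lookups, and how a lookup that skips the lock can observe it. This is not the actual FSNamesystem code: the class, the methods (beginTransition, finishTransition, lookupWithLock, lookupWithoutLock, onOtherThread), the Result enum, and the tokenInCache flag are all hypothetical names chosen for the example. It also assumes that the "can't be found in cache" symptom corresponds to a lookup running before token state is ready.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a namesystem guarded by a read/write lock. */
final class TransitionLockSketch {
    enum Result { FOUND, NOT_FOUND, BLOCKED }

    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    // Hypothetical flag: true once token state has been loaded.
    private volatile boolean tokenInCache = false;

    /** Start of the transition: take the write lock so other operations wait. */
    void beginTransition() {
        fsLock.writeLock().lock();
    }

    /** End of the transition: token state is ready, then the lock is released. */
    void finishTransition() {
        tokenInCache = true;
        fsLock.writeLock().unlock();
    }

    /**
     * Lookup that respects the lock. tryLock() stands in for a blocking
     * lock() call so that this demo reports BLOCKED instead of hanging.
     */
    Result lookupWithLock() {
        if (!fsLock.readLock().tryLock()) {
            return Result.BLOCKED;
        }
        try {
            return tokenInCache ? Result.FOUND : Result.NOT_FOUND;
        } finally {
            fsLock.readLock().unlock();
        }
    }

    /** Lookup that skips the lock and can see the in-progress state. */
    Result lookupWithoutLock() {
        return tokenInCache ? Result.FOUND : Result.NOT_FOUND;
    }

    /** Runs a lookup on another thread; the lock owner itself could re-enter. */
    static Result onOtherThread(Supplier<Result> lookup) {
        return CompletableFuture.supplyAsync(lookup).join();
    }

    public static void main(String[] args) {
        TransitionLockSketch ns = new TransitionLockSketch();
        ns.beginTransition();
        System.out.println("during, with lock:    " + onOtherThread(ns::lookupWithLock));
        System.out.println("during, without lock: " + onOtherThread(ns::lookupWithoutLock));
        ns.finishTransition();
        System.out.println("after, with lock:     " + onOtherThread(ns::lookupWithLock));
    }
}
```

In this sketch, the locked lookup is held off while the transition is in progress, whereas the unlocked lookup returns NOT_FOUND for the same state. That is the gap an explicit "transition in progress" state would have to cover.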

> HDFS delegation token not found in cache errors seen on secure HA clusters
> --------------------------------------------------------------------------
>
>                 Key: HDFS-5322
>                 URL: https://issues.apache.org/jira/browse/HDFS-5322
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.1.1-beta
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>             Fix For: 2.2.1
>
>         Attachments: HDFS-5322.000.patch, HDFS-5322.000.patch, HDFS-5322.001.patch, HDFS-5322.002.patch, HDFS-5322.003.patch, HDFS-5322.004.patch, HDFS-5322.005.patch, HDFS-5322.006.patch
>
>
> While running HA tests, we have seen issues where HDFS delegation token not found in cache errors cause running jobs to fail.
> {code}
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> |2013-10-06 20:14:51,193 INFO  [main] mapreduce.Job: Task Id : attempt_1381090351344_0001_m_000007_0, Status : FAILED
> Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 11 for hrt_qa) can't be found in cache
> at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)
