hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
Date Mon, 23 Jun 2014 14:21:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040790#comment-14040790 ]

Daryn Sharp commented on HDFS-6475:
-----------------------------------

bq. Your earlier suggestion indicated that we should use SecretManager#retriableRetrievePassword
instead of SecretManager#retrievePassword, does that mean client code has to be modified?

If I understand the question correctly: both methods are used only server-side, so no client-side
changes should be required and there are no incompatibility concerns.

Did you happen to trace how/where the {{StandbyException}} is wrapped in an {{InvalidToken}}?
It looks like {{DelegationTokenSecretManager#retrievePassword}} is the only place it occurs,
but {{DelegationTokenSecretManager#retriableRetrievePassword}} does not wrap exceptions in
{{InvalidToken}}.
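
To make the distinction concrete, here is a minimal, self-contained sketch of the two behaviours. It is a simplified model and not the Hadoop source: the exception classes are hypothetical stand-ins for the Hadoop types of the same name, and {{StandbySecretManagerSketch}} and its {{checkOperation}} helper are invented for illustration.

```java
import java.io.IOException;

// Stand-ins for the Hadoop exception types; simplified for illustration only.
class StandbyException extends IOException {
    StandbyException(String msg) { super(msg); }
}

class InvalidToken extends IOException {
    InvalidToken(String msg) { super(msg); }
}

// Hypothetical model of a secret manager running on a standby NameNode.
class StandbySecretManagerSketch {
    // Models the state check that rejects the operation on a standby.
    private void checkOperation() throws StandbyException {
        throw new StandbyException("operation not supported in standby state");
    }

    // Wrapping variant: the signature only permits InvalidToken, so the
    // StandbyException is tunneled as the cause of an InvalidToken.
    byte[] retrievePassword(String tokenId) throws InvalidToken {
        try {
            checkOperation();
        } catch (StandbyException se) {
            InvalidToken wrapped = new InvalidToken("StandbyException");
            wrapped.initCause(se);
            throw wrapped;
        }
        return new byte[0]; // placeholder password
    }

    // Non-wrapping variant: the wider signature lets the StandbyException
    // propagate unchanged, so callers can react to it directly.
    byte[] retriableRetrievePassword(String tokenId) throws IOException {
        checkOperation();
        return new byte[0]; // placeholder password
    }
}
```

In this model, a caller of {{retrievePassword}} sees only an {{InvalidToken}} and has to inspect {{getCause()}} to learn that the node is in standby, whereas a caller of {{retriableRetrievePassword}} can catch the {{StandbyException}} directly.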

Could this just be a test case issue?  Which test case is failing?

> WebHdfs clients fail without retry because incorrect handling of StandbyException
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-6475
>                 URL: https://issues.apache.org/jira/browse/HDFS-6475
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, webhdfs
>    Affects Versions: 2.4.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch,
HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch,
HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to an HA HDFS service, the delegation token has previously
been initialized with the active NN.
> When a client issues a request, the NNs it may contact are stored in a map returned by
DFSUtil.getNNServiceRpcAddresses(conf). The client contacts the NNs in that order,
so the first one it reaches is likely to be the StandbyNN. If the StandbyNN doesn't have the updated
client credential, it throws a SecurityException that wraps a StandbyException.
> The client is expected to retry with another NN, but because the SecurityException mentioned
above is not handled properly, the request fails.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
>         at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
>         at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
>         at kclient1.kclient$1.run(kclient.java:64)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:356)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
>         at kclient1.kclient.main(kclient.java:58)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}
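
The retry behaviour that the quoted description expects can be sketched as follows. This is only an illustration under stated assumptions, not the WebHdfsFileSystem code or the attached patch: {{FailoverSketch}}, {{StandbyStub}}, {{Request}}, {{causedByStandby}} and {{runWithFailover}} are all hypothetical names. It also uses an in-process cause chain for simplicity; in the trace above the client receives only a serialized RemoteException whose reported class is java.lang.SecurityException, so a real client would have to work from whatever the server reports.

```java
import java.util.List;

class FailoverSketch {
    // Hypothetical stand-in for Hadoop's StandbyException.
    static class StandbyStub extends Exception {
        StandbyStub(String msg) { super(msg); }
    }

    // Hypothetical request abstraction: send one request to one NameNode.
    interface Request<T> {
        T send(String nnAddress) throws Exception;
    }

    // Returns true if any exception in the cause chain is a StandbyStub,
    // even when it is wrapped (for example inside a SecurityException).
    static boolean causedByStandby(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof StandbyStub) {
                return true;
            }
        }
        return false;
    }

    // Tries each address in order; moves on to the next NameNode only when
    // the failure was caused by the contacted node being in standby.
    static <T> T runWithFailover(List<String> nnAddresses, Request<T> request)
            throws Exception {
        Exception last = null;
        for (String addr : nnAddresses) {
            try {
                return request.send(addr);
            } catch (Exception e) {
                if (!causedByStandby(e)) {
                    throw e; // unrelated failure: do not mask it by retrying
                }
                last = e;
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("no NameNode addresses given");
        }
        throw last; // every NameNode reported standby
    }
}
```

The point of the sketch is that the retry decision looks through the wrapper rather than at the outermost exception type; a check against the outer SecurityException alone would never trigger a failover.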



--
This message was sent by Atlassian JIRA
(v6.2#6252)
