hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6222) Remove background token renewer from webhdfs
Date Wed, 11 Jun 2014 17:41:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028080#comment-14028080 ]

Daryn Sharp commented on HDFS-6222:
-----------------------------------

Good questions.  This design is the result of problems encountered while converting a
mission-critical production system to use webhdfs.  We've been running this change internally
in production for months on 0.23, and on a sandbox 2.x grid.  A few of the issues: The renewer
is hardcoded to assume 24h, which isn't a guarantee by any means.  The filesystem can go dead
for up to a day.  Decreasing the token renewal interval on our QA clusters to 30s to stress
token handling obviously didn't work either...  We've also encountered class loader leaks.
Filesystems would become unusable if the token expired, was erroneously cancelled, or if
renewal failed transiently, such as during a NN restart.

# A secure client is supposed to be able to talk to an insecure server, which is why the earlier
logic had this same behavior.  Regarding malformed responses, NPEs used to be generated, not
null returns.  My earlier work trapped and converted the NPEs, which in this case will trigger
the retry loop.  Unlike the current implementation, the fs will attempt to re-acquire a token
even after one operation fails, which prevents the fs from becoming unusable.
# Yes, very unfortunate, but I only did it for backwards compatibility with NNs, and also
cross-compatibility with DNs that don't munge the token exception.  I checked earlier versions
and it appears to have always been this way.
# True.  We've become very performance conscious, but token renewal is infrequent, if ever
required, for non-daemons, so I consider the tiny latency worth the robustness.
# TokenAspect is still used by hftp or I would have happily removed it...
# I think this was covered by other tests.  I'll double check and add if necessary.  I'm not
sure how to test swebhdfs since it requires extra configuration and ssl certs to function...
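
To make the lazy re-acquisition idea above concrete, here is a minimal sketch in plain Java.
It is not the code from the attached patches.  Every name in it (LazyTokenSketch,
InvalidTokenException, the token fetcher, maxAttempts) is hypothetical, and the exception is
unchecked only to keep the example short.  The cached token is dropped whenever an operation
reports it as invalid, so the next attempt (or the next operation) fetches a fresh one
instead of leaving the client unusable.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative only: none of these names come from Hadoop or the HDFS-6222 patches. */
final class LazyTokenSketch {

    /** Hypothetical stand-in for the server rejecting a token (unchecked for brevity). */
    static final class InvalidTokenException extends RuntimeException {
        InvalidTokenException(String message) {
            super(message);
        }
    }

    private final Supplier<String> tokenFetcher; // assumed way to obtain a new token
    private final int maxAttempts;               // assumed bound on the retry loop
    private String cachedToken;                  // null until a token is actually needed

    LazyTokenSketch(Supplier<String> tokenFetcher, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.tokenFetcher = tokenFetcher;
        this.maxAttempts = maxAttempts;
    }

    /** Runs op with the cached token, fetching one lazily and refetching if it is rejected. */
    synchronized <T> T execute(Function<String, T> op) {
        InvalidTokenException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (cachedToken == null) {
                cachedToken = tokenFetcher.get(); // lazy fetch, no background renewer
            }
            try {
                return op.apply(cachedToken);
            } catch (InvalidTokenException e) {
                cachedToken = null; // discard so the next attempt or call re-acquires
                last = e;
            }
        }
        throw last; // non-null here: the loop ran at least once and did not return
    }
}
```

Because the stale token is cleared even when the retry budget is exhausted, a later call
to execute starts with a fresh fetch rather than reusing a token already known to be bad.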

> Remove background token renewer from webhdfs
> --------------------------------------------
>
>                 Key: HDFS-6222
>                 URL: https://issues.apache.org/jira/browse/HDFS-6222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-6222.branch-2.patch, HDFS-6222.branch-2.patch, HDFS-6222.trunk.patch,
> HDFS-6222.trunk.patch
>
>
> The background token renewer is a source of problems for long-running daemons.  Webhdfs
> should lazy fetch a new token when it receives an InvalidToken exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
