hadoop-mapreduce-issues mailing list archives

From "Daryn Sharp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
Date Fri, 10 Feb 2012 15:58:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205518#comment-13205518 ]

Daryn Sharp commented on MAPREDUCE-3825:

I'm open to alternatives, but eliminating the dups is actually pretty simple:
  static void obtainTokensForNamenodesInternal(Credentials credentials,
       Path[] ps, Configuration conf) throws IOException {
--- start new code ---
    // use 2 passes to avoid redundant calls to the same filesystems
    // start by getting unique set of filesystems for all paths
    Set<FileSystem> pathFsSet = new HashSet<FileSystem>();
    for (Path p : ps) {
    // get the unique set of leaf filesystems
    Set<FileSystem> tokenFsSet = new HashSet<FileSystem>();
    for (FileSystem fs : pathFsSet) {
--- end new code ---
    // get all the tokens from the now flattened list of leaf filesystems
    for (FileSystem fs : tokenFsSet) {
      obtainTokensForNamenodesPrivate(fs, credentials, conf);

If many files are in the same filesystem, then a lot of unnecessary processing occurs, especially
in the case of viewfs.
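
The two-pass idea above can be tried out in isolation with a minimal sketch. Everything here is a hypothetical stand-in: {{Fs}}, {{collectLeaves}} and {{obtainTokens}} are not Hadoop classes or methods, and "obtaining a token" is reduced to recording which leaf would be contacted. The sketch also assumes that the same filesystem is always represented by the same object, so a plain set is enough to drop duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins only; none of these types are Hadoop classes.
class TokenDedupSketch {

  /** Hypothetical filesystem: a leaf has no children, a mount-style fs has several. */
  static final class Fs {
    final String name;
    final List<Fs> children;

    Fs(String name, Fs... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }

    /** Adds every leaf reachable from this filesystem (itself, if it has no children). */
    void collectLeaves(Set<Fs> out) {
      if (children.isEmpty()) {
        out.add(this);
        return;
      }
      for (Fs child : children) {
        child.collectLeaves(out);
      }
    }
  }

  /** Returns the names of the leaves that would be asked for a token, in order. */
  static List<String> obtainTokens(List<Fs> pathFilesystems) {
    // pass 1: unique set of filesystems across all paths
    Set<Fs> pathFsSet = new LinkedHashSet<>(pathFilesystems);
    // pass 2: unique set of leaf filesystems
    Set<Fs> tokenFsSet = new LinkedHashSet<>();
    for (Fs fs : pathFsSet) {
      fs.collectLeaves(tokenFsSet);
    }
    // each leaf is contacted exactly once
    List<String> contacted = new ArrayList<>();
    for (Fs fs : tokenFsSet) {
      contacted.add(fs.name); // stand-in for fetching a token
    }
    return contacted;
  }

  public static void main(String[] args) {
    Fs nn1 = new Fs("hdfs://nn1");
    Fs nn2 = new Fs("hdfs://nn2");
    Fs view = new Fs("viewfs://cluster", nn1, nn2);
    // four paths, but only two leaves are contacted
    System.out.println(obtainTokens(Arrays.asList(view, view, nn1, view)));
    // prints [hdfs://nn1, hdfs://nn2]
  }
}
```

With four paths spread over one viewfs-style mount and one of its own leaves, each of the two leaves is visited once rather than once per path.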

I may be misunderstanding this variation, but acquiring tokens via recursive calls
will require more changes that may break non-Hadoop distributed filesystems.  I think it will
require either duplicating the code of the default {{getDelegationTokens(renewer, creds)}}, or a new
API that overrides of this method can use to avoid getting dups.  The proposed default implementation
of {{FileSystem#getDelegationTokens(renewer, creds)}} simply iterates {{this.getFileSystems()}}
too.  I'll write something up and then we can discuss a little more.
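
To make the concern about the recursive variation concrete, here is a rough sketch of a default that iterates the child filesystems and skips services already present in the credentials. It is an illustration under assumptions, not the proposed patch: {{Fs}} and {{Creds}} are hypothetical stand-ins rather than Hadoop's {{FileSystem}} and {{Credentials}}, the method names only mirror the ones discussed above, and a token is just a string.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; NOT org.apache.hadoop.fs.FileSystem or Credentials.
class RecursiveTokenSketch {

  /** Hypothetical credentials store keyed by token service name. */
  static final class Creds {
    final Map<String, String> tokens = new LinkedHashMap<>();
  }

  /** Hypothetical filesystem base class. */
  static class Fs {
    final String service;
    final List<Fs> mounted;
    int fetches = 0; // how many times this fs actually issued a token

    Fs(String service, Fs... mounted) {
      this.service = service;
      this.mounted = Arrays.asList(mounted);
    }

    /** Child filesystems; a plain filesystem returns only itself. */
    List<Fs> getFileSystems() {
      return mounted.isEmpty() ? Collections.singletonList(this) : mounted;
    }

    /** Default behaviour: iterate the children, skipping services already in creds. */
    void getDelegationTokens(String renewer, Creds creds) {
      for (Fs fs : getFileSystems()) {
        if (fs != this) {
          fs.getDelegationTokens(renewer, creds); // recurse into mounted filesystems
        } else if (!creds.tokens.containsKey(service)) {
          fetches++; // stand-in for the remote call
          creds.tokens.put(service, "token-for-" + renewer);
        }
      }
    }
  }

  public static void main(String[] args) {
    Fs nn1 = new Fs("nn1:8020");
    Fs nn2 = new Fs("nn2:8020");
    Fs view = new Fs("viewfs-cluster", nn1, nn2);
    Creds creds = new Creds();
    view.getDelegationTokens("jt", creds);
    nn1.getDelegationTokens("jt", creds); // already in creds, so no second fetch
    System.out.println(creds.tokens.keySet() + " nn1 fetches=" + nn1.fetches);
  }
}
```

The duplicate check lives inside the default method in this sketch, so a subclass that overrides the method would have to repeat the check or call a shared helper, which is the code duplication or new API mentioned above.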
> Need generalized multi-token filesystem support
> -----------------------------------------------
>                 Key: MAPREDUCE-3825
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: MAPREDUCE-3825.patch
> This is the counterpart to HADOOP-7967.  The token cache currently tries to assume a
> filesystem's token service key.  The assumption generally worked while there was a one-to-one
> mapping of filesystem to token.  With the advent of multi-token filesystems like viewfs,
> the token cache will try to use a service key (i.e., for viewfs) that will never exist (because
> viewfs really gets the tokens of the mounted filesystems).
