hadoop-mapreduce-issues mailing list archives

From "Daryn Sharp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3825) MR should not be getting duplicate tokens for a MR Job.
Date Thu, 16 Feb 2012 23:50:01 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209896#comment-13209896 ]

Daryn Sharp commented on MAPREDUCE-3825:

Upon first read, I think solution 4 seems reasonable.

I like the distinction of fs children.  I used a recursive implementation in part
because of its simplicity.  One option for increased flexibility might be to have both {{getChildFileSystems}}
and {{getFileSystems}}.  {{getFileSystems}} would be implemented via {{getChildFileSystems}},
but offers more flexibility -- for example, when viewfs needs to spray an operation to its
mounts, like {{setVerifyChecksum}}, it would loop over {{getFileSystems}} instead of having
to do the same leaf collection and de-duplication that {{addDelegationTokens}} has to do.
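
As a rough illustration of the idea above, here is a minimal Java sketch in which {{getFileSystems}} is built on {{getChildFileSystems}} and returns the unique leaf filesystems. The {{SketchFs}} class, its fields, and the identity-based de-duplication are assumptions made for this example only; this is not Hadoop's {{FileSystem}} class or the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a filesystem; NOT Hadoop's FileSystem class.
class SketchFs {
    final String name;
    final List<SketchFs> children;
    boolean verifyChecksum = true;

    SketchFs(String name, SketchFs... kids) {
        this.name = name;
        this.children = new ArrayList<>(Arrays.asList(kids));
    }

    // Direct children only, e.g. the mounts of a viewfs-like filesystem.
    List<SketchFs> getChildFileSystems() {
        return children;
    }

    // Unique leaf filesystems, implemented on top of getChildFileSystems().
    // A filesystem without children counts as its own leaf.
    List<SketchFs> getFileSystems() {
        Set<SketchFs> leaves = new LinkedHashSet<>(); // de-duplicates by object identity
        collectLeaves(this, leaves);
        return new ArrayList<>(leaves);
    }

    private static void collectLeaves(SketchFs fs, Set<SketchFs> leaves) {
        if (fs.getChildFileSystems().isEmpty()) {
            leaves.add(fs);
            return;
        }
        for (SketchFs child : fs.getChildFileSystems()) {
            collectLeaves(child, leaves);
        }
    }

    // "Spray" an operation to every leaf without repeating the leaf collection.
    void setVerifyChecksum(boolean verify) {
        for (SketchFs leaf : getFileSystems()) {
            leaf.verifyChecksum = verify;
        }
    }

    public static void main(String[] args) {
        SketchFs hdfs = new SketchFs("hdfs");
        // The same hdfs instance is reachable through two mount paths.
        SketchFs view = new SketchFs("viewfs", hdfs, new SketchFs("nested-view", hdfs));
        view.setVerifyChecksum(false);
        System.out.println(view.getFileSystems().size() + " leaf, verifyChecksum=" + hdfs.verifyChecksum);
    }
}
```

With this shape, any operation that must reach every underlying filesystem can loop over {{getFileSystems}} and get the de-duplication for free.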

I would suggest that we retain {{getDelegationTokens(renewer, creds)}}.  The only difference
between it and {{addDelegationTokens(renewer, creds)}} would be that it returns a list of the new tokens
it had to acquire.  In a void context, add & get would be identical in behavior.  That
gives the caller the opportunity to process the new tokens in some way.  One simple example
would be logging the new tokens.
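
To make the add/get distinction concrete, here is a small Java sketch under stated assumptions: {{SketchCredentials}}, {{SketchTokenFs}}, the string-valued tokens, and keying by service name are all hypothetical and are not Hadoop's {{Credentials}} or {{Token}} types. It only shows that both methods have the same side effect and that the get variant additionally reports what was newly acquired.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical credential store keyed by token service name; NOT Hadoop's Credentials class.
class SketchCredentials {
    final Map<String, String> tokens = new LinkedHashMap<>();
}

// Hypothetical filesystem that needs one token per underlying service.
class SketchTokenFs {
    final List<String> services;

    SketchTokenFs(String... services) {
        this.services = new ArrayList<>(Arrays.asList(services));
    }

    // Acquires tokens only for services missing from creds and returns the new ones.
    List<String> getDelegationTokens(String renewer, SketchCredentials creds) {
        List<String> added = new ArrayList<>();
        for (String service : services) {
            if (!creds.tokens.containsKey(service)) {
                // Placeholder for a real token request.
                String token = "token(" + service + ", renewer=" + renewer + ")";
                creds.tokens.put(service, token);
                added.add(token);
            }
        }
        return added;
    }

    // Same side effect on creds; the list of new tokens is simply discarded.
    void addDelegationTokens(String renewer, SketchCredentials creds) {
        getDelegationTokens(renewer, creds);
    }

    public static void main(String[] args) {
        SketchCredentials creds = new SketchCredentials();
        SketchTokenFs fs = new SketchTokenFs("hdfs://nn1", "hdfs://nn2");
        // The caller can process the new tokens, for example by logging them.
        for (String token : fs.getDelegationTokens("mr", creds)) {
            System.out.println("acquired " + token);
        }
    }
}
```

A second call with the same credentials returns an empty list, which is what keeps duplicate tokens out of the job.
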
> MR should not be getting duplicate tokens for a MR Job.
> -------------------------------------------------------
>                 Key: MAPREDUCE-3825
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: MAPREDUCE-3825.patch, TokenCache.pdf
> This is the counterpart to HADOOP-7967.  
> MR gets tokens for all input, output and the default filesystem when a MR job is submitted.
> The APIs in FileSystem make it challenging to avoid duplicate tokens when there are file
> systems that have embedded filesystems.
> Here is the original description that Daryn wrote: 
> The token cache currently tries to assume a filesystem's token service key.  The assumption
> generally worked while there was a one-to-one mapping of filesystem to token.  With the advent
> of multi-token filesystems like viewfs, the token cache will try to use a service key (i.e.
> for viewfs) that will never exist (because it really gets the mounted fs tokens).
> The descriop


