hadoop-mapreduce-issues mailing list archives

From "Sanjay Radia (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3825) MR should not be getting duplicate tokens for a MR Job.
Date Thu, 16 Feb 2012 23:18:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209875#comment-13209875 ]

Sanjay Radia commented on MAPREDUCE-3825:

Solution 4
* Has some elements from solutions 1, 2 and 3
* Distinguishes between leaf FSs and FSs that embed other FSs
** getCanonicalServiceName() and getDelegationToken() both return null for FSs that embed other FSs (e.g. ViewFileSystem) - I don't like this, but it seems unavoidable.
* FileSystem#getChildrenFSs() - can be used to get the embedded FSs - note that it only returns the first level of children - this is more general and can have other uses down the road.
* addDelegationTokens(renewer, credentials) has a default impl that will work for all embedding FSs (viewfs does not need to override it)
** Perhaps this should be in FileUtil rather than FileSystem.

URI[] FileSystem#getChildrenFSs() { // returns the first level of children
  return null; // default - a leaf file system returns null
}
URI[] ViewFileSystem#getChildrenFSs() {
  return the mount points, but don't recurse through them.
}

String FileSystem#getCanonicalServiceName() - no change except ViewFileSystem returns null

Token getDelegationToken() - no change except ViewFileSystem returns null

// Credentials is a map<serviceName, Token>
void FileSystem#addDelegationTokens(renewer, credentials) {
  // This is a new method - the old getDTs() is not needed.
  // Provide a default impl here - viewfs does not override it.
  - Walk the tree using getChildrenFSs() and collect all the leaves
      - if an FS returns null then you know it is a leaf
  - eliminate dups
  - add missing tokens
}
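
The walk described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the patch itself: the Fs class is a hypothetical stand-in for FileSystem, children are modelled as Fs objects rather than URIs, and credentials is a plain Map from service name to a token string instead of the real Credentials class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TokenWalkSketch {

    /** Hypothetical stand-in for a file system; not Hadoop's FileSystem. */
    static class Fs {
        final String serviceName; // null for an FS that embeds other FSs
        final List<Fs> children;  // null for a leaf FS, as in the proposal
        int tokensIssued = 0;     // counts token requests, for illustration

        private Fs(String serviceName, List<Fs> children) {
            this.serviceName = serviceName;
            this.children = children;
        }

        static Fs leaf(String serviceName) { return new Fs(serviceName, null); }
        static Fs embedding(List<Fs> children) { return new Fs(null, children); }

        /** First level of children only; null means "I am a leaf". */
        List<Fs> getChildrenFSs() { return children; }

        String getCanonicalServiceName() { return serviceName; }

        /** An embedding FS has no token of its own and returns null. */
        String getDelegationToken(String renewer) {
            if (children != null) {
                return null;
            }
            tokensIssued++;
            return "token(" + serviceName + ", renewer=" + renewer + ")";
        }
    }

    /** Walk the tree via getChildrenFSs() and collect every leaf. */
    static void collectLeaves(Fs fs, List<Fs> leaves) {
        List<Fs> children = fs.getChildrenFSs();
        if (children == null) {
            leaves.add(fs);
            return;
        }
        for (Fs child : children) {
            collectLeaves(child, leaves);
        }
    }

    /** Add a token for each leaf service not already present in credentials. */
    static void addDelegationTokens(Fs fs, String renewer,
                                    Map<String, String> credentials) {
        List<Fs> leaves = new ArrayList<>();
        collectLeaves(fs, leaves);
        for (Fs leaf : leaves) {
            String service = leaf.getCanonicalServiceName();
            // Checking before asking for a token is what eliminates dups.
            if (!credentials.containsKey(service)) {
                credentials.put(service, leaf.getDelegationToken(renewer));
            }
        }
    }

    public static void main(String[] args) {
        Fs nn1 = Fs.leaf("nn1:8020");
        Fs nn2 = Fs.leaf("nn2:8020");
        // nn1 is mounted twice to show that the duplicate is skipped.
        Fs viewfs = Fs.embedding(Arrays.asList(nn1, nn2, nn1));

        Map<String, String> credentials = new LinkedHashMap<>();
        addDelegationTokens(viewfs, "jobtracker", credentials);
        addDelegationTokens(nn1, "jobtracker", credentials); // already covered
        System.out.println(credentials.keySet()); // prints [nn1:8020, nn2:8020]
    }
}
```

Because the credentials map is checked before a token is requested, calling the method for several paths that resolve to the same leaf FS asks that FS for a token only once.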

// A useful utility - so that the TokenCache in MR can be easily implemented
FileUtil:GetTokens(renewer, path[] ps, credentials) {
  foreach (p in ps) {
    GetFileSystem(p).addDelegationTokens(renewer, credentials);
  }
}

> MR should not be getting duplicate tokens for a MR Job.
> -------------------------------------------------------
>                 Key: MAPREDUCE-3825
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: MAPREDUCE-3825.patch, TokenCache.pdf
> This is the counterpart to HADOOP-7967.  
> MR gets tokens for all input, output and the default filesystem when a MR job is submitted.

> The APIs in FileSystem make it challenging to avoid duplicate tokens when there are file systems that have embedded filesystems.
> Here is the original description that Daryn wrote: 
> The token cache currently tries to assume a filesystem's token service key.  The assumption generally worked while there was a one to one mapping of filesystem to token.  With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens).
> The descriop


