hadoop-mapreduce-issues mailing list archives

From "Daryn Sharp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3825) MR should not be getting duplicate tokens for a MR Job.
Date Mon, 13 Feb 2012 21:40:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207237#comment-13207237 ]

Daryn Sharp commented on MAPREDUCE-3825:

Per an offline request by Sanjay, here's a summary of the proposed changes that henceforth
shall be referred to as "solution 3".

Required {{FileSystem}} APIs:
* {{getFileSystems()}} - proposed new api
** returns the leaf filesystems; the default implementation returns just this filesystem
* {{getCanonicalServiceName()}} - existing api, no change; should consider reducing visibility
** returns the service of this filesystem's token, or null if no intrinsic token
* {{getDelegationToken(renewer)}} - existing api, no change; should consider reducing visibility
** returns this filesystem's token, {{token.getService()}} must match {{getCanonicalServiceName()}}
* {{getDelegationTokens(renewer, credentials)}} - existing api, new to 23; should be the public api to acquire tokens
** returns tokens not already acquired for this filesystem
** propose adding new tokens to supplied creds
* {{getDelegationTokens(renewer)}} - existing api, new to 23; proposed convenience method
** returns all tokens for the filesystem
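
The contract of the apis above can be sketched with toy stand-ins (none of these class or method names are the real Hadoop types; they only model the leaf-filesystem / service-key / credentials-cache interplay):

```java
import java.util.*;

// Hypothetical stand-in for a FileSystem with optional mounted children.
class ToyFs {
    final String service;        // null = no intrinsic token (e.g. viewfs)
    final List<ToyFs> children;  // mounted filesystems, if any

    ToyFs(String service, ToyFs... children) {
        this.service = service;
        this.children = Arrays.asList(children);
    }

    // models getFileSystems(): leaf filesystems, this fs by default
    List<ToyFs> getFileSystems() {
        if (children.isEmpty()) return Collections.singletonList(this);
        List<ToyFs> leafs = new ArrayList<>();
        for (ToyFs child : children) leafs.addAll(child.getFileSystems());
        return leafs;
    }

    // models getDelegationTokens(renewer, credentials): fetch only missing tokens
    List<String> getDelegationTokens(String renewer, Map<String, String> creds) {
        List<String> newTokens = new ArrayList<>();
        for (ToyFs fs : new HashSet<>(getFileSystems())) {
            if (fs.service != null && !creds.containsKey(fs.service)) {
                String token = "token-for-" + fs.service;
                creds.put(fs.service, token);  // cache by service key
                newTokens.add(token);
            }
        }
        return newTokens;
    }
}

public class Main {
    public static void main(String[] args) {
        // viewfs-like fs with two mount points backed by the SAME namenode
        ToyFs nn1 = new ToyFs("nn1:8020");
        ToyFs viewfs = new ToyFs(null, nn1, nn1, new ToyFs("nn2:8020"));

        Map<String, String> creds = new HashMap<>();
        List<String> tokens = viewfs.getDelegationTokens("renewer", creds);
        // one token per distinct service, despite the duplicate mount
        System.out.println(tokens.size());   // 2
        // a second call acquires nothing: creds already holds both services
        System.out.println(viewfs.getDelegationTokens("renewer", creds).size()); // 0
    }
}
```

The point of the sketch is that dedup falls out of two things: the leaf expansion (so embedded filesystems are walked, not the wrapper) and the credentials lookup by service key (so a service already holding a token is skipped).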

Changes to:
# {{FilterFileSystem}}
#* Add:{code}
  public String getCanonicalServiceName() {
    return null;
  }
  public List<FileSystem> getFileSystems() {
    return fs.getFileSystems();
  }
{code}
# {{DistributedFileSystem}}
#* Delete {{getDelegationTokens(renewer)}}
# {{ViewFileSystem}}
#* Delete {{getDelegationTokens(renewer)}} and {{getDelegationTokens(renewer, creds)}}
#* Add:{code}
  public String getCanonicalServiceName() {
    return null;
  }
  public List<FileSystem> getFileSystems() {
    List<InodeTree.MountPoint<FileSystem>> mountPoints = fsState.getMountPoints();
    Set<FileSystem> fsSet = new HashSet<FileSystem>();
    for (InodeTree.MountPoint<FileSystem> mountPoint : mountPoints) {
      FileSystem targetFs = mountPoint.target.targetFileSystem;
      // collect the leaf filesystems of each mount target
      fsSet.addAll(targetFs.getFileSystems());
    }
    return new ArrayList<FileSystem>(fsSet);
  }
{code}
# {{FileSystem}}
#* Add:{code}
  public List<FileSystem> getFileSystems() {
    List<FileSystem> list = new ArrayList<FileSystem>(1);
    list.add(this);  // by default a filesystem is its own single leaf
    return list;
  }
{code}
#* Change:{code}
  public final List<Token<?>> getDelegationTokens(String renewer,
      Credentials credentials) throws IOException {
    List<Token<?>> newTokens = new ArrayList<Token<?>>();
    // there shouldn't be dups, but use a set just to be safe
    Set<FileSystem> fsLeafs = new HashSet<FileSystem>(getFileSystems());
    for (FileSystem fs : fsLeafs) {
      String serviceString = fs.getCanonicalServiceName();
      if (serviceString != null) { // null service = no tokens
        Text service = new Text(serviceString);
        Token<?> token = credentials.getToken(service);
        if (token == null) { // we don't have the token, so get it
          token = fs.getDelegationToken(renewer);
          if (token != null) { // add to the return list and to the creds
            newTokens.add(token);
            credentials.addToken(service, token);
          }
        }
      }
    }
    return newTokens;
  }

  // just a convenience method, it's not strictly required
  public final List<Token<?>> getDelegationTokens(String renewer) throws IOException {
    return getDelegationTokens(renewer, new Credentials());
  }
{code}
# {{TokenCache}}
#* Change: (note this is a big simplification){code}
  static void obtainTokensForNamenodesInternal(FileSystem fs,
      Credentials credentials, Configuration conf) throws IOException {
    String delegTokenRenewer = Master.getMasterPrincipal(conf);
    if (delegTokenRenewer == null || delegTokenRenewer.length() == 0) {
      throw new IOException(
          "Can't get Master Kerberos principal for use as renewer");
    }
    mergeBinaryTokens(credentials, conf);
    List<Token<?>> tokens = fs.getDelegationTokens(delegTokenRenewer, credentials);
    if (tokens != null) {
      for (Token<?> token : tokens) {
        LOG.info("Got dt for " + fs.getUri() + "; " + token);
      }
    }
  }
{code}

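The caller side of the {{TokenCache}} change above can be sketched without Hadoop at all: the key property is that one shared credentials cache is threaded through every filesystem the job touches, so a service seen twice (e.g. the output dir living on the default fs) yields a single token. All names here ({{obtainTokens}}, the service strings) are illustrative stand-ins, not the real Hadoop api:

```java
import java.util.*;

// Hypothetical sketch of the caller side: a shared credentials map is
// passed through every filesystem touched at job submission.
public class Main {
    // stand-in for fs.getDelegationTokens(renewer, credentials)
    static List<String> obtainTokens(String service, Map<String, String> creds) {
        if (service == null || creds.containsKey(service)) {
            return Collections.emptyList(); // null service or already cached
        }
        String token = "token-for-" + service;
        creds.put(service, token);
        return Collections.singletonList(token);
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        // default fs, input fs, output fs -- output happens to be the default fs
        String[] services = { "nn-default:8020", "nn-input:8020", "nn-default:8020" };
        int fetched = 0;
        for (String s : services) {
            fetched += obtainTokens(s, creds).size();
        }
        System.out.println(fetched + " tokens for " + creds.size() + " services");
        // prints "2 tokens for 2 services" -- no duplicate for the default fs
    }
}
```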
> MR should not be getting duplicate tokens for a MR Job.
> -------------------------------------------------------
>                 Key: MAPREDUCE-3825
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: MAPREDUCE-3825.patch, TokenCache.pdf
> This is the counterpart to HADOOP-7967.  
> MR gets tokens for all input, output and the default filesystem when a MR job is submitted.

> The APIs in FileSystem make it challenging to avoid duplicate tokens when there are filesystems that have embedded filesystems.
> Here is the original description that Daryn wrote: 
> The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one-to-one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens).
