hadoop-mapreduce-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5663) Add an interface to Input/Ouput Formats to obtain delegation tokens
Date Tue, 14 Jan 2014 05:07:58 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870396#comment-13870396
] 

Siddharth Seth commented on MAPREDUCE-5663:
-------------------------------------------

That's two sets of tokens being obtained: one for the working directory, and one for any
additional HDFS servers which the user may have configured.
In addition to these, tokens may be obtained by the Input/OutputFormats.

From FileInputFormat:
{code}
Path[] dirs = getInputPaths(job);
if (dirs.length == 0) {
  throw new IOException("No input paths specified in job");
}

// get tokens for all the required FileSystems..
TokenCache.obtainTokensForNamenodes(job.getCredentials(), dirs,
                                    job.getConfiguration());
{code}
getInputPaths reads the property "mapreduce.input.fileinputformat.inputdir", which is specific
to FileInputFormat (FIF). If the input paths reside on a different Namenode than the one hosting
the staging directory, I don't think users need to set MRJobConfig.JOB_NAMENODES. The tokens
would just be picked up as part of client-side split generation.
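
To illustrate why the extra setting should not be needed, here is a minimal sketch of the idea: the set of file systems that need tokens can be derived from the input paths themselves. This is a simplified stand-alone model, not Hadoop's implementation. The class name InputPathFileSystems, the method distinctFileSystems, and the example host names are hypothetical, and the comma splitting ignores the escaping that real input path handling performs.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-alone model; it does not call any Hadoop API.
public class InputPathFileSystems {

    /**
     * Returns the distinct "scheme://authority" prefixes found among the
     * comma-separated input paths. A path without a scheme or authority is
     * assumed to live on defaultFs (a simplification for this sketch).
     */
    static Set<String> distinctFileSystems(String inputDirs, String defaultFs) {
        Set<String> result = new LinkedHashSet<>();
        for (String dir : inputDirs.split(",")) {
            String trimmed = dir.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            URI uri = URI.create(trimmed);
            if (uri.getScheme() == null || uri.getAuthority() == null) {
                // Unqualified path: falls back to the default file system.
                result.add(defaultFs);
            } else {
                result.add(uri.getScheme() + "://" + uri.getAuthority());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example value for "mapreduce.input.fileinputformat.inputdir".
        String inputDirs = "hdfs://nn1.example.com:8020/data/a,"
                         + "hdfs://nn2.example.com:8020/data/b,"
                         + "/user/alice/c";
        // Each distinct entry is a file system a token would be requested from.
        System.out.println(distinctFileSystems(inputDirs, "hdfs://nn1.example.com:8020"));
    }
}
```

In this model the second Namenode shows up purely because an input path points at it, which mirrors the point above: the tokens follow from the input paths during split generation, without a separate list of Namenodes.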

In terms of Oozie, from what I understand, the JobSubmitter does not get invoked on a box
with Kerberos credentials - not for the main job anyway (maybe for the launcher) - so this
code to obtain tokens doesn't kick in. If that's the case, my guess is that Oozie has additional
configuration, and explicitly goes out and fetches tokens before submitting the launcher.

> Add an interface to Input/Ouput Formats to obtain delegation tokens
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5663
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5663
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Michael Weng
>         Attachments: MAPREDUCE-5663.4.txt, MAPREDUCE-5663.5.txt, MAPREDUCE-5663.6.txt,
>                      MAPREDUCE-5663.patch.txt, MAPREDUCE-5663.patch.txt2, MAPREDUCE-5663.patch.txt3
>
>
> Currently, delegation tokens are obtained as part of the getSplits / checkOutputSpecs
> calls to the InputFormat / OutputFormat respectively.
> This works as long as the splits are generated on a node with Kerberos credentials. For
> split generation elsewhere (AM for example), an explicit interface is required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
