hadoop-mapreduce-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5663) Add an interface to Input/Ouput Formats to obtain delegation tokens
Date Tue, 14 Jan 2014 18:38:05 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870997#comment-13870997 ]

Siddharth Seth commented on MAPREDUCE-5663:

bq. DelegationTokens should be always requested by the client, security enabled or not, computing
the splits on the client or not.
I agree that the client needs to request the required tokens, directly or indirectly. Whether
this should be done independent of security is something I'm not too sure about, mainly from the
perspective of services not handling getToken requests correctly if security is disabled. The
JobClient currently doesn't do this, at least for HDFS.

bq. DelegationTokens fetching should be done regardless of the IF/OF implementation (take
the case of talking with Hbase or HCatalog, job working dir service).
The intent of adding this interface is to be able to fetch tokens irrespective of the IF/OF,
assuming the IF/OF implements the interface. For HBase / HCatalog sources which are outside
of the IF/OF for a MR job, I don't think we have the capability for fetching tokens, and
we rely on the user providing them up front. That seems like a reasonable approach for now. Alternatively,
we could add a config specifying a list of classes which implement this interface and can
be invoked by the client code.
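
A minimal sketch of the config-driven alternative, for illustration only. Everything here is
hypothetical and not from any patch on this issue: the TokenProvider interface, the
PROVIDERS_KEY config key, and the FakeHBaseProvider class are made-up names, and plain Maps
stand in for Hadoop's Configuration and Credentials so the example runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfiguredTokenProviders {

    /** Hypothetical interface; a stand-in for whatever the final patch defines. */
    public interface TokenProvider {
        void obtainTokens(Map<String, String> conf, Map<String, String> credentials);
    }

    /** Hypothetical config key holding a comma-separated list of provider class names. */
    public static final String PROVIDERS_KEY = "example.job.token.provider.classes";

    /** Fake provider for a source that lives outside the IF/OF (e.g. an HBase lookup). */
    public static class FakeHBaseProvider implements TokenProvider {
        @Override
        public void obtainTokens(Map<String, String> conf, Map<String, String> credentials) {
            // A real provider would contact the service; this one records a placeholder.
            credentials.put("hbase", "token-for-hbase");
        }
    }

    /** Client-side step: instantiate each configured class and let it add its tokens. */
    public static Map<String, String> fetchAll(Map<String, String> conf)
            throws ReflectiveOperationException {
        Map<String, String> credentials = new LinkedHashMap<>();
        String value = conf.getOrDefault(PROVIDERS_KEY, "");
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // nothing configured, or a stray comma
            }
            Class<? extends TokenProvider> cls =
                    Class.forName(trimmed).asSubclass(TokenProvider.class);
            TokenProvider provider = cls.getDeclaredConstructor().newInstance();
            provider.obtainTokens(conf, credentials);
        }
        return credentials;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(PROVIDERS_KEY, FakeHBaseProvider.class.getName());
        System.out.println(fetchAll(conf).keySet());
    }
}
```

With this shape the client never needs to know which systems a job talks to; it only walks the
configured list. Whether a class list in the config is preferable to users supplying tokens up
front is still the open question above.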

bq. DelegationTokens fetching should not be tied to split computation.
Completely agree with this. I don't think we can do this though - without making an incompatible
change. We could explicitly fetch Credentials (if the interface is implemented), but at least
some existing IF/OFs will continue to rely on getSplits / checkOutputSpecs for tokens.
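
A rough sketch of what that compatible path could look like, under stated assumptions:
SimpleInputFormat and TokenSource are hypothetical stand-ins (not Hadoop's InputFormat or any
interface from the attached patches), and a Map stands in for Credentials. The client fetches
explicitly when the optional interface is present, and otherwise keeps relying on the
getSplits side effect.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExplicitTokenFetch {

    /** Hypothetical, minimal stand-in for an InputFormat. */
    public interface SimpleInputFormat {
        List<String> getSplits(Map<String, String> credentials);
    }

    /** Hypothetical optional interface an IF/OF may implement. */
    public interface TokenSource {
        void obtainTokens(Map<String, String> credentials);
    }

    /** Implements the new interface; split computation no longer touches credentials. */
    public static class NewStyleFormat implements SimpleInputFormat, TokenSource {
        @Override
        public void obtainTokens(Map<String, String> credentials) {
            credentials.put("hdfs", "token-from-interface");
        }

        @Override
        public List<String> getSplits(Map<String, String> credentials) {
            return Collections.singletonList("split-0");
        }
    }

    /** Existing behaviour: tokens are picked up as a side effect of getSplits. */
    public static class LegacyFormat implements SimpleInputFormat {
        @Override
        public List<String> getSplits(Map<String, String> credentials) {
            credentials.put("hdfs", "token-from-getSplits");
            return Collections.singletonList("split-0");
        }
    }

    /**
     * Returns true if tokens came from the explicit interface, false if the
     * client had to fall back to calling getSplits to trigger token fetching.
     */
    public static boolean prepareOnClient(SimpleInputFormat format,
                                          Map<String, String> credentials) {
        if (format instanceof TokenSource) {
            ((TokenSource) format).obtainTokens(credentials);
            return true; // splits can now be computed elsewhere, e.g. in the AM
        }
        format.getSplits(credentials); // legacy path: splits stay on the client
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> credentials = new HashMap<>();
        System.out.println(prepareOnClient(new NewStyleFormat(), credentials));
        System.out.println(prepareOnClient(new LegacyFormat(), new HashMap<String, String>()));
    }
}
```

The fallback branch is the compatibility cost: formats that don't implement the interface still
force split computation (or an equivalent call) onto a node with kerberos credentials.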

bq. We could have a utility class that we pass a UGI, list of service URIs and returns a populated
Credentials with tokens for all the specified services. The IF/OF/Job would have to be able
to extract the required URIs for the job.
Would this utility class know how to handle all kinds of URIs? I think it's better to leave
the implementation of the Credentials Fetching code to the specific system (MR / HBase / HCatalog).
Configure a list of CredentialProviders - which know how to fetch Credentials for the specific

> Add an interface to Input/Ouput Formats to obtain delegation tokens
> -------------------------------------------------------------------
>                 Key: MAPREDUCE-5663
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5663
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Michael Weng
>         Attachments: MAPREDUCE-5663.4.txt, MAPREDUCE-5663.5.txt, MAPREDUCE-5663.6.txt,
>                      MAPREDUCE-5663.patch.txt, MAPREDUCE-5663.patch.txt2, MAPREDUCE-5663.patch.txt3
>
> Currently, delegation tokens are obtained as part of the getSplits / checkOutputSpecs
> calls to the InputFormat / OutputFormat respectively.
> This works as long as the splits are generated on a node with kerberos credentials. For
> split generation elsewhere (AM for example), an explicit interface is required.

