hadoop-mapreduce-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4148) MapReduce should not have a compile-time dependency on HDFS
Date Wed, 25 Apr 2012 21:36:18 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262139#comment-13262139
] 

Daryn Sharp commented on MAPREDUCE-4148:
----------------------------------------

Very nice!  I've been wanting to do something similar to allow tokens to decode their identifiers
for quite a while now.  Thoughts/suggestions:

{{Token#decodeIdentifier()}} would be _really_ useful and prevent callers from having to know
about the factory.  This would very nearly abstract away the details.  It could either delegate
to the factory as-is, or just query the factory for the class, or maybe even {{Token}} can
host the kind/class registration method.  There are a number of other places in the code,
e.g. {{AbstractDelegationTokenSecretManager#renewToken(Token)}}, that could benefit from such
a method.
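
To make the idea concrete, here is a minimal sketch of the last variant, where the token class itself hosts the kind/class registration.  {{SketchToken}}, {{SketchIdentifier}}, {{OwnerIdentifier}} and {{registerKind}} are hypothetical, simplified stand-ins and not the classes or methods in the patch:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an identifier that can deserialize itself.
abstract class SketchIdentifier {
    abstract void readFields(DataInputStream in) throws IOException;
}

// Hypothetical concrete identifier that stores only an owner name.
class OwnerIdentifier extends SketchIdentifier {
    String owner;

    @Override
    void readFields(DataInputStream in) throws IOException {
        owner = in.readUTF();
    }
}

// Hypothetical stand-in for a token that hosts the kind -> class registry.
final class SketchToken {
    private static final Map<String, Class<? extends SketchIdentifier>> KINDS =
            new ConcurrentHashMap<>();

    static void registerKind(String kind, Class<? extends SketchIdentifier> cls) {
        KINDS.put(kind, cls);
    }

    private final String kind;
    private final byte[] identifier;

    SketchToken(String kind, byte[] identifier) {
        this.kind = kind;
        this.identifier = identifier.clone();
    }

    // Returns the decoded identifier, or null when no class is registered
    // for this token's kind. Callers never see the registry.
    SketchIdentifier decodeIdentifier() throws IOException {
        Class<? extends SketchIdentifier> cls = KINDS.get(kind);
        if (cls == null) {
            return null;
        }
        SketchIdentifier id;
        try {
            id = cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IOException("Cannot instantiate " + cls.getName(), e);
        }
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(identifier))) {
            id.readFields(in);
        }
        return id;
    }
}
```

With something along these lines, a caller only needs the token itself, and an unknown kind simply yields null.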

I'm not too fond of {{AbstractDelegationTokenIdentifier#stringifyToken(Token)}}.  It creates
a circular relationship.  A {{TokenIdentifier}} really shouldn't have to know about a {{Token}}
wrapper.  How about removing the inversion of knowledge and updating {{Token#toString()}} to use
{{token.decodeIdentifier()}}?  If the value is null because the class isn't available, it
can print the raw bytes like it does now.
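
Roughly what I have in mind for the {{toString()}} fallback, as a sketch only.  {{PrintableToken}} is a hypothetical class, the {{decoder}} function stands in for {{decodeIdentifier()}}, and the output format is made up for illustration:

```java
import java.util.function.Function;

// Hypothetical token whose toString() prefers the decoded identifier and
// falls back to raw bytes. The token knows about the identifier, but the
// identifier never has to know about the token.
final class PrintableToken {
    private final String kind;
    private final byte[] identifier;
    // Stand-in for decodeIdentifier(): returns null when the identifier
    // class is not available.
    private final Function<byte[], Object> decoder;

    PrintableToken(String kind, byte[] identifier, Function<byte[], Object> decoder) {
        this.kind = kind;
        this.identifier = identifier.clone();
        this.decoder = decoder;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("Kind: ").append(kind).append(", Ident: ");
        Object decoded = decoder.apply(identifier);
        if (decoded != null) {
            // Identifier class is available: print its own string form.
            return sb.append('(').append(decoded).append(')').toString();
        }
        // Identifier class is unavailable: print the raw bytes as hex.
        for (int i = 0; i < identifier.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", identifier[i] & 0xff));
        }
        return sb.toString();
    }
}
```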

{{TokenIdentifierFactory.createIdentifier}} is using {{ReflectionUtils.newInstance(Configuration)}}
whose main purpose is to invoke {{setConf(conf)}} and/or MR's {{configure(conf)}} which isn't
applicable in this case.  How about directly using {{class.newInstance()}}?
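
A configuration-free instantiation could be as small as the sketch below.  {{IdentifierFactorySketch}} and {{PlainIdentifier}} are hypothetical names.  On Java 9 and later {{Class#newInstance()}} is deprecated, so the sketch goes through the no-arg constructor instead, which has the same effect here:

```java
// Hypothetical helper: instantiate a class through its no-arg constructor,
// with no Configuration involved.
final class IdentifierFactorySketch {
    private IdentifierFactorySketch() {
    }

    static <T> T create(Class<T> cls) {
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "No usable no-arg constructor on " + cls.getName(), e);
        }
    }
}

// Hypothetical identifier with nothing to configure.
class PlainIdentifier {
    String owner = "unset";
}
```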

I think there are pitfalls with using static class inits for the factory registration.  The
identifier class has to be loaded (sorry if I'm stating the obvious: not just imported, but
something referenced from it) before a token can be decoded.  In general this probably means
a token has to be created before other tokens can be decoded.  Perhaps the static blocks should
become a static class method that's invoked by the secret manager, since we know the secret
manager is instantiated before token manipulation.  Although, that limits token ident decoding
only to tokens owned by the daemon, which would leave a client out of luck.  Maybe you could
get fancy with a class loader.
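
The loading pitfall can be reproduced in isolation.  In this sketch {{KindRegistry}}, {{LazyIdentifier}} and {{StaticInitDemo}} are hypothetical names.  A class literal alone does not run the static block, whereas {{Class.forName(String)}} initializes the class:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry that identifier classes register themselves with.
final class KindRegistry {
    static final Map<String, Class<?>> KINDS = new ConcurrentHashMap<>();

    private KindRegistry() {
    }
}

// Hypothetical identifier that registers itself from a static initializer.
final class LazyIdentifier {
    static {
        KindRegistry.KINDS.put("LAZY", LazyIdentifier.class);
    }
}

final class StaticInitDemo {
    private StaticInitDemo() {
    }

    // Returns {registered before initialization, registered after}.
    // Meaningful on the first call only, since a class is initialized once.
    static boolean[] run() {
        // A class literal may load the class but does not initialize it,
        // so the static block has not run yet.
        Class<?> literal = LazyIdentifier.class;
        boolean before = KindRegistry.KINDS.containsKey("LAZY");
        try {
            // Class.forName(String) initializes the class, which finally
            // triggers the registration.
            Class.forName(literal.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        boolean after = KindRegistry.KINDS.containsKey("LAZY");
        return new boolean[] {before, after};
    }
}
```

An explicit registration call made from a known startup point avoids depending on this ordering, at the cost of the client-side limitation mentioned above.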
                
> MapReduce should not have a compile-time dependency on HDFS
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4148
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4148
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: MAPREDUCE-4148.patch, MAPREDUCE-4148.patch
>
>
> MapReduce depends on HDFS's DelegationTokenIdentifier (for printing token debug information).
> We should remove this dependency and MapReduce's compile-time dependency on HDFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
